From Peter at ursus.demon.co.uk Sat Mar 1 00:14:42 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: Suggested List Protocol
Message-ID: <4086@ursus.demon.co.uk>

XML-DEV has been going for less than a week and it already has ~100 subscribers and about the same number of postings. Some of the subscribers are very well known in/to the SGML community, but I expect there are some who are new to this whole venture, and here are a few thoughts that may be helpful:

The list is unmoderated and has no fixed agenda, so you shouldn't be afraid of bringing up your own ideas or questions, so long as they are in some way related to how XML will be implemented.

The list has no formal standing and no way of 'reaching decisions' (though it's possible that mechanisms might emerge). Any voluntary offers for summarising threads will, I'm sure, be most valuable (e.g. 'We seem to agree X, but we differ on Y - so there seems to be a role for software that is limited to X? Is this realistic?').

The discussions run alongside the WG discussions and there is a considerable overlap in membership. XML-DEV is _not_ an informal arena for discussing matters still on the WG agenda. If an issue is aired here ("do I _really_ have to do X and Y to achieve Z?") that might have a bearing on the draft(s) it won't go unnoticed :-).

Discussions are archived so that you can download and read them every 2 weeks.

SGML is mainly thought of as a document processing and (paper) rendering tool, but XML has the potential for many completely new applications (my own is molecular science). Therefore if you think that XML might help in flying planes, making money, sending digital odors or holography over the Internet (suggested by MIME), feel free to raise the topic.

Be considerate about the volume of a posting - some of us pay for incoming mail :-). Quote those parts of previous replies that relate to your message.
If you have large chunks of code, post them on http: or ftp: resources. Please also post everything in human-readable ASCII, as a lot of people may not be able to manage compressed or other transformed files. Hopefully we shall move to other character sets (e.g. Unicode) in the future.

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Sat Mar 1 00:14:52 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: TEI pointers
Message-ID: <4087@ursus.demon.co.uk>

I need a search tool for structured documents and would be grateful for pointers to existing tools which are free and re-usable. My target language is Java. I would intend to use the TEI syntax (does it come in different flavours?). I would also intend to use a graphically-based query if possible, as well as a command line. Has this been tried, and are there any metaphors which have proved to be useful?

How do most humans currently construct TEI queries? Do they learn the language and use a command line, or do they get customised queries?

This is sufficiently important for me that I shall need to do it myself if there is no alternative, but it seems like something that can be developed as a problem-independent module, so long as the API from the parser or other tools (e.g. GUIs) is clear.

The search needs to have the flexibility to include FOREIGN, i.e. the ability to include non-XML-based methods. (In my own case it would be molecular substructure searches, which are essentially labelled subgraph matching algorithms). It should also include the SPACE facility, because this is going to be extremely important in technical documents.
[The WG has suggested that the TEI syntax may be an important part of XML PhaseII, but I am not sure of the timescale for resolution. My request would be currently useful for documents prepared for the PhaseI draft and doesn't prejudge the WG deliberations.]

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

From Peter at ursus.demon.co.uk Sat Mar 1 01:08:02 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <4098@ursus.demon.co.uk>

The discussion on the API is extremely valuable and exciting and I'm learning a lot. There is no doubt that there are enough experts to do a first class job of building an API that will last. However, for some people who may have joined this list and who really need or want XML, it may not be clear how some of this relates to more practical problems. (It really does!). A few weeks ago I got assurance from the WG that XML was not only for rocket_scientists, so if you aren't one, here is a place to talk about the simple aspects.

Remember that XML is 'an extremely simple' dialect of SGML _and can be used as such_. I started working with an XML-like dialect about 12 months ago, and wrote my own parser and postprocessor with steam technology, so it's not _essential_ to have groves, IDL, etc., though it will certainly make it much easier to develop complex applications. You may also want to build a prototype to learn what it's about and then bolt in the more powerful parsing and processing tools later.

The first thing to realise is that XML allows you to create documents that are well-formed, but need not be validated.
That may be fine for many people - especially during a development stage. If you don't use EMPTY elements (e.g.
in HTML) so that all your start- and end-tags are balanced and nested correctly, and if your attributes are quoted, then that is all you need for a WF document. Example:

    <FOO>This is a string</FOO>

So, are there simple tools for creating well-formed documents? Can HTML editors be extended? (Since I create a lot of my XML documents by hand, I'd be interested to have shortcuts).

------------------------------------------------------------------------

Most documents will then need some sort of processing. There are two main strategies:
  - event stream mode.
  - parse tree

The event stream mode is best illustrated by HTML and the font or phrase tags. <I> switches on italics and </I> switches it off. <B> is bold_on and </B> is bold_off. If your XML document was arranged as above it would be quite easy to write code which read each line, and took appropriate action (Foo_on, Foo_off).

I've been writing something this morning to do exactly that for HTML. I use Java, but there's nothing fundamental about what language you use (a year ago I used tcl/tk with CoST). So, for example, I take a _stream_ of HTML, write it to the screen, and every time I encounter a flag (tag) I take appropriate action. If the document is well-formed, the tags should nest, so the interpreting/parsing process must throw an error if an end-tag is encountered unexpectedly.

The tree model is best illustrated by the containers in HTML:

    <HTML>
      <HEAD>
        <TITLE>This is a title</TITLE>
      </HEAD>
      <BODY>
        <H1>That's all folks</H1>
      </BODY>
    </HTML>

If you look at what elements contain what others, you'll see that HTML can be thought of as a root, with two branches to its children (HEAD and BODY). HEAD has one child (TITLE) and BODY has one child (H1). Both TITLE and H1 contain strings (#PCDATA) which can be regarded as children.

Looking at structured documents as trees is extremely powerful for searching and other manipulations. IMO HTML requires both approaches and in processing it you have to switch between them.

--------------------------------------------------------------------------

In building a generic parser (such as Lark and NXP) the authors have to cover the whole range of possibilities, both in the input document and in the ways that it might be processed. There is, however, no need for any particular application to use the full power of XML, and this might allow you to develop a simpler parser and/or editor if you want, especially if you need to write it for a specific platform, etc. Also, if you just 'want to get started' there are enough tools to get a feel for what XML is about.

P.

XML is committed to making things simple!

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

From Ingo.Macherius at tu-clausthal.de Sat Mar 1 02:16:59 1997
From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <4098@ursus.demon.co.uk> from "Peter Murray-Rust" at Mar 1, 97 00:28:17 am
Message-ID: <199703010216.DAA00533@florix.rz.tu-clausthal.de>

> Most documents will then need some sort of processing. There are two
> main strategies:
> - event stream mode.
> - parse tree

I have made up a perl5 module which models a very simple forest-like structure that holds Perl5 objects. The objects are created by reading nsgmls' ESIS and putting anything between certain named tags into a hash, which basically is the object content. The objects can be inserted as a root or into another object, which yields a forest-like structure.

The tree-relations between objects are stored outside in a libdbm database, one per tree. It holds three tables,
  - id -> hashed data
  - id -> id of father object, or NULL
  - id -> ids of all sons
Obviously any object must have a method giving a unique id within the forest.

I think this may be called a poor-mans-grove :) I made up a simple API:

INSERT INTO DB ( when opened MODE 'write' )
  $db->insert_as_root ( $root );
  $db->insert ( $child, 'root.id' );
  $db->update ( $the_resource );

QUERY THE DB
 BASIC FUNCTIONS
  $resource = $db->fetch ( 'root.id' );
  $father   = $db->father ( 'child.id' );
  @sons     = $db->sons ( 'root.id' );
  @roots    = $db->roots;
 DERIVED FUNCTIONS (recursing all nodes below given @ids)
  @sons     = $db->all_container_sons ( @ids );
  @sons     = $db->all_leaf_sons ( @ids );
  @sons     = $db->all_sons ( @ids );
  @fathers  = $db->all_fathers ( @ids );

DESTROY DB CONTENT ( when opened MODE 'write' )
  $db->reset;

I found this sufficient to solve small problems for which ESIS is not enough and a grove is overkill. I must admit that, although I read most of ISO 10179, I really didn't get the details. But what I found valuable is the choice between navigating (father/son) and id-based lookups (fetch).
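
The three-table layout above can be sketched in a few lines. This is only an illustration in Python, not the Perl module itself: plain dictionaries stand in for the libdbm database, ids are passed in explicitly instead of being asked of the object, and the reading of "container" (has sons) versus "leaf" (has none) is an assumption about what the derived functions return.

```python
class Forest:
    """In-memory stand-in for the three tables (assumed layout, not the Perl module)."""

    def __init__(self):
        self.data = {}      # id -> object content (a plain dict here)
        self.parent = {}    # id -> id of father object, or None for a root
        self.children = {}  # id -> ids of all sons, in insertion order

    def _add(self, node_id, content, parent_id):
        if node_id in self.data:
            raise ValueError(f"duplicate id {node_id!r}")
        self.data[node_id] = content
        self.parent[node_id] = parent_id
        self.children[node_id] = []

    def insert_as_root(self, node_id, content):
        self._add(node_id, content, None)

    def insert(self, node_id, content, parent_id):
        if parent_id not in self.data:
            raise KeyError(f"unknown father {parent_id!r}")
        self._add(node_id, content, parent_id)
        self.children[parent_id].append(node_id)

    def fetch(self, node_id):          # id-based lookup
        return self.data[node_id]

    def father(self, node_id):         # navigation upwards
        return self.parent[node_id]

    def sons(self, node_id):           # navigation downwards
        return list(self.children[node_id])

    def roots(self):
        return [i for i, p in self.parent.items() if p is None]

    def all_sons(self, *ids):
        """Every descendant of the given ids, depth-first."""
        found = []
        for node_id in ids:
            for son in self.children[node_id]:
                found.append(son)
                found.extend(self.all_sons(son))
        return found

    def all_leaf_sons(self, *ids):
        return [i for i in self.all_sons(*ids) if not self.children[i]]

    def all_container_sons(self, *ids):
        return [i for i in self.all_sons(*ids) if self.children[i]]

    def all_fathers(self, *ids):
        """Every ancestor of the given ids, nearest first."""
        found = []
        for node_id in ids:
            p = self.parent[node_id]
            while p is not None:
                found.append(p)
                p = self.parent[p]
        return found


forest = Forest()
forest.insert_as_root("doc", {"gi": "HTML"})
forest.insert("head", {"gi": "HEAD"}, "doc")
forest.insert("body", {"gi": "BODY"}, "doc")
forest.insert("title", {"gi": "TITLE", "text": "This is a title"}, "head")
```

Both access styles are available on the same data: forest.fetch("title") is the id-based lookup, while forest.father("title") and forest.all_sons("doc") navigate the structure.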
++im
--
Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
Mail  : Ingo.Macherius@tu-clausthal.de  WWW: http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)

From cbullard at hiwaay.net Sat Mar 1 05:05:09 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
References: <4098@ursus.demon.co.uk>
Message-ID: <3317B8FF.554E@hiwaay.net>

Peter Murray-Rust wrote:
>
> So, are there simple tools for creating well-formed documents? Can HTML editors be extended? (Since I create a lot of my XML documents by hand, I'd be interested to have shortcuts).

Where it is well-formed, XML is very amenable to macros, which ANY word processor system has these days. Just having end tags makes it easy to write editing tools in, for example, Word using the dialog editor and hidden text. Kludgy, perhaps, but not out of reach, and the formatting is free.

> Most documents will then need some sort of processing.

Sure. Does it have to be event streams? While event streams are more powerful, even cheap macros can do a lot. The idea here is, while XML is good for the Internet, simplified SGML is good for just about anything where content markup is preferred over encapsulated objects or compiled structures. Just removing minimization, as you point out, allows for some clever work to be done with very cheap tools. Cheap tools are where the gains begin.

> I've been writing something this morning to do exactly that for HTML. I use Java, but there's nothing fundamental about what language you use (a year ago I used tcl/tk with CoST).
> So, for example, I take a _stream_ of HTML, write it to the screen, and every time I encounter a flag (tag) I take appropriate action. If the document is well-formed, the tags should nest so that the interpreting/parsing process must throw an error if an end-tag is encountered unexpectedly.
>
> XML is committed to making things simple!

XML has made SGML simple. It can even be simpler than that.

len

From richard at light.demon.co.uk Sat Mar 1 09:59:59 1997
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:57:26 2004
Subject: XML API specification
In-Reply-To: <331737EA.3A4E@hiwaay.net>
Message-ID:

As a postscript to my comments on the API design, I think it is important that the "event vs. tree" discussion shouldn't muddy the waters when looking at the API design.

If the XML processor is seen as a 'server', and the application as a 'client', with the API in between, it is clear to me that:

- the server's job is to return information about XML documents requested by the client (and this is its _only_ job!);

- to do this job, the server _must_ parse the document (fully or partially) in the time-honoured sequential manner;

- the client isn't so constrained, and can ask for as little or as much as it likes.

For example, in an online browsing application, it is a likely requirement that the client, in resolving a TEI extended pointer or HyTime-like XML link, will request a specific element out of an XML document. Having retrieved that element, the browser may have no further use for the XML document from which it came. So the server needs to parse through the document until it hits the required element, then it can stop. Parsing through the rest of the document would just be a waste of time.
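
The "parse until the required element, then stop" behaviour can be sketched as follows. This is only an illustration of the idea using Python's standard-library ElementTree, not a proposed XML API; the function name fetch_element and the use of an id attribute to identify the wanted element are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET


def fetch_element(source, wanted_id):
    """Parse sequentially and return the first element whose id attribute
    matches, without ever asking the parser to finish the document."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.get("id") == wanted_id:
            return elem   # no further input is requested after this point
    return None           # reached the end of the document without a match


# The document below is deliberately truncated: the parse never gets far
# enough to notice, because the wanted element closes before the damage.
truncated = io.BytesIO(
    b'<doc><sec id="a">first</sec><sec id="b">second</sec>'
    b'<sec id="c">the rest of the file is missing'
)
found = fetch_element(truncated, "b")
```

Note that iterparse reads its input in blocks, so "stopping" here means that no further blocks are read and the end of the document is never checked; a server that wanted to resume later would keep the iterator instead of discarding it.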
(Conversely, it makes sense for the server to hold the results of the parse until it knows the client has no further use for that document. It should also be able to pick up the parse where it left off if necessary.)

I don't think that it should really be the client's job to tell the server _how_ to do its parsing, except at the formal level (i.e. 'well-formed' or 'valid').

Richard Light
SGML and Museum Information Consultancy
richard@light.demon.co.uk
3 Midfields Walk
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067

From fcha at Berger-Levrault.fr Sat Mar 1 11:10:21 1997
From: fcha at Berger-Levrault.fr (F. Chahuneau - General Manager)
Date: Mon Jun 7 16:57:26 2004
Subject: Fw: Trees versus event streams
Message-ID: <199703011109.MAA15492@cygne.ais.berger-levrault.fr>

[Peter Murray-Rust (Peter@ursus.demon.co.uk), Thu, 27 Feb 1997]

> My current problem may highlight this. A CML document is highly tree-structured and contains no mixed content, so that eventStreams don't contribute much. BUT it also includes chunks of HTML where a tree structure is quite inappropriate. If I take a Lark-based approach (or my own parser) the HTML gets rendered into a tree. I am now hacking this back into an event stream to render the hypertext. Not only does it take more effort, but I'm sure that holding HTML as a tree has a memory hit.
> Ideally when I'm parsing CML, and come to the tag (sic) which contains , I'd like to tell the parser 'stop parsing as a tree and just hold a hypertext string until

Peter,

This kind of consideration is precisely what led us to define a *dual* programming paradigm when designing the Balise SGML processing language (http://www.balise.com). Being able to switch back and forth between these two useful and complementary abstractions for an SGML document (a "tree of typed nodes with attributes" vs an "ESIS or ESIS+ event stream") is, from our experience, often required when you have to express complex processing tasks on SGML documents, but still want to keep your code as concise as possible.

No paradigm is inherently better than the other: it all depends on what you want to express. If you want your code to remain legible and maintainable (i.e. related in a straightforward way to the processing idea it expresses), then you really need both in some cases. If you are interested, this idea is further developed in the following paper: "Event Driven or Tree Manipulation Approaches to SGML Transformation", presented at SGML'96 and available at "http://www.balise.com/current/articles/lecluse.htm"

> We *could* do this with a PI, but would have to all agree.

Doing this with a PI does not seem to be the best idea, since it does not leave the application programmer a choice of which mode she wants to use for what, while the best choice may entirely depend on what she wants to do. Being able to switch between tree and event-stream mode on any GI event is what is required. For maximum generality, you also need to be able to generate an event stream during (sub-)tree traversal, maybe not of the *original* tree, but of one which you have modified or created through your application.
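
The two requirements just stated (switching mode on a GI, and regenerating events from a subtree) can be sketched as follows. This is a rough illustration in Python's standard-library ElementTree, not Balise; the function names process and tree_to_events and the tuple shapes are assumptions made for the example. ElementTree still builds the whole tree underneath, so the sketch shows the programming interface only and makes no claim about memory use.

```python
import io
import xml.etree.ElementTree as ET


def process(source, tree_gis):
    """Report markup as a flat event stream, except that any element whose
    GI is in tree_gis is delivered once, as a complete subtree."""
    out = []
    held = None  # the element currently being collected in tree mode
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if held is not None:
            # Tree mode: stay silent until the held element closes.
            if event == "end" and elem is held:
                out.append(("tree", elem))
                held = None
        elif event == "start" and elem.tag in tree_gis:
            held = elem
        else:
            out.append((event, elem.tag))
    return out


def tree_to_events(elem):
    """Replay a (possibly modified) subtree as an event stream."""
    yield ("start", elem.tag, dict(elem.attrib))
    if elem.text:
        yield ("data", elem.text)
    for child in elem:
        yield from tree_to_events(child)
        if child.tail:
            yield ("data", child.tail)
    yield ("end", elem.tag)


source = io.BytesIO(
    b"<cml><mol><atom/><atom/></mol><note>see <b>this</b></note></cml>"
)
result = process(source, {"mol"})   # mol as a tree, everything else as events
mol = result[1][1]
replayed = list(tree_to_events(mol))
```

Here the choice of mode is made by the calling application (the tree_gis argument), not by anything in the document, which is the point made above against using a PI.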
In the world of traditional SGML applications, sheer document size is frequently an issue, so that tree mode must often be used with parsimony. In the case of XML or HTML fragments, this problem is probably negligible. The rationale for maintaining an "event stream" paradigm in an XML API is, therefore, not to save memory, but simply that it might be the most appropriate in some cases.

_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
_/ François CHAHUNEAU          phone: [+33] 1 40 64 43 00         _/
_/ Directeur Général/General Manager                              _/
_/ AIS S.A.                    FAX:   [+33] 1 40 64 43 10         _/
_/ 15-17 rue Rémy Dumoncel     email: fcha@ais.berger-levrault.fr _/
_/ 75014, Paris, FRANCE        WWW: http://www.berger-levrault.fr _/
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From Peter at ursus.demon.co.uk Sat Mar 1 11:16:17 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <4111@ursus.demon.co.uk>

[from PeterMR]
>
> Thanks Ingo,
> This is very useful, because it shows that a great deal can be done quite simply.
>
> In message <199703010216.DAA00533@florix.rz.tu-clausthal.de> Ingo Macherius writes:
> [...]
> > I have made up a perl5 module which models a very simple forest-like structure,
> > that holds Perl5 objects. The objects are created by reading nsgmls' ESIS
>
> I believe that ESIS has potentially a useful role in producing XML documents from SGML documents - this was certainly my own strategy until recently. ESIS is the normalised output from a parser (especially sgmls or NSGMLS from James Clark - these are freely available.)
> It's trivial to transform ESIS into XML, but not the other way round, since XML is richer.
>
> ESIS doesn't retain everything from the original document(s) and I've been asking the experts what gets lost. My rough summary is that XML->ESIS loses:
> - comments (this matters if you want to edit the document or have it read by humans. However comments should not be used by machines - simply passed through)
> - entities. If your document includes entities such as &chapter1; these may be expanded and replaced by their contents. In this way some of the structure may be less clear
> - conditional markup. If you use INCLUDE and/or IGNORE then the IGNORE'd sections won't come through and the INCLUDE'd ones won't be marked as such
> [I think that processing instructions come through OK? And that you can determine whether an attribute value was defaulted or not?]
>
> If you use this simple level of markup (and _I_ do for molecular science) then XML WF documents are equivalent to ESIS output from sgmls or nsgmls.
> [Query: Are there plans for nsgmls/sgmls to output XML as an alternative to ESIS? I expect it's straightforward].
>
> > and putting anything between certain named tags into a hash, which
> > basically is the object content. The objects can be inserted as a root or into
> > another object, which yields a forest-like structure.
> > The tree-relations between objects are stored outside in a libdbm database,
> > one per tree. It holds three tables,
> > - id -> hashed data
> > - id -> id of father object, or NULL
> > - id -> ids of all sons
> > Obviously any object must have a method giving a unique id within the forest.
> > I think this may be called a poor-mans-grove :) I made up a simple API:
>                                ^^^^^^^^^^^^^^^
> It's still very powerful, and you have recognised the importance of structured documents. The good news is that this will all be addressed (literally and metaphorically) in the discussion of addressing within XML documents.
> The TEI project has developed a pointer scheme which covers most aspects of structure and extends the metaphor to descendants, ancestors, siblings and navigation by attributes and their values. I am expecting one or more 'black boxes' to be developed which support this, so that you don't have to write perl scripts any more. I'm waiting to hear from another thread :-)
>
> [... code deleted ...]
>
> > I found this sufficient to solve small problems for which ESIS is not enough
>                                                                     ^^^^^^^^^^
> I think you were operating _on_ the ESIS stream. You mean that simple 'grep' or other tools weren't powerful enough?
>
> > and a grove is overkill. I must admit, albeit I read most of ISO 10179, I
>         ^^^^^^^^^^^^^^^^^
> This is one of the points at issue. Is it going to be possible to produce software quickly, and software that is easy enough to read and use? I'm waiting to find out :-)
>
> > really didn't get the details. But what I found valuable is the choice
>                     ^^^^^^^^^^^
> I think it's very important not to be frightened by 10179. What you have done is very similar to what I and many others have done - devising home-grown tools for searching structured documents. 10179 has an implementation in Scheme (am I right?) but not in more procedural or object-oriented languages.
>
> > between navigating (father/son) and id-based lookups (fetch).
>
> [...]
> P.
>

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

From fcha at Berger-Levrault.fr Sat Mar 1 18:20:36 1997
From: fcha at Berger-Levrault.fr (F. Chahuneau - General Manager)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703011819.TAA21136@cygne.ais.berger-levrault.fr>

[from PMR]
>
> ESIS doesn't retain everything from the original document(s) and I've been asking the experts what gets lost.

In case someone wants to get even more precise information, ESIS (Element Structure Information Set) is fully defined in annex G of document ISO/IEC/JTC1/SC18/WG8/N1035: Recommendations for a Possible Revision of ISO 8879 (SGML). You can find an exact replication of this passage in Charles Goldfarb's "SGML Handbook" (Clarendon Press, 1990), pp 588 to 591.

> My rough summary is that XML->ESIS loses:
> - comments (this matters if you want to edit the document or have it read by humans. However comments should not be used by machines - simply passed through)

True

> - entities. If your document includes entities such as &chapter1; these may be expanded and replaced by their contents. In this way some of the structure may be less clear

It's actually more complex than that.

SGML *text* entity references, whether entities are "internal" or "external", are indeed fully expanded and you are not even notified of this in the ESIS event stream. Therefore, ESIS does not convey the "entity structure" of an SGML document. This is, by the way, irrelevant to most applications ... except for those, such as some SGML editors, whose purpose is seen as being able to manipulate SGML documents without arbitrarily altering their entity structure (in addition to their element structure).

External data entity references, internal SDATA and PI entity references are signaled in the ESIS, while CDATA internal entity references are expanded without being reported. This may appear as a bizarre design choice, but there is something even more disturbing: in the case of internal SDATA entity references, only the entity "replacement value" is passed, not the entity "name".
This is one of the reasons why ESIS information, alone, does not allow one to implement an "identity transformation" for SGML documents, even when you don't care about the physical decomposition of the document into several files (SGML entities). Note that SDATA entities disappear in XML, so that THIS PROBLEM DISAPPEARS AS WELL!

> - conditional markup. If you use INCLUDE and/or IGNORE then the IGNORE'd sections won't come through and the INCLUDE'd ones won't be marked as such

True

> [I think that processing instructions come through OK?

True

> And that you can determine whether an attribute value was defaulted or not?]

Unfortunately not. This information is unavailable in ESIS, and you would need to access some "DTD information set" to be able to recover it. Besides attribute names and de facto values, the only side information you have in ESIS is when the value for an #IMPLIED attribute has not been specified.

There is one more piece of information missing in ESIS, which causes a problem when implementing an "identity transformation" for plain SGML documents: you don't know WHICH ELEMENTS HAVE BEEN DECLARED EMPTY in the DTD. You may know when an element has null content, but you don't know whether this is because it happens to be so (optional content) or because it can't have any (declared EMPTY). Therefore, you do not know whether you should output an end tag for it or not. Again, you would need some "DTD information" to disambiguate. Maybe not everyone has realized it yet, but this *is* the one and only reason why XML introduces this explicit syntax for empty elements. This, again, makes this problem disappear with XML.

All in all, you can see that some design decisions in XML were precisely motivated by the desire to make an ESIS event stream sufficient to implement an identity transformation, even with no access to DTD information.
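
To make the empty-element point concrete, here is a small sketch that turns an ESIS-style event stream into XML text with no DTD at hand. It is written in Python and is a sketch under stated assumptions: it follows the sgmls/nsgmls line format as commonly documented ("(" start, ")" end, "-" data, "A" attribute, "?" PI), handles only CDATA and TOKEN style attribute values and the two simplest data escapes, and the name esis_to_xml is made up for the example.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def _unescape(text):
    """Decode the two simplest ESIS escapes: \\n (record end) and \\\\."""
    return re.sub(r"\\(n|\\)",
                  lambda m: "\n" if m.group(1) == "n" else "\\", text)


def esis_to_xml(lines):
    out = []
    pending_attrs = []  # attribute lines precede the start-tag they belong to
    open_tag = None     # start-tag held back until we know if content follows

    def flush(empty=False):
        nonlocal open_tag
        if open_tag is not None:
            out.append(open_tag + ("/>" if empty else ">"))
            open_tag = None

    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            name, kind, *value = rest.split(" ", 2)
            if kind != "IMPLIED":       # unspecified #IMPLIED: emit nothing
                pending_attrs.append((name, _unescape(value[0] if value else "")))
        elif code == "(":
            flush()
            attrs = "".join(f" {n}={quoteattr(v)}" for n, v in pending_attrs)
            pending_attrs = []
            open_tag = f"<{rest}{attrs}"
        elif code == ")":
            if open_tag is not None:
                flush(empty=True)       # nothing came between "(" and ")"
            else:
                out.append(f"</{rest}>")
        elif code == "-":
            flush()
            out.append(escape(_unescape(rest)))
        elif code == "?":
            flush()
            out.append(f"<?{rest}?>")
        # Other commands (C, N, E, s, ...) are ignored in this sketch.
    return "".join(out)


esis = ["ATITLE CDATA Demo", "(DOC", "(P", "-Hello & goodbye", ")P",
        "(BR", ")BR", ")DOC", "C"]
xml_text = esis_to_xml(esis)
```

The element with null content is simply written with the explicit empty-element syntax; the writer never has to ask whether BR was declared EMPTY or merely happened to have no content.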
This is, of course, totally consistent with the idea that DTDs should not be systematically needed for processing XML fragments.

Whether you work with an event stream or an abstract tree(*) is orthogonal to this discussion: we are discussing the *available* information, not the way it is represented. This does not mean that I see abstract trees as useless, quite the contrary (see my previous mail).

I hope I helped clarify what ESIS was.

(*): I use the term "abstract tree" instead of "parse tree" to designate the "tree of typed nodes with attributes" (you could also say "SGML object tree", but this term seems to be somewhat overloaded these days...). From an SGML parser's point of view, an SGML "parse tree" would have distinct nodes for start tags and end tags, which are not what you are looking for when you want a useful representation allowing you to cut-and-paste SGML elements (seen as atomic, typed text objects with attached properties).

_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
_/ François CHAHUNEAU          phone: [+33] 1 40 64 43 00         _/
_/ Directeur Général/General Manager                              _/
_/ AIS S.A.                    FAX:   [+33] 1 40 64 43 10         _/
_/ 15-17 rue Rémy Dumoncel     email: fcha@ais.berger-levrault.fr _/
_/ 75014, Paris, FRANCE        WWW: http://www.berger-levrault.fr _/
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From peter at techno.com Sat Mar 1 18:52:55 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:57:26 2004
Subject: XML API specification
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703011844.NAA10478@exocomp.techno.com>

> References:
> Mime-Version: 1.0
> Content-Type: text/plain; charset="us-ascii"
> Date: Fri, 28 Feb 1997 13:50:22 -0500
> From: dgd@cs.bu.edu (David Durand)
> Sender: owner-xml-dev@ic.ac.uk
> Precedence: bulk
> Reply-To: dgd@cs.bu.edu (David Durand)
>
> At 11:53 AM -0600 2/28/97, Len Bullard wrote:
> >David Durand wrote:
> >>
> >> I see XML-groves and XML-API as parallel and needing to be in synch. I
> >> don't see either as having to depend on the other, though, and frankly,
> >> given the relative penetration of groves and Java into the "global
> >> developer consciousness", I don't see groves as that high a priority.
> >
> >If relative penetration is important, spec it in COBOL or C.
> >
> >This kind of argument went on in VRML and was wisely rejected.
> >The commitment to a CORBA IDL is a commitment to a syntax for the spec
> >and not a lot else.
>
> If Gavin's information is correct (and I assume it to be so) this is false. IDL means that we get language-specific bindings for several languages including Java and C++, simply by applying an automated tool. So there are concrete technical advantages to using IDL, though we must apply those tools for the programmers, so that I don't have to find an IDL tool to use XML with my Java codebase.

Grove schemas (property sets) can also be automatically translated/compiled to provide interface declarations in any language. We do this at TechnoTeacher to create documentation-compatible interfaces to groves stored in different ODBMSs, as well as to be able to provide access to those groves from different languages and environments. IDL, Java, and C++ can all be generated easily from the same property set.

It is not necessary that developers using these APIs (in IDL, Java, C++, etc.) know about groves or property sets. However, if there is one canonical form of the API (the property set), a developer that learned his way around the API in C++ will not be confused if he is subsequently required to use the API in Java, Scheme, SQL, etc.

-peter

--
Peter Newcomb                            TechnoTeacher, Inc.
233 Spruce Avenue                        P.O. Box 23795
Rochester, NY 14611-4041 USA             Rochester, New York 14692-3795 USA
+1 716 464 8696 (home)                   +1 716 464 8696 (direct)
+1 716 755 8698 (cell)                   +1 716 271 0796 (main)
+1 716 529 4304 (fax)                    +1 716 271 0129 (fax)
peter@petes-house.rochester.ny.us        peter@techno.com
http://www.petes-house.rochester.ny.us   http://www.techno.com

From Peter at ursus.demon.co.uk Sat Mar 1 18:54:28 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: MIME
Message-ID: <4120@ursus.demon.co.uk>

I believe that the use of MIME types for interacting with legacy data has great potential for XML. I'd welcome comments on the following ideas.

Many legacy documents have been registered as MIME types, and are also capable of being represented as structured documents. This means that an XML application is capable of reading MIME types on-the-fly and converting to XML internally.
(The only key requirement is that the document description is well-defined and stable so that it is possible to write a DTD (or meta-DTD) for it.) I have done this for my JUMBO parser. It is able to read in ~12 MIME types (belonging to chemistry) in native form. It then converts them into a Tree object internally and, as it parses the document serially, adds Nodes and Attributes where appropriate. This is isomorphic to the equivalent XML document and can be displayed in the GUI, edited, etc. and written out as XML. It is obviously capable of SD searches as well. The average user therefore sees JUMBO as a universal browser and possibly as a transformation tool (though _writing_ legacy formats from an arbitrary tree is usually difficult and information is lost). The architecture is (fairly) simple. Each MIME type requires a Java subclass of SGMLTree. As the (FORTRAN) document is read, it is poked into the nodes as appropriate. One enormous advantage of this is that the order of the data in the document doesn't cause any problems in writing the code (whereas for a conventional parser it can be a nightmare - 'have we already read this section?'). I am still amazed at how valuable this simple tree-building is. Of course, SD search techniques can then be used to add contextual information for processing or the tree can be reordered, pruned, etc. I think it would be enormously valuable to have MIME->XML converters for helping us at the editing stage. This may be easier than we think. Reading the Java Beans spec (a few months old, so it may have changed), there are statements like: '... the [current proposal] .. is that the MIME namespace for data types shall be used by _DataFlavors_' [an interface for transferable data]. 'we want [Java beans] to be able to pretend to be an Excel document inside a Word document'. This implies that interfaces (?IDLs) will be produced for common MIME types. It should therefore be possible to obtain Word, Excel, GIF, RTF, etc. beans.
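The tree-building idea described above (records are poked into named nodes as they are read, so their order in the legacy file does not matter) might be sketched as follows. This is only an illustration under stated assumptions: the class `LegacyTree`, the "SECTION value" line format and the element names MOL, TITLE, ATOMS and BONDS are all invented here, and nothing is taken from JUMBO or its SGMLTree class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical tree; a stand-in for a structure like the SGMLTree mentioned above. */
class LegacyTree {
    // One node per known section, held in the order the XML output should have.
    final Map<String, StringBuilder> sections = new LinkedHashMap<>();

    LegacyTree(String... sectionNames) {
        for (String s : sectionNames) sections.put(s, new StringBuilder());
    }

    /** Pokes one record into its node, whatever order records arrive in. */
    void poke(String section, String value) {
        StringBuilder node = sections.get(section);
        if (node == null) throw new IllegalArgumentException("unknown section: " + section);
        if (node.length() > 0) node.append(' ');
        node.append(value);
    }

    /** Reads "SECTION value" lines from an invented legacy format. */
    static LegacyTree read(String legacyText) {
        LegacyTree t = new LegacyTree("TITLE", "ATOMS", "BONDS");
        for (String line : legacyText.split("\n")) {
            line = line.trim();
            int sp = line.indexOf(' ');
            if (sp < 0) continue; // blank line or record with no value: nothing to store
            t.poke(line.substring(0, sp), line.substring(sp + 1).trim());
        }
        return t;
    }

    /** Writes the tree out as XML in the fixed section order. */
    String toXml() {
        StringBuilder out = new StringBuilder("<MOL>");
        for (Map.Entry<String, StringBuilder> e : sections.entrySet()) {
            out.append('<').append(e.getKey()).append('>')
               .append(escape(e.getValue().toString()))
               .append("</").append(e.getKey()).append('>');
        }
        return out.append("</MOL>").toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }
}

class LegacyTreeDemo {
    public static void main(String[] args) {
        // The sections arrive out of order; the output order is fixed by the tree.
        System.out.println(LegacyTree.read("BONDS 1-2\nTITLE water\nATOMS O H H").toXml());
    }
}
```

The reader never has to ask 'have we already read this section?', because each record simply lands in its own node.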
The XML implementation would then be: legacy--[bean]-->JavaInterface--[Java application]-->SDinMemory--[DTD]-->XML I haven't kept in close touch with Beans (although I have played with the beta-release and it's very powerful for what I want to do). If we could offer Java browsers for common MIME types, with automatic viewing, editing, merging and transformation into XML, it could be a very attractive way of bringing people into this arena. P. [The only downside is that the magic of XML is completely hidden from the user :-)] -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ From tbray at textuality.com Sun Mar 2 02:38:36 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:57:26 2004 Subject: TEI pointers Message-ID: <3.0.32.19970301162710.007284c0@pop.intergate.bc.ca> At 09:58 AM 28/02/97 GMT, Peter Murray-Rust wrote: >I need a search tool for structured documents and would be >grateful for pointers to existing tools which are free and re-usable. My >target language is Java. I would intend to use the TEI syntax (does it >come in different flavours?). The only free search tool generally available is WAIS which, while not bad, is kind of difficult to administer and does not mate well with SGML. But then, there are very few *commercial* search tools that mate well with SGML either. So to get what you want, you'll probably have to write it. Since I am a tired old full-text-search guy, Lark takes fanatical care to keep track of the byte offsets of everything; so there will be at least one parser that would be useful in such an effort. The fact that Lark neither looks at DTDs nor checks conformance is not an issue at indexing time.
>I would also intend to use a graphically-based query if possible as well >as a commandline. Has this been tried and are there any metaphors which >have proved to be useful? How do most humans currently construct TEI >queries? Do they learn the language and use a command line or do they >get customised queries? I've never seen a graphical search query GUI that was useful. - Tim From tbray at textuality.com Sun Mar 2 02:38:43 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:57:26 2004 Subject: Simple approaches to XML implementation Message-ID: <3.0.32.19970301162825.006900ac@pop.intergate.bc.ca> At 07:19 PM 01/03/97 +0100, F. Chahuneau - General Manager wrote: >All in all, you can see that some design decisions in XML were precisely >motivated by the desire to make an ESIS event stream sufficient to >implement an identity transformation, even with no access to DTD >information. This is, of course, totally consistent with the idea that DTDs >should not be systematically needed for processing XML fragments. Whereas I agree with the rest of Francois' contribution, this paragraph is not quite right. If you change "ESIS event stream" to "Instance character stream", then it would be more correct. But in fact the SGML->SGML declaration was not one of our goals; for example, the processor is not required to tell the app about [at least] comments and [...] says *nothing* about the ESIS, merely, in a very abstract way, what the processor has to give the application. If either of these problems (the impossibility of SGML->SGML or the absence of an ESIS equivalent) is a big huge flaw in XML, there's still time to fix it. The SGML->SGML problem is probably a job for the XML WG. The ESIS issue is perhaps a job for this list. I personally think an API is better than an ESIS [even if the ESIS were properly defined] anyhow.
Cheers, Tim Bray tbray@textuality.com http://www.textuality.com/ +1-604-708-9592 From Peter at ursus.demon.co.uk Sun Mar 2 10:28:52 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:26 2004 Subject: Simple approaches to XML implementation Message-ID: <4143@ursus.demon.co.uk> In message <3.0.32.19970301162825.006900ac@pop.intergate.bc.ca> Tim Bray writes: [...] > > Whereas I agree with the rest of Francois' contribution, this paragraph > is not quite right. If you change "ESIS event stream" to "Instance > character stream", then it would be more correct. But in fact the > SGML->SGML declaration was not one of our goals; for example, the I hope I haven't muddied the waters here - SGML->SGML was not my intention either. The (possibly fuzzy) idea was that I (and probably others) am familiar with ESIS because we use sgmls, and 'could this help us in our search for the ideas that go into the API'. IOW 'could we throw out things that didn't appear in ESIS?'. Don't worry if it doesn't go anywhere. > processor is not required to tell the app about [at least] comments > and [...] says *nothing* about the ESIS, merely, in a very abstract way, what the processor has to give the ^^^^^^^^^^^^^ > application. Fully agreed. I am probably showing my usual impatience in trying to resolve the 'abstract' to concrete. From a practical point of view if those people who have written parsers put their heads together and came up with a suggested API, I would look at it extremely seriously and positively. > > If either of these problems (the impossibility of SGML->SGML or the > absence of an ESIS equivalent) is a big huge flaw in XML, there's still > time to fix it. The SGML->SGML problem is probably a job for the > XML WG. The ESIS issue is perhaps a job for this list.
I personally > think an API is better than an ESIS [even if the ESIS were properly > defined] anyhow. Absolutely. From my point of view what Lark provides as an API does what I want at the moment. Maybe there are things that it doesn't do that it could or should, but *I don't know about them* :-). Being a concrete person I understand those 'things' that go into and come out of current programs rather than more abstract and perhaps more powerful synoptic views. I talked last week with an important person/organisation in chemical informatics who is very excited about XML. Their main worry was that it would become too complicated. I share this concern, though I think it's also important to make sure that we don't unnecessarily limit the power of the language. However there is no reason why we shouldn't initially limit the power of the API if it makes sense. For example [as Tim says] I can do without the comments and CDATA. We've had a week to explore the boundaries of the API. The spectrum covers the use of groves and 10179 (which a lot of us don't understand) to a smaller set of more 'concrete' things which we are more at home with. If we take the more abstract approach it's going to take time and faith to come up with an API. It's probably the 'right' way, and I hope that some members of this list are trying to systematise their ideas and a way forward. I also understand and support the IDL approach if it really will produce Java, C++, etc. APIs automatically. *The result of this automatic conversion must, however, be understandable by humans*. Being a hacker, I like to do things quickly and suggest that we try to gather together a 'concrete' API that could be used very shortly. I would be happy to take NXP and Lark as the starting points and say that they represent the current functionality that I require. Can they converge to a common name space? Example: some people on this list use 'Element' where others use 'Node' - it's agreement at this level that I need.
Similarly I need to know what classes a DTD might supply (is it ElementType or GI, or are they all bundled under Element?). And can we agree on what the totality of the information *defined in PhaseI* is? We don't lose anything by getting this off the ground quickly. It exercises the language, helps us locate resources and clarifies our thoughts. A first-generation set of tools will impress the world and might be extended into more powerful systems. It also helps to build up a core of documents that act as examples. P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ From nmikula at edu.uni-klu.ac.at Sun Mar 2 10:29:45 1997 From: nmikula at edu.uni-klu.ac.at (Norbert Mikula) Date: Mon Jun 7 16:57:27 2004 Subject: NXP is still alive (Sorry for being late) Message-ID: To all XML freaks, I was off the XML-WG list for a few days, our daemon was down, and I missed that this list was set up. Just by chance I found the message now on c.t.s. I wonder why I didn't see it before :-( However, NXP is not dead. I will now try to sync with the current status of discussion; the API thread is especially interesting to me. An advance announcement: the next official release of NXP includes catalog support (including DELEGATE and CATALOG). You can expect it in a few days..... Over the last few days I have been very busy redesigning my DSSSL engine YADE. ----- Best regards, Norbert H.
Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== From Ingo.Macherius at tu-clausthal.de Sun Mar 2 10:56:19 1997 From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius) Date: Mon Jun 7 16:57:27 2004 Subject: Simple approaches to XML implementation In-Reply-To: <4111@ursus.demon.co.uk> from "Peter Murray-Rust" at Mar 1, 97 11:13:37 am Message-ID: <199703021056.LAA02919@florix.rz.tu-clausthal.de> Peter, > > > I have made up a perl5 module which models a very simple forest-like structure, > > > that holds Perl5 objects. The objects are created by reading nsgmls' ESIS [...] > > > I think this may be called a poor-mans-grove :) I made up a simple API: > > ^^^^^^^^^^^^^^^ > > It's still very powerful, and you have recognised the importance of > > structured documents. The good news is that this will all be addressed > > (literally and metaphorically) in the discussion of addressing within > > XML documents. The TEI project has developed a pointer scheme which > > covers most aspects of structure and extends the metaphor to descendants, > > ancestors, siblings and navigation by attributes and their values. I > > am expecting one or more 'black boxes' to be developed which support this, > > so that you don't have to write perl scripts any more. I'm waiting to hear > > from another thread :-) I wrote the Perl interface because I needed access to SGML information which is fast enough for CGI. So I maintain the information base as an SGML doc and "render" it to my homegrown OODB described in the last mail.
The information is updated only once a week or so, so this is a sufficient method. Jade is too slow, considering the fork the http has to do to start DSSSL processing. I'd love to use SDQL ! > > > I found this sufficient to solve small problems for which ESIS is not enough > > ^^^^^^^^^^ > > I think you were operating _on_ the ESIS stream. You mean that simple > > 'grep' or other tools weren't powerful enough? I need a persistent representation of structured data. The information would fit into a RDBMS, so the job could easily be done with mSQL. But I wanted to find out if SGML would work, too. It does :) Yes, I operate on an ESIS stream while *rendering* the doc to my OODB, but afterwards I operate only on the DB for speed's sake. I'd prefer to do this on a persistent grove, but implementation would be far more complicated than my little perl hack :) > > > really didn`t get the details. But what I found valuable is the choice > > ^^^^^^^^^^^ > > I think it's very important not to be frightened by 10179. What you have > > done is very similar to what I and many others have done - devising > > home-grown tools for searching structured documents. 10179 has an > > implementation in Scheme (am I right?) but not in more procedural or > > object-oriented languages. Hm. I always thought DSSSL is a dialect of Scheme, so this is not a question of implementation. IMHO it *must* be implemented as Scheme. It's allowed to define other languages that map onto corresponding DSSSL/Scheme statements, which have to be submitted to a DSSSL engine. But an engine that calls itself a DSSSL engine must have a (restricted) Scheme engine inside. Correct me if I am wrong ! I'd be happy to hear about anyone writing a book on DSSSL. I read all the examples I could get from jjc and Jon Bosak, but a structured introduction would help to convince other people who do not have the time to read sources. > > > between navigating (father/son) and id-based lookups (fetch).
This is very important for me, because sometimes I *know* which element I need, because I get the ID from elsewhere. Any API to XML should offer both, a navigating query and a mere GOTO. ++im -- Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld Mail : Ingo.Macherius@tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/ Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa) xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Sun Mar 2 11:46:49 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:27 2004 Subject: Simple approaches to XML implementation Message-ID: <4149@ursus.demon.co.uk> In message <199703021056.LAA02919@florix.rz.tu-clausthal.de> Ingo Macherius writes: [...] > > I wrote the Perl interface because I needed an access to SGML information which > is fast enough for CGI. So I maintain the information base as SGML doc and > "render" it to my homegrown OODB described in the last mail. The information One of the selling points of XML should be that it maps directly onto OODBs. I don't know much about this myself, but I would assume that since the OODB supports persistence then this is a way of supporting persistence in XML applications. (Hope this isn't drivel). > is updated only once a week or so, so this is a sufficient method. > Jade is too slow, considering the fork the http has to do to start > DSSSL processing. I'd love to use SDQL ! See below. > > > > > I found this sufficient to solve small problems for which ESIS is not enough > > > ^^^^^^^^^^ > > > I think you were operating _on_ the ESIS stream. You mean that simple > > > 'grep' or other tools weren't powerful enough? > > I need a persistent representation of structured data. 
The information would > fit into a RDBMS, so the job could easily be done with mSQL. But I wanted to > find out if SGML would work, too. It does :) > Yes, I operate on an ESIS stream while *rendering* the doc to my OODB, but > afterwards I operate only on the DB for speed's sake. I'd prefer to do this > on a persistent grove, but implementation would be far more complicated than > my little perl hack :) Again - I suspect that getting the XML/OO interface correct means that we get gains from both sides. > [...] > > > > > between navigating (father/son) and id-based lookups (fetch). > This is very important for me, because sometimes I *know* which element I > need, because I get the ID from elsewhere. Any API to XML should offer both, > a navigating query and a mere GOTO. My simple understanding is that TEI pointers offer all of this. They have an (SGML) ID which is your GOTO and a sufficiently powerful navigation system for (my) needs. The description is in the PhaseII documentation but has not yet been discussed by the WG or ERB. I am hoping that they come up with something very similar to TEI, since I can understand it! Here's a simple example from the draft: ID (a23) DESCENDANT (2 TERM LANG DE) matches the second TERM element with a LANG (attribute) of DE occurring within the element with an ID (goto) of a23. TEI should be able to deal with everything that you have so far specified. I have asked very recently whether there is an implementation of this and maybe part of this topic will move to the TEI thread. I suspect that a key question will be performance and therefore it may be important to decide whether a document is indexed when parsed (suggestions?) or whether intermediate search results are cached. Without more experience I won't speculate. My own primitive system caches results (e.g. once a question is asked about IDs, it will remember the ID values in a hashtable for future queries).
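The two access styles discussed above, a direct GOTO by ID and navigation to the n-th matching descendant, together with the caching of ID values in a hashtable, might be sketched like this. It is a minimal illustration under stated assumptions, not an implementation of TEI pointers: the classes `Elem` and `PointerIndex` and their methods are hypothetical, and only the single pointer form quoted above is covered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical element node; not taken from Lark, NXP or JUMBO. */
class Elem {
    final String name;
    final Map<String, String> attrs = new HashMap<>();
    final List<Elem> children = new ArrayList<>();

    Elem(String name) { this.name = name; }

    Elem attr(String key, String value) { attrs.put(key, value); return this; }

    /** Adds a child and returns this (the parent), so calls can be chained. */
    Elem add(Elem child) { children.add(child); return this; }
}

/** Offers both access styles: a direct GOTO by ID and tree navigation. */
class PointerIndex {
    private final Elem root;
    private Map<String, Elem> idCache; // built on the first ID query, then reused

    PointerIndex(Elem root) { this.root = root; }

    /** GOTO: returns the element whose ID attribute equals id, or null. */
    Elem byId(String id) {
        if (idCache == null) {
            idCache = new HashMap<>();
            collectIds(root);
        }
        return idCache.get(id);
    }

    private void collectIds(Elem e) {
        String id = e.attrs.get("ID");
        if (id != null) idCache.put(id, e);
        for (Elem c : e.children) collectIds(c);
    }

    /**
     * Navigation: the n-th (1-based) descendant of start, in document order,
     * that is named `name` and whose attribute `attr` equals `value`; null if none.
     */
    static Elem descendant(Elem start, int n, String name, String attr, String value) {
        return walk(start, new int[] { n }, name, attr, value);
    }

    private static Elem walk(Elem e, int[] remaining, String name, String attr, String value) {
        for (Elem c : e.children) {
            if (c.name.equals(name) && value.equals(c.attrs.get(attr)) && --remaining[0] == 0) {
                return c;
            }
            Elem hit = walk(c, remaining, name, attr, value);
            if (hit != null) return hit;
        }
        return null;
    }
}

class PointerDemo {
    public static void main(String[] args) {
        Elem doc = new Elem("DOC")
            .add(new Elem("SECTION").attr("ID", "a23")
                .add(new Elem("TERM").attr("LANG", "DE").attr("N", "first"))
                .add(new Elem("TERM").attr("LANG", "EN").attr("N", "second"))
                .add(new Elem("P")
                    .add(new Elem("TERM").attr("LANG", "DE").attr("N", "third"))));
        PointerIndex index = new PointerIndex(doc);
        // Rough equivalent of: ID (a23) DESCENDANT (2 TERM LANG DE)
        Elem hit = PointerIndex.descendant(index.byId("a23"), 2, "TERM", "LANG", "DE");
        System.out.println(hit.attrs.get("N"));
    }
}
```

Here the ID table is built lazily on the first query; indexing at parse time would be the obvious alternative, and which is better presumably depends on how often IDs are queried.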
To the experts: I have read the core SDQL and it seems to have a different syntax from TEI. So are the two going to be used together? What does SDQL offer that TEI doesn't (to a simple web hacker)? P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ From Peter at ursus.demon.co.uk Sun Mar 2 11:46:56 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:27 2004 Subject: NXP Message-ID: <4150@ursus.demon.co.uk> [I have changed the subject to 'NXP' so that if this parser is specifically discussed in future there is a placeholder. More general replies should go to the thread on XML API or elsewhere.] Welcome Norbert, You have an honoured position on this list having written a very impressive piece of code to get us started. In message Norbert Mikula writes: [...] > > However, NXP is not dead. I will now try to sync with ^^^^^^^^^^^^^^^ :-) Runs fine on my machine! > current status of discussion, especially the API > thread is interesting to me. I have recently posted suggesting that we should try to consolidate on a simple API to get us started. My own development depends to a significant extent on what API I can use after parsing. I want it to be very clearly separated, because I see a parser as being a 'bolt-in' tool rather than a component which drives the rest of the application. (Maybe this isn't possible, but it's worth trying for). If you have missed any of the postings here they are all hypermailed (see the list .sig) P.
-- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Sun Mar 2 13:25:22 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:27 2004 Subject: Listrivia Message-ID: <4155@ursus.demon.co.uk> I shall be away next week as guest of the German Chemical Society in Wurzburg, where I shall be giving a lecture on "Structured Documents and Hyperlinking in Chemistry" The URL for the conference is at: http://schiele.organik.uni-erlangen.de/cic/IuK97/ and my abstract for the Wednesday session is visible under that. I shall be talking about CML and XML and I hope to give a demonstration. I shall not be able to mail to this list for a week [sighs of relief]. Henry Rzepa will continue to manage the technical aspects of the list. [BTW Henry does all the hard work with managing e-mail addresses which doesn't automatically come to my notice and I am very grateful to him. He may mail some recommendations to comp.text.sgml about e-mail addresses since these can cause confusion.] P. 
-- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ From tms at ansa.co.uk Sun Mar 2 15:41:29 1997 From: tms at ansa.co.uk (Toby Speight) Date: Mon Jun 7 16:57:27 2004 Subject: Simple approaches to XML implementation In-Reply-To: Peter@ursus.demon.co.uk's message of Sun, 02 Mar 1997 09:41:21 GMT References: <4143@ursus.demon.co.uk> Message-ID: Peter> Peter Murray-Rust Tim> Tim Bray I'm not signing this, since one of our mailhosts is badly corrupting text. >>>>> In article <3.0.32.19970301162825.006900ac@pop.intergate.bc.ca>, >>>>> Tim wrote: Tim> Whereas I agree with the rest of Francois' contribution, this Tim> paragraph is not quite right. If you change "ESIS event stream" Tim> to "Instance character stream", then it would be more correct. Tim> But in fact the SGML->SGML declaration was not one of our goals; >>>>> In article <4143@ursus.demon.co.uk>, Peter wrote: Peter> I hope I haven't muddied the waters here - SGML->SGML was not Peter> my intention either. The (possibly fuzzy) idea was that I (and Peter> probably others) are familiar with ESIS because they use sgmls Peter> and 'could this help us in our search for the ideas that go Peter> into the API'. I think it's good to have some of these conceptual anchors around - it helps us know when we're talking about the same things. Tim> ... for example, the processor is not required to tell the app Tim> about [at least] comments and [...] says *nothing* about the ESIS, merely, in a very abstract way, Tim> what the processor has to give the application. Tim> If either of these problems (the impossibility of SGML->SGML or Tim> the absence of an ESIS equivalent) is a big huge flaw in XML, Tim> there's still time to fix it.
The SGML->SGML problem is probably Tim> a job for the XML WG. The ESIS issue is perhaps a job for this Tim> list. I personally think an API is better than an ESIS [even if Tim> the ESIS were properly defined] anyhow. Peter> ... there is no reason why we shouldn't initially limit the Peter> power of the API if it makes sense. For example [as Tim says] Peter> I can do without the comments and CDATA. IMO, the application should be able to decide (preferably at compile time) whether it is interested in comments etc. We want to enable the creation of small, efficient applications as well as highly capable ones; I suggest an approach of providing lots at the parser, but providing filtering down to ESIS by default. My mental model has two kinds of application: those that take a well-formed document and present it to the user, and those that take a valid document and allow the user to edit and save it. The ability to perform the identity transform is obviously a requirement for the latter, whereas an ESIS stream may be sufficient for the former. What exactly constitutes an identity transform is not entirely clear cut, though. Is it okay to expand internal CDATA entities? Do we need to preserve record-end information? (We might want to do this if we will be running "diff" on the output - for version control systems, perhaps). I'd like to see a parser come with a base class for building an application's event-stream handler, that simply throws away most events - the application writer overrides the methods he is interested in. Some of the events, however, would have other actions. Two examples: 1. the default handler for #PCDATA would expand internal CDATA entity references and splice in marked sections, and pass the result to the handler for ESIS "data". 2. the default handler for #EMPTY elements would call the handler for start-tag, then the one for end-tag. I'm looking at the "Esis" interface in NXP, and I think it could be modified to act as such a base class.
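A base class of the kind proposed above might look roughly like the following sketch. The class and method names (`XmlEventHandler`, `emptyElement`, `cdataSection` and so on) are hypothetical and are not NXP's "Esis" interface; entity expansion is left out, and only the two default behaviours named in the examples are shown, in simplified form.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class: every event is ignored unless a subclass overrides it. */
class XmlEventHandler {
    // ESIS-level events: the ones most applications would override.
    public void startElement(String name) { }
    public void endElement(String name) { }
    public void data(String text) { }

    // Superset events: thrown away by default.
    public void comment(String text) { }

    /** Default: an empty element is reported as a start-tag followed by an end-tag. */
    public void emptyElement(String name) {
        startElement(name);
        endElement(name);
    }

    /** Default: a CDATA marked section is spliced into the ordinary data stream. */
    public void cdataSection(String text) {
        data(text);
    }
}

/** An application that only cares about ESIS-level events. */
class EsisPrinter extends XmlEventHandler {
    final List<String> lines = new ArrayList<>();

    @Override public void startElement(String name) { lines.add("(" + name); }
    @Override public void endElement(String name) { lines.add(")" + name); }
    @Override public void data(String text) { lines.add("-" + text); }
}

class HandlerDemo {
    public static void main(String[] args) {
        EsisPrinter h = new EsisPrinter();
        // Events a parser might fire for: <P><!-- note -->a<BR/><![CDATA[b]]></P>
        h.startElement("P");
        h.comment(" note ");   // dropped by the base class
        h.data("a");
        h.emptyElement("BR");  // arrives as start-tag + end-tag
        h.cdataSection("b");   // arrives as ordinary data
        h.endElement("P");
        for (String line : h.lines) System.out.println(line);
    }
}
```

An application that does want comments would override `comment` alone; adding further superset events to the base class later would not disturb `EsisPrinter`, since the new defaults would simply discard them.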
Comments from Norbert would be appreciated. The use of the base-class methods as a filter from the XML event stream to an ESIS stream means that an application could be written[*] that acts on ESIS events, but could selectively choose events to handle from a superset of ESIS - could we agree on a suitable superset? [*] or an existing application quickly ported - this makes a convincing argument to me :-) [In my approach, we could even change the superset without affecting those applications that run off the subset - and simply extending the superset shouldn't affect any existing application, because the base class would simply throw away the new events] Peter> We don't lose anything by getting this off the ground quickly. Peter> It exercises the language, helps us locate resources and Peter> clarifies our thoughts. A first-generation set of tools will Peter> impress the world and maybe might be extended into more Peter> powerful systems. It also helps to build up a core of Peter> documents that act as examples. I'm not commenting on this; I just quoted it because I agree :-) From fcha at Berger-Levrault.fr Sun Mar 2 15:46:52 1997 From: fcha at Berger-Levrault.fr (F. Chahuneau - AIS) Date: Mon Jun 7 16:57:27 2004 Subject: Simple approaches to XML implementation Message-ID: <199703021545.QAA12767@cygne.ais.berger-levrault.fr> [F. Chahuneau, 01 Mar 1997] > >All in all, you can see that some design decisions in XML were precisely > >motivated by the desire to make an ESIS event stream sufficient to > >implement an identity transformation, even with no access to DTD > >information. This is, of course, totally consistent with the idea that > > DTDs should not be systematically needed for processing XML fragments.
> [Tim Bray Sat, 01 Mar 1997] > Whereas I agree with the rest of Francois' contribution, this paragraph > is not quite right. If you change "ESIS event stream" to "Instance > character stream", then it would be more correct. But in fact the > SGML->SGML declaration was not one of our goals; for example, the > processor is not required to tell the app about [at least] comments > and [...] says *nothing* about the ESIS, merely, in a very abstract way, what the processor has to give the > application. > (Hello, Tim) I suspect we actually agree more than it seems, but I should not have used the term "identity transformation" without defining it first. My implicit definition was very minimal: being able to generate on the output side an instance which parses according to the same DTD as the instance on the input side. As you know, not being able to detect EMPTY elements defeats this purpose, whereas not knowing whether an attribute was defaulted or not, though it might be considered as an information loss, is not a problem according to this definition. Like many other SGML practitioners, I've never considered the fact that CDATA marked sections (or comments) would not be notified to be as important in practice as the previous problem. (From an abstract point of view, marked sections can be seen as a packing scheme allowing several "abstract trees" to be delivered interleaved in a single file... and therefore are not representable on a single abstract tree.) (Maybe this means I have been thinking in an XML-oriented framework even before it was formalized... which is probably why I supported the idea so readily!) Anyway, I will not pursue (at least not here) this discussion, which probably deviates from the main purpose of this list.
I provided some precise information about ESIS because the subject was raised, but it is clear that its only utility, in this discussion, is to serve as an example of a normalised "event-stream based" interface between a parser and an application, which could inspire more carefully designed interfaces in the same style. A tool such as Balise, in its communication with SP, requires more than ESIS... The only important message, in all that I said in my previous e-mails, is my conviction that a useful API should provide both event-stream *and* tree-manipulation paradigms. It is true, to some extent, that you can build one from the other, and that this could be done inside the application. But implementing this duality *below* the API level offers big advantages, both for maximum expressive power/flexibility ... and to save everybody from reinventing the wheel. [Peter Murray-Rust Sun, 02 Mar 1997 11:32:40] > My own development depends to a significant extent on what API I can > use after parsing. I want it to be very clearly separated, because I > see a parser as being a 'bolt-in' tool rather than a component which > drives the rest of the application. (Maybe this isn't possible, but it's > worth trying for). This is indeed possible and, in my opinion, it's even required. This is how Balise is implemented, with respect to both SP (the SGML parser module) and its new XML "well-formed document scanner" module. The parsing module should be able to operate in "slave mode", and this should be reflected at the API level (i.e. you need a primitive to trigger the parsing of an SGML document or an XML fragment). This also means you need the parser to be reentrant. That was not the case with sgmls, but it was fixed with SP, and should not be too hard a requirement for the forthcoming generation of XML parsers/scanners.
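The remark above that each paradigm can be built from the other might be illustrated by the following sketch: a builder that turns an event stream into a tree, and a tree that can replay itself as events. All names here (`DocEvents`, `TreeNode`, `TreeBuilder`) are hypothetical and do not describe the API of SP, Balise, Lark or NXP. Each builder keeps its state in instance fields only, which is one way of keeping such a component reentrant.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical event-stream interface between a parser and an application. */
interface DocEvents {
    void startElement(String name);
    void endElement(String name);
    void data(String text);
}

/** Tree node: an element when name != null, otherwise a text node. */
class TreeNode {
    final String name;
    final String text;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, String text) { this.name = name; this.text = text; }

    /** Tree to events: replays this subtree to a listener in document order. */
    void replay(DocEvents out) {
        if (name == null) { out.data(text); return; }
        out.startElement(name);
        for (TreeNode c : children) c.replay(out);
        out.endElement(name);
    }
}

/** Events to tree: keeps a stack of the elements that are currently open. */
class TreeBuilder implements DocEvents {
    private final Deque<TreeNode> open = new ArrayDeque<>();
    private TreeNode root;

    public void startElement(String name) {
        TreeNode n = new TreeNode(name, null);
        if (open.isEmpty()) root = n; else open.peek().children.add(n);
        open.push(n);
    }

    public void endElement(String name) { open.pop(); }

    public void data(String text) {
        // Text outside the root element is ignored in this sketch.
        if (!open.isEmpty()) open.peek().children.add(new TreeNode(null, text));
    }

    TreeNode root() { return root; }
}

class DualityDemo {
    public static void main(String[] args) {
        TreeBuilder first = new TreeBuilder();
        // Events for: <A>x<B></B></A>
        first.startElement("A");
        first.data("x");
        first.startElement("B");
        first.endElement("B");
        first.endElement("A");

        // Replaying the tree as events into a second builder gives an equal tree.
        TreeBuilder second = new TreeBuilder();
        first.root().replay(second);
        System.out.println(second.root().name + " has "
            + second.root().children.size() + " children");
    }
}
```

A parser running in "slave mode" would be handed a `DocEvents` listener by the application and would call it while parsing; offering `TreeBuilder` below the API would save each application from writing its own.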
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
_/ François CHAHUNEAU          phone: [+33] 1 40 64 43 00           _/
_/ Directeur Général/General Manager                                _/
_/ AIS S.A.                    FAX:   [+33] 1 40 64 43 10           _/
_/ 15-17 rue Rémy Dumoncel     email: fcha@ais.berger-levrault.fr   _/
_/ 75014, Paris, FRANCE        WWW:   http://www.berger-levrault.fr _/
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From Peter at ursus.demon.co.uk Sun Mar 2 16:13:19 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:27 2004
Subject: Simple approaches to XML implementation
Message-ID: <4161@ursus.demon.co.uk>

Thank you Francois,

In message <199703021545.QAA12767@cygne.ais.berger-levrault.fr> "F. Chahuneau - AIS" writes:
> [F. Chahuneau, 01 Mar 1997]
[...]
> [Peter Murray-Rust Sun, 02 Mar 1997 11:32:40]
>
> > My own development depends to a significant extent on what API I can
> > use after parsing. I want it to be very clearly separated, because I
> > see a parser as being a 'bolt-in' tool rather than a component which
> > drives the rest of the application. (Maybe this isn't possible, but it's
> > worth trying for).
>
> This is indeed possible and, to my opinion, it's even required. This is
> how Balise is implemented, both with respect to both SP (the SGML parser
> module) and to its new XML "well-formed document scanner" module. The
> parsing module should be able to operate in "slave mode", and this should
> be reflected at the API level (i.e. you need a primitive to trigger the
> parsing of an SGML document or an XML fragment). This also means you need
> the parser to be reentrant.
> That was not the case with sgmls, but it was
> fixed with SP, and should not be too hard a requirement for the forthcoming
> generation of XML parsers/scanners.

This is extremely valuable. We appreciate that many groups will be
developing commercial applications that cannot be described in detail,
but it's very useful to know of the general strategies that are being or
have been developed successfully.

Parsing is clearly a process where it should be clear to the user what
the tool does and we shall need to be able to agree on terminology. I
imagine that some of the information above would form part of a
technical manual or developer's kit and that it should mean the same
things to everyone. There are terms that are not part of the XML spec,
but are reasonably associated with or included in an API because of the
different ways of processing XML documents.

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

From Peter at ursus.demon.co.uk Sun Mar 2 16:47:16 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:27 2004
Subject: ERB work on 3.* (Linking Elements) issues
Message-ID: <4163@ursus.demon.co.uk>

Tim,

I'd like you and the ERB to know how much I appreciate the work the ERB
is doing, and also that I think it's a very effective process.
Personally I'm happy to work with whatever comes out - I trust the ERB
to come out with the most workable solution that the associated
brainpower and experience can muster. [I think it's a credit that on
xml-dev (which is discussing how to implement PhaseI) no-one so far has
suggested that the spec got things wrong, or is ambiguously worded, or
otherwise unimplementable.]
In message <3.0.32.19970301183622.00b3fb54@pop.intergate.bc.ca> Tim Bray writes:
> The ERB has now put two meetings' work in on this set of issues and is
> nowhere near done. Not surprising, given the importance of the issues.
> One of the factors holding us back a bit has been the fact that the
> discussion in the WG on the 3.* issues has been lacking in both volume
> and depth. Reasons for this might be (a) that the WG is tired (the
> ERB is), (b) that the WG is busy on other things, and (c) that the WG
> has substantially less experience in these issues than in those that
> came up in the XML language discussion.

I cannot answer for anyone else, but I am (c). [I think it's also going
to be a problem in PhaseIII.] I shall (I hope) have something to say
about addressing in PhaseII (I assume that's still to come).

From my own perspective as a web hacker, I can probably hack solutions
to most of the proposals so far, so what matters is whether:

(a) people outside the WG, outside SGML, will understand the result.
(b) any decision is more constraining than any other.

At present I am implementing the simplest approaches (HREF-like and
IMG-like) in JUMBO and can probably manage your next lot (with a
struggle, and not very efficiently, but that's not the point). As long
as the rules are clear, whether we have link information in attributes,
GIs, contents or the whole lot is probably manageable. It's more a
question of whether confusion will result.

[...]

As I mentioned on xml-dev I was talking to an important organisation in
our community who were very keen on XML, but 'hoped [the ERB/EG] didn't
make it too complicated'. In a sense, therefore, there are already two
levels of indirection - people like me have to understand it and then
carry the message to a wider community. If _they_ in turn have to
educate staff, the system needs to be fairly self-explanatory.
Where possible, therefore, I will cast a meta-vote in favour of the 'most obviously understandable solution (without prior SGML/HyTime knowledge)'. To this end, any short example documents illustrating your conclusions so far would be extremely valuable. Essentially: 'This is what we are suggesting: can you (a) understand what it is meant to do? (b) think it can be implemented? (c) do everything that you want to do? (even if some solutions creak a bit).' We could then try to feed back on these (more concrete) documents. P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Mon Mar 3 01:16:54 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:27 2004 Subject: Call for WWW6 demos Message-ID: <199703030115.RAA27815@boethius.eng.sun.com> As most of you know, the World Wide Web conference in Santa Clara (April 7-11) is a major event for XML and DSSSL. The XML-link draft spec will be announced, and the Web community will for the first time be made aware that alternative delivery strategies for structured documents are becoming a possibility. In addition to a report on the SGML Activity in the W3C track during the conference itself, there is going to be a full-day workshop on structured document delivery on the Monday beginning the conference week and a full-day session on XML and DSSSL during Developer's Day on the Friday of that week. Another basic component of this introduction to structured document alternatives will be a forum in which experimenters and companies taking an early lead in XML or DSSSL can demonstrate their products and projects. 
Early in conference planning I arranged with the WWW6 coordinators to hold open Thursday evening for a session that would showcase current efforts for the relatively small but important subset of conference attendees actively involved in Web development. Now it's time to see just who intends to be there and what they will need in the way of facilities so that an appropriate room assignment can be made. If you or your organization have an XML- or DSSSL-related product or technology to present at the conference -- an XML parser, an XML editor, a DSSSL browser, or what-have-you -- please send me a message with the following information: Name of organization (if any) Contact information for responsible person Description of product or technology for my information Description of product or technology for public announcement (if different from above -- please be clear about what can and can't be stated publicly) Facilities needed to demonstrate the technology (e.g., lcd projector, Internet connection) Whether you or a person from your organization will be demonstrating the technology or whether you want someone else to demonstrate it in your absence (I can't make any guarantees in the latter case, but I'll see what can be arranged) Try to have this information to me by Sunday evening, March 9, so that I can make room arrangements and public announcements. If you contact me after that time, I will make every effort to accommodate you, but it may not be possible to fit you into the schedule. Unless you tell me otherwise, I will assume that it is OK to include a description of your product or project in public announcements and on the conference Web site. If you are among the half-dozen people who indicated to me earlier that you intended to have something to demonstrate, please send me a message anyway to confirm your intention and provide the information I need in a uniform format. For maximum coverage I am posting this message to both the sgml-wg and xml-dev lists. 
Please use the xml-dev list for followups, if any. I will post a summary of what I've received to the xml-dev list at the beginning of next week. Jon ---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 2550 Garcia Ave., MPK17-101, | Best is he that inuents, Mountain View, California 94043 | the next he that followes Davenport Group::SGML Open::ANSI X3V1 | forth and eekes out a good ::ISO/IEC JTC1/SC18/WG8::W3C SGML ERB | inuention. ---------------------------------------------------------------------- xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From gtn at ebt.com Mon Mar 3 14:05:31 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:27 2004 Subject: XML API specification In-Reply-To: (dgd@cs.bu.edu) Message-ID: <199703031402.JAA28079@nathaniel.ebt> >>This kind of argument went on in VRML and was wisely rejected. >>The commitment to a CORBA IDL is a commitment to a syntax for the spec >>and not a lot else. > >If Gavin's information is correct (and I assume it to be so) this is false. >IDL means that we get language-specific bindings for several languages >including Java and C++, simply by applyiing an automated tool. So there are >concrete technical advantages to using IDL, though we must apply those >tools for the programmers, so that I don't have to find an IDL tool to use >XML with my Java codebase. JAVA, C, C++, ADA (and if you use ILU, a whole lot more) >> The commitment to JAVA for implementation >>is only a commitment to a slow language. > >Again, verifiably false. There is no reason that native-code Java compilers >cannot exist. Languages aren't slow -- implementations are. 
Something you >learn sometime in your first 2 years of college... There is already an i86 native code compiler, and I hear that the FSF is working on incorporating JAVA into GCC. xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From richard at light.demon.co.uk Mon Mar 3 14:12:11 1997 From: richard at light.demon.co.uk (Richard Light) Date: Mon Jun 7 16:57:27 2004 Subject: XML API spec In-Reply-To: <4109@ursus.demon.co.uk> Message-ID: In message <4109@ursus.demon.co.uk>, Peter Murray-Rust writes >In message <5lx6vCAzGtFzEwII@light.demon.co.uk> Richard Light writes: >[...] >> Operation >> --------- >> Presumably the XML processor is a 'slave' to the application, and >> only does what it's told to. > >I think that's right. OTOH it may be that it's possible to build a parser >that only does one thing and that the application decides what use to make >of the output. sgmls is rather like this - you either get the ESIS stream >or nothing (except error messages :-). I think that the "ESIS stream or nothing" case can (and should) still be seen in API terms. Essentially such a parser can have a very simple API with three commands: "open this XML document/fragment" "deliver me the whole tree structure in ESIS format" "close this XML document/fragment" Looking at it this way, I'm confident that the existing implementations can be developed to have an 'API', and we'll be on our way. The advantage of this approach is that it is easy to extend the command set to match the capabilities of the parser. For example, if the parser becomes capable of deciding whether or not to include marked sections or comments in its "ESIS" stream, then the "deliver me the whole tree structure in ESIS format" command can be refined to have arguments that determine which features the user wants delivered. 
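The three-command view above, with a refinable "deliver" command, might
be sketched in Java roughly as follows. This is an illustration under
stated assumptions, not a description of sgmls or of any existing
parser: ToyProcessor and Feature are hypothetical names, the '#' prefix
for comments is invented here (the sgmls output format defines no such
code), and the toy scanner understands only bare tags, character data
and comments.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Kinds of information the caller may ask for (hypothetical names). */
enum Feature { ELEMENTS, DATA, COMMENTS }

/** Toy processor exposing exactly three commands: open, deliver, close. */
final class ToyProcessor {
    private String source; // null while nothing is open

    /** "open this XML document/fragment" */
    void open(String xml) { source = xml; }

    /** "close this XML document/fragment" */
    void close() { source = null; }

    /** "deliver me the structure", restricted to the requested features. */
    List<String> deliver(Set<Feature> wanted) {
        if (source == null) throw new IllegalStateException("no document open");
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            if (source.startsWith("<!--", i)) {
                int end = source.indexOf("-->", i + 4);
                if (end < 0) throw new IllegalArgumentException("unterminated comment");
                // '#' is an invented marker for comments, not an ESIS code
                if (wanted.contains(Feature.COMMENTS)) out.add("#" + source.substring(i + 4, end));
                i = end + 3;
            } else if (source.charAt(i) == '<') {
                int end = source.indexOf('>', i);
                if (end < 0) throw new IllegalArgumentException("unterminated tag");
                String tag = source.substring(i + 1, end);
                if (wanted.contains(Feature.ELEMENTS)) {
                    out.add(tag.startsWith("/") ? ")" + tag.substring(1) : "(" + tag);
                }
                i = end + 1;
            } else {
                int end = source.indexOf('<', i);
                if (end < 0) end = source.length();
                if (wanted.contains(Feature.DATA)) out.add("-" + source.substring(i, end));
                i = end;
            }
        }
        return out;
    }
}

final class ThreeCommandDemo {
    public static void main(String[] args) {
        ToyProcessor processor = new ToyProcessor();
        processor.open("<p>hi<!--note--></p>");
        // Ask for elements and data only; the comment is filtered out.
        for (String line : processor.deliver(EnumSet.of(Feature.ELEMENTS, Feature.DATA))) {
            System.out.println(line);
        }
        processor.close();
    }
}
```

Extending the command set then means adding a Feature value rather than
changing the shape of the API, and a caller can ask again with a
different set without reopening the document.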
(And in fact, this is exactly what a 'grove plan' is (as I understand
it). "Give me elements, attributes, external entities, data content."
It's a pretty obvious concept: shame about the air of mystery around
it!)

The other big issue to resolve at this stage is in what format the
parser ("XML processor") should deliver information to the application.
And that leads us (me, anyway!) to consider the "division of labour"
issue.

ESIS gives us a rather strange precedent, which perhaps we shouldn't
take too much as gospel, even if we are all very used to it.

In the most general terms, the parser ("XML processor") has to deliver
information about the XML document to the application. In ESIS a
sequence of textually-represented tokens indicates parsing events from
which an application can deduce the tree structure that is the XML
document: element start, element end, data content, new line in source
file, e.g.:

(SOURCEDESC
AID IMPLIED
AN IMPLIED
ALANG IMPLIED
AREND IMPLIED
ATEIFORM CDATA p
(P
-Generated from ASCII file by an OmniMark script
)P
)SOURCEDESC
L8

This approach means that the application has to stay on its toes if it
wants to get the structure right. And, fundamentally, it means that the
_application_ has the job of building the tree, whether it wants/needs
to or not.

In the SGML world, this is perhaps a reasonable division of labour,
since the parser has already done a lot of work for the application by
inferring omitted end-tags, shortrefs, etc, etc. However, the whole
point of XML is to _remove_ all of this complexity. I would therefore
argue that in the XML world it is reasonable to ask the parser ("XML
processor") to do a bit more: to "build the tree" and then let the
application cherry-pick the bits it requires.

Having resolved that (which we haven't - comments please!), we still
have the delivery issue. I think a valuable aspect of the ESIS approach
is that the output is textual in nature.
In principle, we could have a sequence of (binary) "objects" with "properties", splurging out of the parser, but to do so would in my view limit the usefulness of the output to a specific application environment. Bad thing! So, what does our "textual" output look like? As I said above, ESIS is a rather strange precedent. It uses a set of conventions all of its own: - a newline for certain events (but not for all); - first character of the line is an ESIS-specific code for the event type ("(" = start-tag; "-" = data content, etc.); - character entities represented (uselessly) by their mapping; - etc. A much simpler approach, which I _think_ is what would happen in a DSSSL-style transformation, is for the parser simply to output tidied-up XML. In which case, you might ask, what the heck is the parser doing for us? To which I would reply "about the same as what ESIS is doing for you!" The value of the parser will be apparent once it is able to filter out and deliver: - exactly those properties of the XML that the appplication is interested in; - any required subtree from the full document >> View of the XML document >> ------------------------ >> What does the application 'see' of the XML document it has asked the >> XML processor to open? The spec implies that it should have pretty >> direct access, e.g.: >> >> "An XML processor must inform the application of the length of >> comments if they are not passed through, to enable the application >> to keep track of the correct location of objects in the XML >> document." >> >> This fills me with gloom. Shouldn't there be a level of abstraction > ^^^^^^^^^^ >It would fill me with gloom _if I had to write the parser_ :-). If someone >else has done this, and didn't mind doing it, and if the result made it >trivial to discard comments (or other information) then it's not a problem. Sorry, Peter, I didn't make my point clearly. The "gloom" related to the phrase: "... 
to enable the application to keep track of the correct location of
objects in the XML document".

In my view of things, the application should _never_ have, or need,
direct access to the actual XML document. It should get _everything_ it
needs through the API.

In the context of an editing application, where one might think the
application needed to "poke" new or changed data directly into the XML
document, I would argue that the parser should be performing a read-only
operation on the source XML. If an editor is letting the user make
changes, it is on an _in-memory copy_ of the source document (which
clearly, as several of us have noted, needs to be a complete copy).

When the user of the XML editing application decides to save their
edited result, they will be overwriting the source XML document on disc
with their in-memory copy, just as you do with any word processor. There
is no need for the parser ("XML processor") to be involved in this stage
of the process at all.

Richard Light.

From gtn at ebt.com Mon Mar 3 14:38:44 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <4143@ursus.demon.co.uk> (Peter@ursus.demon.co.uk)
Message-ID: <199703031436.JAA28091@nathaniel.ebt>

>I also understand and support the IDL approach if it really will produce
>Java, C++, etc APIs automatically. *The result of this automatic
>conversion must, however, be understandable by humans*.

The C++/JAVA bindings tend to be quite easy to read (basically just
proxying objects).
xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From cbullard at hiwaay.net Mon Mar 3 14:43:00 1997 From: cbullard at hiwaay.net (Len Bullard) Date: Mon Jun 7 16:57:28 2004 Subject: XML API specification References: <199703031402.JAA28079@nathaniel.ebt> Message-ID: <331AE0C4.1A1E@hiwaay.net> Gavin Nicol wrote: > > >>This kind of argument went on in VRML and was wisely rejected. > >>The commitment to a CORBA IDL is a commitment to a syntax for the spec > >>and not a lot else. > > > >If Gavin's information is correct (and I assume it to be so) this is false. > >IDL means that we get language-specific bindings for several languages > >including Java and C++, simply by applyiing an automated tool. So there are > >concrete technical advantages to using IDL, though we must apply those > >tools for the programmers, so that I don't have to find an IDL tool to use > >XML with my Java codebase. > > JAVA, C, C++, ADA (and if you use ILU, a whole lot more) Again, "What it means to the spec". Available tools are the next level. Groves to IDL to Whatever is still the food chain. Committing directly to Java is what is wrong in the previous posted suggestion. As David says, "we are in raging agreement". Unless we leave the API adaptible to other languages, we lose too many well-known and practiced optimization advantages. So, C, C++, yes even ADA, are still possibilities. > >> The commitment to JAVA for implementation > >>is only a commitment to a slow language. > > > >Again, verifiably false. There is no reason that native-code Java compilers > >cannot exist. Languages aren't slow -- implementations are. Something you > >learn sometime in your first 2 years of college... > > There is already an i86 native code compiler, and I hear that the > FSF is working on incorporating JAVA into GCC. 
Glad to hear it. Have you ever read the FAR and its regulations for using commercial software? These don't matter to academic development efforts, but to the commercial software business they are of some importance. So, forgive me if I keep pushing toward the commercial requirements. Java is fine. FSF is food for the hungry. There are alternatives that must be considered. IDL looks to be the best candidate for the implementors. I think a grove definition provides good spec language and makes it easier to align XML with the Technical Corrigendums from WG8. Let each party read the verbiage that works best for them. len xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From richard at light.demon.co.uk Mon Mar 3 16:07:36 1997 From: richard at light.demon.co.uk (Richard Light) Date: Mon Jun 7 16:57:28 2004 Subject: Simple approaches to XML implementation In-Reply-To: Message-ID: In message , Toby Speight writes >IMO, the application should be able to decide (preferably at compile >time) whether it is interested in comments etc. We want to enable the >creation of small, efficient applications as well as highly capable >ones; I suggest an approach of providing lots at the parser, but >providing filtering down to ESIS by default. Bear in mind that, no matter how much or how little the application is interested in, the parser has to chew its way sequentially through the whole darned XML document. In a way, the only efficiency issue is how much it "remembers" en route. And whether it can stop because it knows it has found the element, entity reference, or whatever, that the application told it to sniff out. Richard Light SGML and Museum Information Consultancy richard@light.demon.co.uk 3 Midfields Walk Burgess Hill West Sussex RH15 8JA U.K. tel. 
(44) 1444 232067

From gtn at ebt.com Mon Mar 3 17:35:26 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: (message from Richard Light on Mon, 3 Mar 1997 11:09:30 +0000)
Message-ID: <199703031732.MAA28163@nathaniel.ebt>

>Bear in mind that, no matter how much or how little the application is
>interested in, the parser has to chew its way sequentially through the
>whole darned XML document. In a way, the only efficiency issue is how
>much it "remembers" en route. And whether it can stop because it knows
>it has found the element, entity reference, or whatever, that the
>application told it to sniff out.

At least, unlike SGML, an XML parser can delay entity resolution.

From digitome at iol.ie Mon Mar 3 19:32:20 1997
From: digitome at iol.ie (Digitome Ltd.)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703031932.TAA20716@mail.iol.ie>

[Tim Bray]
>Whereas I agree with the rest of Francois' contribution, this paragraph
>is not quite right. If you change "ESIS event stream" to "Instance
>character stream", then it would be more correct. But in fact the
>SGML->SGML declaration was not one of our goals; for example, the
>processor is not required to tell the app about [at least] comments
>and merely, in a very abstract way, what the processor has to give the
>application.
> >If either of these problems (the impossibility of SGML->SGML or the >absence of an ESIS equivalent) is a big huge flaw in XML, there's still >time to fix it. The SGML->SGML problem is probably a job for the >XML WG. The ESIS issue is perhaps a job for this list. I personally >think an API is better than an ESIS [even if the ESIS were properly >defined] anyhow. I am a big fan on SGML->SGML and would like to see the ESIS powerful enough to allow it at some level (i.e. even with certain restrictions is better than not at all). I think it is important that the API - whatever form it finally takes - is not merely an API for *rendering XML*. What about all the XML processing apps that will be slurping XML prior to any XML publishing. Another point in favour of an ESIS rather than a function/method API is that the ESIS approach is automatically bi-directional. I.e. can be used to create XML as well as process it. Sean Sean Mc Grath digitome@iol.ie Digitome Electronic Publishing Developers of IDM - Next Generation SGML Transformation Technology http://www.screen.ie/digitome xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From richard at light.demon.co.uk Tue Mar 4 10:33:08 1997 From: richard at light.demon.co.uk (Richard Light) Date: Mon Jun 7 16:57:28 2004 Subject: Simple approaches to XML implementation In-Reply-To: <199703031932.TAA20716@mail.iol.ie> Message-ID: In message <199703031932.TAA20716@mail.iol.ie>, "Digitome Ltd." writes >I am a big fan on SGML->SGML and would like to see the ESIS powerful >enough to allow it at some level (i.e. even with certain restrictions is >better than not at all). I think it is important that the API - whatever >form it finally takes - is not merely an API for *rendering XML*. 
What about >all the XML processing apps that will be slurping XML prior to any >XML publishing. > >Another point in favour of an ESIS rather than a function/method API is that >the ESIS approach is automatically bi-directional. I.e. can be used to create >XML as well as process it. I think we need both. Surely the API is the set of commands, switches, etc. which the application can use to control the behaviour of the XML processor and issue requests to it, while the "ESIS" is the well- understood format in which the XML processor serves up the requested results to the application? Is it fair to say that the XML API is functionally equivalent to the command line arguments in NSGMLS, while the "XML ESIS" is (more obviously) equivalent to the ESIS output by NSGMLS? That's how I tend to see it. The advantage of an API over an NSGMLS-style command line is that you can have any number of bites at the cherry, retrieving relevant bits of the XML document each time. For example, a browsing app might start by requesting the only element structure for the whole document (to fill an 'outline' window), then go back and ask for content for the first few elements until it had enough to fill a 'data window'. Richard Light SGML and Museum Information Consultancy richard@light.demon.co.uk 3 Midfields Walk Burgess Hill West Sussex RH15 8JA U.K. tel. (44) 1444 232067 xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From housel at ms7.hinet.net Tue Mar 4 16:04:52 1997 From: housel at ms7.hinet.net (Peter S. Housel) Date: Mon Jun 7 16:57:28 2004 Subject: Simple approaches to XML implementation Message-ID: <199703041556.XAA06348@ms7.hinet.net> Here's what I'm looking for in an XML/SGML API. 
First I want an entry point that allows the application to ask the
parser implementation which property set modules it supports (i.e.,
what's the richest grove plan available to users), and whether or not
validation is available.

Next, there should be an XMLEventStream object. To create an
XMLEventStream, you specify:

1. The URL of the document to open.
2. The grove plan you want, in the form of a list of property set
   modules.
3. Whether or not you want validation done.

This gives a stream-based interface to the XML document. By default,
the grove plan would be {baseabs, prlgabs0, instabs}, which gives you a
stream of XMLEvent objects that corresponds almost exactly to ESIS. If
you ask for more modules than that, XMLEventStream will give you a
richer set of objects, a stream including such things as the contents
of the DTD, comments, or whatever you like.

As a layer above that, there should be a grove-based interface that
takes the stream and turns it into a grove. Once built, the grove can be
examined using an interface similar to SDQL. As someone has already
noted, there should be concrete subclasses of the Node class, but the
property-getting interface should work whether the property is stored in
a list of properties or in a special instance slot.

I'm very fond of the idea of mapping documents of various MIME types
onto XML documents. The translator could work as an XMLEventStream,
making the grove-building machinery common to all document types.

Still higher application-specific layers could be built easily.

Am I way off base here? I know this is the kind of interface that would
make me happy...

-Peter S.
Housel- housel@ms7.hinet.net

From dgd at cs.bu.edu Tue Mar 4 16:50:52 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To:
References: <199703031932.TAA20716@mail.iol.ie>
Message-ID:

At 9:55 AM +0000 3/4/97, Richard Light wrote:
>I think we need both. Surely the API is the set of commands, switches,
>etc. which the application can use to control the behaviour of the XML
>processor and issue requests to it, while the "ESIS" is the
>well-understood format in which the XML processor serves up the requested
>results to the application?

ESIS and the text format that sgmls (and now SP) serve up are different
things. The ESIS is an informal, non-normative definition of information
that an SGML application can see. The text format is one way to transmit
that information.

I am with the rest on requiring the potential for XML->XML
transformation. That is one reason I pressed to have no insignificant
whitespace: it's only the parts of the document that you can't see that
can bite you.

Personally, I think we need an API with more power than ESIS, and
secondarily should strongly consider a tree-style representation that
can be optionally produced.

>Is it fair to say that the XML API is functionally equivalent to the
>command line arguments in NSGMLS, while the "XML ESIS" is (more
>obviously) equivalent to the ESIS output by NSGMLS? That's how I tend
>to see it.

I think that the API includes 1 call for each kind of information that
can pass between the parser and the application, and _also_ an interface
for setting options.
>The advantage of an API over an NSGMLS-style command line is that you >can have any number of bites at the cherry, retrieving relevant bits of >the XML document each time. This is the advantage of a parse tree-style representation -- But is likely to be too slow for simple callbacks -- re-parsing documents moves to much data to be attractive unless you're way memory limited. It's actually a good argument for a way to request that a stored tree be traversed to produce callbacks just as if a parse were being created. > For example, a browsing app might start by >requesting the only element structure for the whole document (to fill an >'outline' window), then go back and ask for content for the first few >elements until it had enough to fill a 'data window'. -- David _________________________________________ David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com Boston University Computer Science \ Sr. Analyst http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams --------------------------------------------\ http://dynamicDiagrams.com/ MAPA: mapping for the WWW \__________________________ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From gtn at ebt.com Tue Mar 4 17:35:53 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:28 2004 Subject: Simple approaches to XML implementation In-Reply-To: (dgd@cs.bu.edu) Message-ID: <199703041733.MAA04237@nathaniel.ebt> >Personally, I think we need an API with more power than ESIS, and >secondarily should strongly consider a tree-style representation that can >be optionally produced. Agreed. >I think that the API includes 1 call for each kind of information that can >pass between the parser and the application, and _also_ an interface for >setting options. 
I would tend toward an event-driven interface, and an option-setting interface as the core parser API. For example:

    class XMLEventHandler {
       public boolean OnComment(String comment);
       public boolean OnElementStart(...)
       ....
    }

    class XMLParser {
       ...
       parser(XMLEventHandler handler);
       ...
    }

I have some code now that does this, and it works very well.

>It's actually a good argument for a way to request that a stored tree be
>traversed to produce callbacks just as if a parse were being created.

One kind of handler I have is one that builds a tree: it also happens to implement the XMLEventGenerator interface so that I can use it to feed an event handler.

From peter at techno.com Tue Mar 4 17:48:28 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703041556.XAA06348@ms7.hinet.net> (housel@ms7.hinet.net)
Message-ID: <199703041745.MAA11686@exocomp.techno.com>

> From: "Peter S. Housel"
> Date: Wed, 5 Mar 1997 00:05:30 +0800
>
> Here's what I'm looking for in an XML/SGML API.
>
> First I want an entry point that allows the application to query
> the parser implementation what property set modules it supports
> (i.e., what's the richest grove plan available to users), and whether
> or not validation is available.

This should be implemented both as a set of API methods (for tightly coupled applications) and as a machine-readable (SGML) system declaration document (for remote and/or code-incompatible applications), and (for SGML/HyTime systems) I think it should also include information about what storage managers, architecture engines, and other notation processors (for multimedia addressing) are available.

> Next, there should be an XMLEventStream object.
> To create an XMLEventStream, you specify:
>
> 1. The URL of the document to open.
> 2. The grove plan you want, in the form of a list of property set modules.
> 3. Whether or not you want validation done.
>
> This gives a stream-based interface to the XML document. By default, the
> grove plan would be {baseabs, prlgabs0, instabs}, which gives you a stream
> of XMLEvent objects that corresponds almost exactly to ESIS.
>
> If you ask for more modules than that, XMLEventStream will give you
> a richer set of objects, a stream including such things as the
> contents of the DTD, comments, or whatever you like.
>
> As a layer above that, there should be a grove-based interface that
> takes the stream and turns it into a grove. Once built, the grove
> can be examined using an interface similar to SDQL. As someone has
> already noted, there should be concrete subclasses of the Node
> class, but the property-getting interface should work whether the
> property is stored in a list of properties or in a special instance
> slot.

Agreed. However, I do not think that an API specification should dictate whether the grove is built from the event stream or the event stream from the grove; I would regard that as an implementation issue, since some applications may choose to store documents as character streams and others as groves (or collections of objects similar to groves). The important thing is that both APIs (event stream and grove) be provided.

> I'm very fond of the idea of mapping documents of various MIME types
> onto XML documents. The translator could work as an XMLEventStream,
> making the grove-building machinery common to all document types.
>
> Still higher application-specific layers could be built easily.

I'm not sure exactly what you mean by "mapping documents of various MIME types onto XML documents" though I would be interested to know.
I do believe, however, that an API designed along these lines would make a wide variety of applications both possible and relatively easy to implement.

> Am I way off base here? I know this is the kind of interface that would
> make me happy...

What you suggest is _exactly_ in line with what I want and have been developing.

-peter

--
Peter Newcomb                           TechnoTeacher, Inc.
233 Spruce Avenue                       P.O. Box 23795
Rochester, NY 14611-4041 USA            Rochester, New York 14692-3795 USA
+1 716 464 8696 (home)                  +1 716 464 8696 (direct)
+1 716 755 8698 (cell)                  +1 716 271 0796 (main)
+1 716 529 4304 (fax)                   +1 716 271 0129 (fax)
peter@petes-house.rochester.ny.us       peter@techno.com
http://www.petes-house.rochester.ny.us  http://www.techno.com

From ddb at criinc.com Tue Mar 4 18:32:10 1997
From: ddb at criinc.com (Derek Denny-Brown)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
Message-ID: <3.0.32.19970304102700.0091a190@mailhost.criinc.com>

At 12:05 AM 3/5/97 +0800, Peter S. Housel wrote:
>First I want an entry point that allows the application to query
>the parser implementation what property set modules it supports
>(i.e., what's the richest grove plan available to users), and whether
>or not validation is available.
>
>Next, there should be an XMLEventStream object. To create an
>XMLEventStream, you specify:
>
>1. The URL of the document to open.
>2. The grove plan you want, in the form of a list of property set modules.
>3. Whether or not you want validation done.

I would much rather abstract the document source, or talk about a 'Provider' or something like that.
It is useful to be able to just hand the parser a URL, but that is not the only way I want to pass it documents, and I may want it to use my URL handling code (for a shared cache) etc. It makes it more complicated but it increases the flexibility tremendously.

I agree that a grove plan is a good way to talk about event options, but it is not the best way to pass it around. I think there should be an object (or set of objects) which describes the 'grove plan' and instructs the parser what it should hand off to the event handler. The parser is then free to inform the event handler/application that it does not support certain operations.

At 12:33 PM 3/4/97 -0500, Gavin Nicol wrote:
>>I think that the API includes 1 call for each kind of information that can
>>pass between the parser and the application, and _also_ an interface for
>>setting options.
>
>I would tend toward an event-driven interface, and an
>option-setting interface as the core parser API. For example:
>
>    class XMLEventHandler {
>       public boolean OnComment(String comment);
>       public boolean OnElementStart(...)
>       ....
>    }
>
>    class XMLParser {
>       ...
>       parser(XMLEventHandler handler);
>       ...
>    }

This would be the best way (performance-wise) to do this. It works for SP and I found it very easy to deal with in that environment.

-derek
--------------------------------------------------------------
ddb@criinc.com || software-engineer || www/sgml/java/perl/etc.

From dgd at cs.bu.edu Tue Mar 4 18:50:51 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:28 2004
Subject: API thoughts...
Message-ID:

I was thinking that my earlier comments have been a bit too abstract, and Richard's post got me thinking about what kinds of calls we might like to have ...
so I'm going to post some incomplete Java declarations that express the kind of protocol that I'm suggesting. Details are not the issue here, but the overall structure is what I'm proposing. Norbert's parser is rather similar to this in some respects, as far as I've seen (unpacked + browsed source, but not executed yet).

/** An XMLParser can be constructed with explicit or default options, and
    always takes an XMLBuilder as an argument. The XMLBuilder is an
    interface that implements a callback for each significant event that
    the parser detects. I think we should also provide a dummy starter
    class that implements XMLBuilder, and implements null operations on
    each event -- otherwise we're making implementors perform a typing
    exercise for events they don't care about.
*/
public interface XMLParser {
    /** You can't put constructors in an interface, but the idea should be
        clear. I'm not sure how Java maps to IDL anyway... */
    Parser(XMLBuilder builder);
        // make a Parser to callback builder with
        // all options set to defaults
    Parser(XMLBuilder builder, XMLOptions options);
        // here we also set the options

    public void parse(String url_start);    // start parsing a document
    public void parse(InputStream input);   // start parsing a document
        // Methods like this may require a base URL argument. They also
        // might not make sense in the public interfaces...

    public void set_options(XMLOptions new_options);
        // change parsing options

    // One way to handle entity resolution is to make it part of the XMLBuilder
    // API, but it may be better to instead have a method like the following.
    // ... And of course a new "protocol object" to encapsulate the operations
    public void set_entity_resolver(XMLResolver resolver);
        // Set external resolution strategy
}

/** If you pass an XMLTreeBuilder to an XMLParser it will create an
    XMLDocumentTree object, and return it to you, letting you keep the
    results of a parse.
*/
public class XMLTreeBuilder implements XMLBuilder {
    public XMLDocumentTree product();   // return the built tree after a parse.
    /* ... XMLBuilder operations omitted ... */
}

/** An XMLDocumentTree should be the start of a nest of document
    representation classes. I don't have many special ideas here, and you
    all probably have a better idea about how it should work than I do. My
    one idea is that it should be able to drive a Builder just the same
    way that a parser does. I'm not sure whether we should be providing
    classes like this, or if everything should be an interface....
*/
public class XMLDocumentTree {
    /** This method takes options, and runs a builder over the document
        tree calling the builder for the virtual events found during
        traversal. Can be useful, if you want to build several different
        views of a document, without building them all in a single pass.
    */
    public void traverse(XMLBuilder builder);
        // traverse the tree with standard options
    public void traverse(XMLBuilder builder, XMLOptions options);
        // traverse with specified options.

    public XMLDocumentElement access_TEI_location(String TEIpointer);
        // We probably won't make methods like this part of the
        // public interface

    /* actual data access methods to be determined...

       I see two main approaches to creating the data access methods:

       1. to create a bunch of particular objects Element, Attribute, etc.
       and allow looking at them directly. This does make for a rather fat
       interface, and a lot of objects. In some contexts this is good (low
       object coupling), in others, bad (currently applets pay a high price
       for using many classes, and this will take at least a year to
       improve).

       2. Create a general node object that can represent an element or
       attribute or entity, etc, and use a general protocol to explicitly
       test and act on node types, and to traverse. This is essentially the
       grove model, as I understand it. The disadvantage is that it's not
       very concrete, and so it's harder to understand. You also lose the
       ability to use type-based dispatching if your programming style
       favors it -- you have to test the generic nodes yourself.

       Either of these models is good, but we need to examine the tradeoffs
       much more carefully and explicitly.
    */
}

/** Simple class that holds flags and other options for an XML parse or
    tree traversal. Default values are made by initialization and can be
    overridden by subclassing and overriding, or by simply assigning values.
*/
public class XMLOptions {
    // there should be flags for each individual type of event. Since that will
    // be a lot of flags, we should consider having some flags that lump together
    // frequently occurring options. e.g.:
    public boolean visit_elements = true;   // Visit elements

    public boolean element_start = true;    // element open events
    public boolean element_end = true;      // element close events
    public boolean expand_external_entities = true;
        // Should external entities be automatically expanded?
    // ....
}

public interface XMLBuilder {
    // I've included a DocumentPosition for each item that has content.
    // This is for full-text indexers, and the like.
    public void start_element(String name);   // an element began
    public void attribute(String name, String value,
                          AttributeDeclarationInfo attinfo);
        // attinfo may be null if the DTD
        // was not parsed, or the parser was requested to
        // discard such information.
    public void internal_entity_reference(String name, String value,
                                          String type);
        // Some applications will need to know this for XML->XML
        // transformation. It's also useful since we no
        // longer have SDATA
    public boolean external_entity_reference(String name, String value,
                                             String type,
                                             String notation_name);
        // The boolean return could be used to allow case-by-case
        // decisions on whether or not to expand the entity in line.
        // This is the alternative to making it just a global
        // option.
        // If an XMLDocument gets a request to parse an unparsed
        // external entity, it should create and invoke a new parser
        // with the options that it was originally created with, and
        // then resume traversing the new items (added to its tree).
    /* ... etc. ... */
}

Just a sketch of the kind of API that I'd like to integrate with.

  -- David
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

From peter at techno.com Tue Mar 4 19:56:12 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:57:28 2004
Subject: API thoughts...
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703041952.OAA11769@exocomp.techno.com>

> Date: Tue, 4 Mar 1997 12:44:09 -0500
> From: dgd@cs.bu.edu (David Durand)
>
> I see two main approaches to creating the data access methods:
>
> 1. to create a bunch of particular objects Element, Attribute, etc. and
> allow looking at them directly. This does make for a rather fat interface,
> and a lot of objects. In some contexts this is good (low object coupling),
> in others, bad (currently applets pay a high price for using many classes,
> and this will take at least a year to improve).
>
> 2. Create a general node object that can represent an element or
> attribute or entity, etc, and use a general protocol to explicitly test and
> act on node types, and to traverse. This is essentially the grove model, as
> I understand it. The disadvantage is that it's not very concrete, and so
> it's harder to understand.
> You also lose the ability to use type-based
> dispatching if your programming style favors it -- you have to test the
> generic nodes yourself.
>
> Either of these models is good, but we need to examine the tradeoffs
> much more carefully and explicitly.

I think that it is important to define the data access API in such a way as to facilitate the use of either or both models: If (1) is implemented as a bunch of classes for element, attribute, etc., but each of those is derived from a general node class as described in (2), then the best of both worlds is available, since applications (or language implementations) that cannot handle the overhead of (1) can use just the generic node interface, and applications that can take advantage of strong typing are not restricted to (2). Also, since the two schemes would be interoperable at the generic node level, different modules of the same application can use either model without regard for what model other modules use.

-peter

--
Peter Newcomb                           TechnoTeacher, Inc.
233 Spruce Avenue                       P.O. Box 23795
Rochester, NY 14611-4041 USA            Rochester, New York 14692-3795 USA
+1 716 464 8696 (home)                  +1 716 464 8696 (direct)
+1 716 755 8698 (cell)                  +1 716 271 0796 (main)
+1 716 529 4304 (fax)                   +1 716 271 0129 (fax)
peter@petes-house.rochester.ny.us       peter@techno.com
http://www.petes-house.rochester.ny.us  http://www.techno.com

From gtn at ebt.com Tue Mar 4 20:06:50 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: API thoughts...
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703042004.PAA04417@nathaniel.ebt>

While I think we all tend toward similar designs, I have a problem with this:

> Parser(XMLBuilder builder, XMLOptions options);
> // here we also set the options
>
> public void parse(String url_start); // start parsing a document

I would rather pass the event *handler* into the parse() call, and for that matter, I would probably be even happier if I could also pass in the document to parse.

From gtn at ebt.com Tue Mar 4 20:36:57 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703041745.MAA11686@exocomp.techno.com> (message from Peter Newcomb on Tue, 4 Mar 1997 12:45:47 -0500)
Message-ID: <199703042033.PAA04428@nathaniel.ebt>

> First I want an entry point that allows the application to query
> the parser implementation what property set modules it supports
> (i.e., what's the richest grove plan available to users), and whether
> or not validation is available.

I think that this is all stuff that should occur *above* the parser. Given an event-driven *parser* API, you can add validation and grove-building services on top, for little or no overhead beyond what such a system would normally incur. Let's first focus on a *parser* API.

>Agreed. However I do not think that an API specification should
>dictate whether the grove is built from the event stream or the event
>stream from the grove; I would regard that as an implementation issue
>since some applications may choose to store documents as character
>streams and others as groves (or collections of objects similar to
>groves). The important thing is that both APIs (event stream and
>grove) be provided.
This amounts to reflective APIs: i.e., a grove can build an event stream and an event stream can build a grove. I have no problem with this as a general *document interface* API, and it's exactly what I have built in my various projects over the years.

However, this is fundamentally *different* to the *parser* API. What is an XML parser? What does it consume? What does it produce?

>I'm not sure exactly what you mean by "mapping documents of various
>MIME types onto XML documents" though I would be interested to know.

Something like what I said before: given a certain level of abstraction, syntax becomes irrelevant. XML would be just one of a number of different syntaxes for the same underlying representation (hey, anyone remember LISP?).

From dgd at cs.bu.edu Wed Mar 5 04:47:51 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
In-Reply-To: <199703041952.OAA11769@exocomp.techno.com>
References: (dgd@cs.bu.edu)
Message-ID:

At 2:52 PM -0500 3/4/97, Peter Newcomb wrote:
>> From: dgd@cs.bu.edu (David Durand)
>>
>> I see two main approaches to creating the data access methods:

1. specific node types for each construct
2. generic node types, with special methods to test type and values.

>> Either of these models is good, but we need to examine the tradeoffs
>> much more carefully and explicitly.
>
>I think that it is important to define the data access API in such a
>way as to facilitate the use of either or both models: If (1) is
>implemented as a bunch of classes for element, attribute, etc., but
>each of those is derived from a general node class as described in
>(2), then the best of both worlds is available, since applications (or
>language implementations) that cannot handle the overhead of (1) can
>use just the generic node interface, and applications that can take
>advantage of strong typing are not restricted to (2). Also, since the
>two schemes would be interoperable at the generic node level,
>different modules of the same application can use either model without
>regard for what model other modules use.

Your description of the tradeoffs confuses me. It seems that a set of specific classes corresponding to node types is the _easy_ solution for a programmer. You need a little dynamic typing in the content of elements, but you can let virtual methods do most of the work for you. With generic elements you need to test explicitly for everything. I can see that the "one-type" model may make some kinds of transformation engine (mainly ones that use the grove model already) and low-level operations like serialization easier, but actually it seems that most applications _need_ to do something different for an attribute than an element, most of the time.

The practical reason to have one type currently is just time on the wire for applets (if there are going to _be_ XML applets).

One thing to think about is what interface would be easier for an applet author to deal with. I can see applets that do custom rendering based on the markup of a subtree of the document, and can get their input pre-parsed (even validated by a DTD, if they want to cut some error code out). What interface makes implementing such an applet easiest?

Anyhow, I don't yet have a very clear sense that the generic node approach offers much more than a trimmed class tree.
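
As an aside for readers of the archive, the hybrid that Peter Newcomb describes in the quoted text (specific classes that all derive from one generic node class) might look roughly like the sketch below. Every name in it (Node, Element, Attribute, nodeType, get, gi) is an assumption made for illustration and comes from no grove specification or existing parser. The specific classes keep their properties in ordinary fields, yet the generic property-getting call still works on them.

```java
// Generic node (model 2): the type and every property are reachable by name.
// All names here are illustrative, not taken from any grove specification.
abstract class Node {
    abstract String nodeType();
    abstract Object get(String property);   // returns null for an unknown property
}

// Specific classes (model 1) keep properties in ordinary fields and add
// typed accessors, while still answering the generic get() call.
class Element extends Node {
    private final String gi;                // "generic identifier", i.e. the element name
    Element(String gi) { this.gi = gi; }
    String gi() { return gi; }
    @Override String nodeType() { return "element"; }
    @Override Object get(String property) { return property.equals("gi") ? gi : null; }
}

class Attribute extends Node {
    private final String name, value;
    Attribute(String name, String value) { this.name = name; this.value = value; }
    String name()  { return name; }
    String value() { return value; }
    @Override String nodeType() { return "attribute"; }
    @Override Object get(String property) {
        switch (property) {
            case "name":  return name;
            case "value": return value;
            default:      return null;
        }
    }
}

class NodeDemo {
    // A generic client has to test the node type itself.
    static String describeGeneric(Node n) {
        if (n.nodeType().equals("element")) return "element " + n.get("gi");
        return n.nodeType() + " " + n.get("name") + "=" + n.get("value");
    }

    // A typed client lets the compiler pick the right overload.
    static String describe(Element e)   { return "element " + e.gi(); }
    static String describe(Attribute a) { return "attribute " + a.name() + "=" + a.value(); }

    public static void main(String[] args) {
        Element para = new Element("para");
        Attribute id = new Attribute("id", "a1");
        System.out.println(describeGeneric(para) + " / " + describe(para));
        System.out.println(describeGeneric(id) + " / " + describe(id));
    }
}
```

Both clients produce the same description for the same node, so modules written in either style can share one tree. The cost is the one discussed in this thread: the generic client carries explicit type tests that the typed client gets for free.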
Am I missing something?

  -- David
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

From dgd at cs.bu.edu Wed Mar 5 04:48:00 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:29 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703042033.PAA04428@nathaniel.ebt>
References: <199703041745.MAA11686@exocomp.techno.com> (message from Peter Newcomb on Tue, 4 Mar 1997 12:45:47 -0500)
Message-ID:

At 3:33 PM -0500 3/4/97, Gavin Nicol wrote:
>> First I want an entry point that allows the application to query
>> the parser implementation what property set modules it supports
>> (i.e., what's the richest grove plan available to users), and whether
>> or not validation is available.
>
>I think that this is all stuff that should occur *above* the parser.
>Given an event-driven *parser* API, you can add validation
>and grove building services on top, for little or no overhead
>beyond what such a system would normally incur.

Yes, exactly. We may decide that some of these are so important that they should be required from all implementations, and then we may not. But I think it's easier and more sensible to define the event interface first. Anyway, once we have an event-handler object, a structure holder can generate events for it, just as easily as the event handler can create a structure object.

>>Agreed. However I do not think that an API specification should
>>dictate whether the grove is built from the event stream or the event
>>stream from the grove; I would regard that as an implementation issue
>>since some applications may choose to store documents as character
>>streams and others as groves (or collections of objects similar to
>>groves). The important thing is that both APIs (event stream and
>>grove) be provided.

The event handler is an object with a lot of methods of the form:

    handle_some_event(here's the data);
    etc.

The grove object is the kind of thing that has methods like:

    Type what_kind_of_node_am_i();
    Node what's_my_ancestor();
    String what's_my_name();
    etc.

I think we should provide both interfaces, and for some language binding (I like Java), we should implement one interface in terms of the other. But this actually only really makes sense one way....

>This amounts to reflective API's: ie. a grove can build an event
>stream and an event stream can build a grove. I have no problem with
>this as a general *document interface* API, and it's exactly what I
>have built in my various projects over the years.

And I hope we all see how trivial it would be to even provide an implementation of both these operations.

>However, this is fundamentally *different* to the *parser* API. What
>is an XML parser? What does it consume? What does it produce?

Exactly. And here the event model is _in some sense_ the foundation, because parsers (even those that build trees automatically) typically recognize events before they create structures. So if we want to maximize the mix and match of our interfaces, we have a parser that sends events, and a document representation (structure) that accepts queries. We also provide _implementations_ of an event-handling object that builds a document representation, and the representation has a method to accept an event handler and pass it the events corresponding to the document's structure -- and we're there.
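
The arrangement described above (a source of events, a handler that builds a tree, and a tree that can replay itself as events) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: EventHandler, TreeNode, TreeBuilder, Serializer and fakeParse are hypothetical names, the "parser" is a stand-in that emits a fixed event sequence, and text is not escaped.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical event interface: one method per kind of event.
interface EventHandler {
    void startElement(String name);
    void text(String data);
    void endElement(String name);
}

// Hypothetical document representation: holds structure and can replay it.
class TreeNode {
    final String name;   // element name, or null for a text node
    final String text;   // character data, or null for an element
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, String text) { this.name = name; this.text = text; }

    // Walk the subtree, sending the same events a parser would have sent.
    void traverse(EventHandler handler) {
        if (name == null) { handler.text(text); return; }
        handler.startElement(name);
        for (TreeNode child : children) child.traverse(handler);
        handler.endElement(name);
    }
}

// An event handler whose only job is to build a tree.
// It assumes well-nested events, with all text inside some element.
class TreeBuilder implements EventHandler {
    private final Deque<TreeNode> open = new ArrayDeque<>();
    private TreeNode root;

    public void startElement(String name) {
        TreeNode node = new TreeNode(name, null);
        if (open.isEmpty()) root = node; else open.peek().children.add(node);
        open.push(node);
    }
    public void text(String data) { open.peek().children.add(new TreeNode(null, data)); }
    public void endElement(String name) { open.pop(); }
    TreeNode product() { return root; }
}

// A second handler that writes markup, to show that a tree can drive it too.
class Serializer implements EventHandler {
    final StringBuilder out = new StringBuilder();
    public void startElement(String name) { out.append('<').append(name).append('>'); }
    public void text(String data) { out.append(data); }
    public void endElement(String name) { out.append("</").append(name).append('>'); }
}

class RoundTripDemo {
    // Stand-in for a parser: emits the events for <doc><p>hi</p></doc>.
    static void fakeParse(EventHandler h) {
        h.startElement("doc"); h.startElement("p"); h.text("hi");
        h.endElement("p"); h.endElement("doc");
    }

    public static void main(String[] args) {
        TreeBuilder builder = new TreeBuilder();
        fakeParse(builder);                  // events -> tree
        Serializer s = new Serializer();
        builder.product().traverse(s);       // tree -> events
        System.out.println(s.out);           // prints <doc><p>hi</p></doc>
    }
}
```

Because the tree replays the same events a parser would send, any handler written against the event interface works unchanged on a stored document, which seems to be the "mix and match" being argued for here.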
>Something like what I said before: given a certain level of
>abstraction, syntax becomes irrelevant. XML would be just one of a
>number of different syntaxes for the same underlying representation
>(hey, anyone remember LISP?).

Just in case that's not clear, you can parse any old format and then decide to send XML events to represent whatever it was that the parse tree of the foreign format had in it... presto-change-o a GIF->XML translator is born! (now the question is how to kill it off again...)

  -- David
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

From nmikula at edu.uni-klu.ac.at Wed Mar 5 08:06:06 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
References:
Message-ID: <331DA7CE.ECF@edu.uni-klu.ac.at>

I just want to describe briefly how the current API of NXP is designed.

There is a Java interface declaration that I have called ESIS. ESIS defines a set of callback routines which, I believe, is pretty close to what ESIS is supposed to deliver to an application. I designed this interface for my SGML parser Cappuccino. This was done quite a while ago and I never really got feedback on it, so it might not be complete.

Applications can make use of ESIS by implementing that interface. One example, which is in the distribution, is to send the output to stdout. Yet another example, and I was using it that way with my SGML parser Cappuccino, is to build a grove. This grove can then be traversed and worked with. The grove API would also need some discussion. I think Alex Milowski has already done some work on that. The tree-builder/tree-traversal idea also sounds good to me.

Grove vs. ESIS: I, like others, believe that the grove builder can be seen as a layer above the ESIS interface. However, I guess we will need to increase the number of callbacks and hence probably find another name for the interface. For many applications creating a grove would probably be overkill.

--
Best regards,
Norbert H. Mikula

=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================

From Bill.Smith at Eng.Sun.COM Wed Mar 5 18:02:35 1997
From: Bill.Smith at Eng.Sun.COM (Bill Smith)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
Message-ID:

A small point, but one I think important.

The term "callback" doesn't make much sense in Java since (if I remember correctly) you can't pass function pointers in Java - there are no pointers. If we are language-independent but object-oriented, callback is still the wrong term.

Abstract class or interface would be more accurate.

From gtn at ebt.com Wed Mar 5 19:55:32 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
In-Reply-To: (message from Bill Smith on Wed, 5 Mar 1997 10:00:44 -0800 (PST))
Message-ID: <199703051952.OAA06206@nathaniel.ebt>

>The term "callback" doesn't make much sense in Java since (if I remember
>correctly) you can't pass function pointers in Java - there are no pointers.
>If we are language-independent but object-oriented, callback is still the
>wrong term.
>
>Abstract class or interface would be more accurate.

While I agree with everything, I should note that you can call a method by name in Java (i.e. you can fake callbacks).

From tbray at textuality.com Wed Mar 5 20:27:22 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
Message-ID: <3.0.32.19970305122554.007d55c0@pop.intergate.bc.ca>

At 10:00 AM 3/5/97 -0800, Bill Smith wrote:
>The term "callback" doesn't make much sense in Java since (if I remember
>correctly) you can't pass function pointers in Java

Well, the Lark event-stream API sure looks & feels like a bunch of callbacks. You make a Lark object, call its readXML() method with one argument being a Handler object; Handler being a data-less class that just has a bunch of methods called things like doPI() and doStartTag() and doEntityReference() and doText() and so on; you'd normally subclass Handler replacing the methods for the events you wanted to see, and pass in that kind of object. Lark calls these upon recognizing the constructs in the input stream, passing the byte offset info, the element & entity stack (*if* you're treebuilding), and other currently relevant info. These methods are all booleans; if any returns true, Lark stops and returns control to whoever called readXML().

Surely the GUI experience has taught us by now that a callback interface is the way to go...
anyone remember [shudder] XNextEvent()?

I am somewhat amused by all the Java propagandists saying "Java is so much safer because we don't have pointers"... of course most variables are in fact object pointers, and every object is in fact an Object, and every array is in fact an Object, and you sure can wreak some good old-fashioned C-style destruction on yourself when you accidentally treat a pointer to a "byte[] foo" (oops, an object not a pointer) as an oops-an-object-not-a-pointer-to-a "char foo[]". Still, Java is appealingly clean.

Note for XML developers... I just finished putting correct attribute defaulting (internal subset decls only, sorry) into Lark (new version soon) - it nearly doubled the number of parsing states and class file sizes... sigh.

On another subject, I really have trouble with trying to pretend that Element and Attribute and Entity and so on are just flavors of some abstract Node thingie - the idea of having separate classes/objects for these things just feels natural at a really deep level. One of the *nice* things about SGML and XML is that even if the markup is complicated, the number of underlying objects is pretty limited and maps neatly into a class framework.

- Tim

From Bill.Smith at Eng.Sun.COM Wed Mar 5 22:16:17 1997
From: Bill.Smith at Eng.Sun.COM (Bill Smith)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
Message-ID:

> Well, the Lark event-stream API sure looks & feels like a bunch of
> callbacks.
> You make a Lark object, call its readXML() method with one
> argument being a Handler object; Handler being a data-less class that
> just has a bunch of methods called things like doPI() and doStartTag() and
> doEntityReference() and doText() and so on; you'd normally subclass Handler
> replacing the methods for the events you wanted to see, and pass in
> that kind of object. Lark calls these upon recognizing
> the constructs in the input stream, passing the byte offset info, the
> element & entity stack (*if* you're treebuilding), and other currently
> relevant info. These methods are all booleans; if any returns true,
> Lark stops and returns control to whoever called readXML().

Another way to do this is to have the Lark object (or interface) define the event methods rather than have a separate Handler object. When it's time to parse something, create a subclass that overrides the (standard) event methods for the Lark object.

A possible advantage to this method is that it makes clear the inheritance relationship between the "standard" parser and something more specific. It is also "easier" to create a more specific parser from an existing parser object - simply subclass the existing parser and override the methods required to provide the desired new functionality.

The Lark model "hides" the inheritance relationship in the Handler object, making it necessary to look inside a Lark object to determine the type of a given parser (something you might need to do when debugging). An alternative is to create a new parser object that contains a subclassed event handler. This makes it possible to distinguish the type of parser at the "outer" level but requires two new objects instead of one to perform the subclass.

I'm not a parser expert so the subclass model may not make any sense but this is a mechanism I have successfully used building other object-oriented (including GUI-based) systems.
I have also used callbacks but find them most useful when forced to use C or other non-object-based languages.

From bosak at atlantic-83.Eng.Sun.COM Thu Mar 6 00:19:01 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:29 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <199703060017.QAA00505@boethius.eng.sun.com>

Within the next week or so, I expect to announce the availability of a Web server that can respond to certain kinds of URLs with an XML data stream (eventually, a variety of XML data streams). For our own design purposes, and also for the purposes of experimenters working with combined XML/DSSSL applications, I would like to see this group come up with an unofficial method or methods by which to associate a DSSSL style sheet with a particular chunk of XML. Such methods would be far in advance of the sgml-wg specification effort and subject to later revision, but given the influence of experimental implementations, I think that it's appropriate to put a little bit of thought into the design.

One possible method suggested by James Clark (thank you, James) is to adopt the convention used by Jade in the absence of the -d option: replace the extension of the document entity's URL or file name with .dsl and fetch that. Thus, if a browser fetches

   http://docs.sun.com/foo/bar.html

then it should also look for

   http://docs.sun.com/foo/bar.dsl

and apply it to bar.html if found.

This is appealingly straightforward, but I wonder how well it accommodates multiple stylesheets and stylesheets that use other notations (CSS, for example).
Of course, we could deal with the second concern by saying that DSSSL is the default stylesheet language for XML experimentation and that we will figure out some way to accommodate other stylesheet languages later.

James lists some other possibilities:

| - a processing instruction somewhere in the prolog
|
| - a catalog entry that says unconditionally to use some DSSSL style
|   sheet
|
| - a catalog entry that associates a DSSSL style sheet with the public
|   identifier of a DTD
|
| - make the document serve also as a style sheet by making it conform
|   to the DSSSL architecture (this will work with Jade too)

Any thoughts on this? I am, of course, particularly interested in hearing from those of you who are actively building DSSSL applications.

Jon

From gtn at ebt.com Thu Mar 6 00:40:23 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:29 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <199703060017.QAA00505@boethius.eng.sun.com> (bosak@atlantic-83.Eng.Sun.COM)
Message-ID: <199703060037.TAA06393@nathaniel.ebt>

| - a catalog entry that says unconditionally to use some DSSSL style
|   sheet
|
| - a catalog entry that associates a DSSSL style sheet with the public
|   identifier of a DTD

When MIME-SGML was still doing something useful, a proposal to add a SEMANTIC catalog entry was floated. This should be the preferred method, I think. (I've not looked at TR 9401 for some time, so it may have been supplanted).

My next favourite would be to have some explicit link in the document itself (either PI, or element).
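
The extension-replacement convention described in Jon's message above (fetch bar.dsl alongside bar.html) is simple enough to sketch in Java. This is only an illustration: the class and method names are invented here and do not come from Jade or any browser, and the sketch assumes the URL has a path component and carries no query string or fragment. Passing a different extension would cover stylesheets in another notation such as CSS.

```java
/** Illustrative helper; the name and API are assumptions, not part of Jade or any browser. */
final class StylesheetLocator {
    private StylesheetLocator() {}

    /**
     * Replaces the extension of the last path segment with the given one,
     * or appends it if the last segment has no extension.
     * Assumes the URL has a path and carries no query string or fragment.
     */
    static String withExtension(String documentUrl, String extension) {
        int slash = documentUrl.lastIndexOf('/');
        int dot = documentUrl.lastIndexOf('.');
        // Only a dot that comes after the last slash belongs to the file name.
        String base = (dot > slash) ? documentUrl.substring(0, dot) : documentUrl;
        return base + "." + extension;
    }

    public static void main(String[] args) {
        String doc = "http://docs.sun.com/foo/bar.html";
        System.out.println(withExtension(doc, "dsl"));
        System.out.println(withExtension(doc, "css"));
    }
}
```

A client would then try to fetch the derived URL and fall back to some default if nothing is found; that fetch step is left out because the thread has not settled how a missing stylesheet should be handled.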

From cbullard at hiwaay.net Thu Mar 6 01:04:10 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:29 2004
Subject: Associating DSSSL style sheets with documents
References: <199703060037.TAA06393@nathaniel.ebt>
Message-ID: <331E17FA.7EA3@hiwaay.net>

Gavin Nicol wrote:

A catalog entry that associates per type is good, but it does tie you to the DTD.

> My next favourite would be to have some explicit link in the
> document itself (either PI, or element).

We use that technique. It works but the user better keep up with the piece parts per instance. What works better but is even more hassle is to enable styles to be called from different parts of the document when needed. This should be something implementors can experiment with. A monolithic stylesheet that handles multiple DTDs is also an option and works when one is careful with the namespace. Having styles that are local to parts of the document is useful, as you know, when one does not want to write a complex stylesheet for documents that have lots of context conditions.

len

From gtn at ebt.com Thu Mar 6 01:12:36 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <331E17FA.7EA3@hiwaay.net> (message from len bullard on Wed, 05 Mar 1997 19:03:54 -0600)
Message-ID: <199703060110.UAA06491@nathaniel.ebt>

>A catalog entry that associates per type is good, but it
>does tie you to the DTD.

What do you mean "per type"?
In DynaText, we actually use something like the proposal:

SEMANTICS "popup" "ebt-fulltext-stylesheet" "Pop-Up Graphics" "grphpop.v"
SEMANTICS "serif" "ebt-fulltext-stylesheet" "Serif Font" "serif.v"

>Having styles that are local to parts of the document are useful, as
>you know, when one does not want to write a complex stylesheet for
>documents that have lots of context conditions.

Yes. Multiple stylesheets could be easier than styles qualified by context in some cases. It really amounts to the same thing though the binding mechanism is different....

From edz at bsn.com Thu Mar 6 03:09:30 1997
From: edz at bsn.com (Edward C. Zimmermann)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <199703060017.QAA00505@boethius.eng.sun.com> from "Jon Bosak" at Mar 5, 97 04:17:45 pm
Message-ID: <199703060310.EAA06638@hampton.bsn.com>

>
> Within the next week or so, I expect to announce the availability of a
> Web server that can respond to certain kinds of URLs with an XML data
> stream (eventually, a variety of XML data streams). For our own
> design purposes, and also for the purposes of experimenters working
> with combined XML/DSSSL applications, I would like to see this group
> come up with an unofficial method or methods by which to associate a
> DSSSL style sheet with a particular chunk of XML. Such methods would
> be far in advance of the sgml-wg specification effort and subject to
> later revision, but given the influence of experimental
> implementations, I think that it's appropriate to put a little bit of
> thought into the design.
>
> One possible method suggested by James Clark (thank you, James) is to
> adopt the convention used by Jade in the absence of the -d option:
> replace the extension of the document entity's URL or file name with
> .dsl and fetch that. Thus, if a browser fetches
>
> http://docs.sun.com/foo/bar.html
>
> then it should also look for
>
> http://docs.sun.com/foo/bar.dsl

Since no public DTDs must have the DTD, viz. a URL to DTD.. and from the name/path/URL to DTD then one can use the extension .dsl or whatever for the DSSSL.

The problem with using .dsl as the map from the URL .extension is that if one has a class of documents built around a DTD and that has a DSSSL "style sheet" then one will either need to have a front-end server to manage this whole bit (why) or fill the place with symbolic links.. The problem with both is that proxy caches will get filled with redundant bits... Why not use the DTD URL as base? Either this or I don't understand what your aims are, or I'm totally lost:-)

The other alternative, of course, would be to extend HTTP to return a request for the association of a URL to the .dsl from a file.. That is probably the better and more flexible way but it won't work with popular off-the-shelf browsers and thus is ill-suited to experiments......

>
> and apply it to bar.html if found.
>
> This is appealingly straightforward, but I wonder how well it
> accommodates multiple stylesheets and stylesheets that use other
> notations (CSS, for example). Of course, we could deal with the
> second concern by saying that DSSSL is the default stylesheet language
> for XML experimentation and that we will figure out some way to
> accommodate other stylesheet languages later.
>
> James lists some other possibilities:
>
> | - a processing instruction somewhere in the prolog
> |
> | - a catalog entry that says unconditionally to use some DSSSL style
> |   sheet
> |
> | - a catalog entry that associates a DSSSL style sheet with the public
> |   identifier of a DTD
> |
> | - make the document serve also as a style sheet by making it conform
> |   to the DSSSL architecture (this will work with Jade too)
>
> Any thoughts on this? I am, of course, particularly interested in
> hearing from those of you who are actively building DSSSL
> applications.
>
> Jon

--
______________________
Edward C. Zimmermann
Basis Systeme netzwerk/Munich

From cbullard at hiwaay.net Thu Mar 6 03:32:02 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
References: <199703060110.UAA06491@nathaniel.ebt>
Message-ID: <331E3AAC.1A9F@hiwaay.net>

Gavin Nicol wrote:
>
> >A catalog entry that associates per type is good, but it
> >does tie you to the DTD.
>
> What do you mean "per type"? In DynaText, we actually use something
> like the proposal:
>
> SEMANTICS "popup" "ebt-fulltext-stylesheet" "Pop-Up Graphics" "grphpop.v"
> SEMANTICS "serif" "ebt-fulltext-stylesheet" "Serif Font" "serif.v"

I thought you meant, per DTD.

> >Having styles that are local to parts of the document are useful, as
> >you know, when one does not want to write a complex stylesheet for
> >documents that have lots of context conditions.
>
> Yes.
> Multiple stylesheets could be easier than styles qualified by
> context in some cases. It really amounts to the same thing though
> the binding mechanism is different....

Yes and no. The problem with the FOSI was that even though it worked, it was hard to specify style on elements in context when the contexts were complex. We combine context and local stylesheets. So, a parentage can be used, but a local stylesheet can introduce a new one, so the complexity is localized as well. Conservation of complexity: we have more stylesheets to manage per instance.

len

From jjc at jclark.com Thu Mar 6 03:34:36 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306032432.00a75998@jclark.com>

At 19:37 05/03/97 -0500, Gavin Nicol wrote:
>| - a catalog entry that says unconditionally to use some DSSSL style
>|   sheet
>|
>| - a catalog entry that associates a DSSSL style sheet with the public
>|   identifier of a DTD
>
>When MIME-SGML was still doing something useful, a proposal to
>add a SEMANTIC catalog entry was floated. This should be the
>preferred method, I think.

As far as I remember the SEMANTIC catalog entry proposal had several arguments:

a) the type of processing spec (DSSSL or EBT style sheets)
b) the system identifier for the processing spec
c) some sort of description you could display in a menu

I think there was something else, but I don't remember.

Requiring (c) is not appropriate for DSSSL, since DSSSL specification documents can contain multiple separate stylesheets each with their own description, which is specified inside the DSSSL specification document (the DESC attribute on the style-specification form).
This seems like a general problem: different kinds of processing specification may require different sets of arguments to invoke them. Since vendors and users can add their own SGML Open entry types, I see no advantage to having a general SEMANTIC entry with a type attribute, rather than a separate entry type for each type of processing spec.

So if we are going to use a catalog entry (and I'm not yet convinced this is the best solution) I would suggest having a simple DSSSL entry which looks like:

DSSSL spec.dsl

One complication is that a DSSSL spec is itself an SGML document, which may require a different catalog for parsing. This probably doesn't matter in the context of XML, but it does in SGML: the DSSSL spec may well need a different implied SGML declaration. I'm not sure what the best way to handle this is.

Of course this doesn't completely solve the problem: we now have to figure out how to associate a catalog with an SGML document. The latest SGML Open draft requires (amongst other things) trying the URL/filename of the document with any extension replaced by .soc.

James

From jjc at jclark.com Thu Mar 6 03:34:46 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306032436.00aa635c@jclark.com>

At 16:17 05/03/97 -0800, Jon Bosak wrote:
>One possible method suggested by James Clark (thank you, James) is to
>adopt the convention used by Jade in the absence of the -d option:
>replace the extension of the document entity's URL or file name with
>.dsl and fetch that. Thus, if a browser fetches
>
> http://docs.sun.com/foo/bar.html
>
>then it should also look for
>
> http://docs.sun.com/foo/bar.dsl
>
>and apply it to bar.html if found.
>
>This is appealingly straightforward, but I wonder how well it
>accommodates multiple stylesheets

A DSSSL specification document can contain any number of distinct style specifications: it can also contain links to other DSSSL specification documents.

>and stylesheets that use other
>notations (CSS, for example).

Use another extension.

>James lists some other possibilities:
>
>| - a processing instruction somewhere in the prolog
>|
>| - a catalog entry that says unconditionally to use some DSSSL style
>|   sheet
>|
>| - a catalog entry that associates a DSSSL style sheet with the public
>|   identifier of a DTD
>|
>| - make the document serve also as a style sheet by making it conform
>|   to the DSSSL architecture (this will work with Jade too)

Another possibility I forgot to mention is to have a parameter on the Content-Type header field:

Content-Type: text/xml; stylesheet=foo.dsl

This is only going to work in the context of HTTP.

The type of the stylesheet could be indicated by its Content-Type, and the client could use content-type negotiation to ensure it gets the kind of stylesheet it can handle. Somebody should probably register a MIME content-type for DSSSL style sheets as well.

James

From housel at ms7.hinet.net Thu Mar 6 04:34:54 1997
From: housel at ms7.hinet.net (Peter S. Housel)
Date: Mon Jun 7 16:57:30 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703060427.MAA29885@ms7.hinet.net>

Gavin Nicol (gtn@ebt.com) wrote:
>I would tend toward an event-driven interface, and an
>option-setting interface as the core parser API. For example:
>
>class XMLEventHandler {
>public boolean OnComment(String comment);
>public boolean OnElementStart(...)
>....
>}
>
>class XMLParser {
>...
>parser(XMLEventHandler handler);
>...
>}

That's one way of doing things. The main problem I see with this interface is that there are quite a few possible methods (I count 71 classdefs in the SGML property set, though of course not all of those are applicable to XML), and it becomes difficult to expand the set of events.

There's also the issue of "who's in charge?" This is actually a tough issue. I like the way P.J. Plauger put it in Programming on Purpose: when you're designing the program's architecture, first you draw a graph of nodes, with arrows showing the flow of information from subsystem to subsystem. Then you grab a node and shake the graph. What you get is your call graph, with the main processing loop located in the node you shook, making requests to the other subsystems.

As much as possible, a good reusable component should not force the user's hand when choosing what node to grab onto. As an example, YACC is pretty bad about this. You supply it with a lexer (with a fixed name) and a set of handlers to be called when productions are reduced. The YACC-generated parser insists on being in charge.

If all of today's popular languages had coroutines, we wouldn't have this problem. Every component could be written as if it were in charge. Unfortunately, most languages don't have a portable coroutine facility.

For an XML document parsing system, the components we need to consider are:

1. An external entity manager, responsible for obtaining document instances (the "start" document and others), DTD's, etc. from local storage, the web, some database, etc. This should probably be user-customizable.

2. An encoding manager, responsible for mapping one of the possible XML document encodings (Latin-n, UTF-7, UTF-8, UCS-2, UTF-16, whatever) onto ISO10646 characters.

3. The parser itself, responsible for turning characters into XML events, and possibly into grove structures.

4. The user's application.
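
To make the division of labour among the four components above concrete, here is a rough Java sketch of how they might be wired together. Everything in it is an assumption made for illustration: the interface names, the in-memory entity store, the "mem:doc" identifier, and above all the "parser", which is a toy that only recognises start tags and is in no way an XML parser. It uses lambdas for brevity.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** 1. External entity manager: resolves an identifier to raw bytes. */
interface EntityManager {
    byte[] fetch(String systemId);
}

/** 2. Encoding manager: maps an encoded byte stream onto characters. */
interface EncodingManager {
    String decode(byte[] raw);
}

/** 4. The user's application, seen from the parser as an event sink. */
interface Application {
    void startTag(String name);
}

/** 3. A toy stand-in for the parser: it reports start tags and nothing else. */
final class ToyParser {
    private static final Pattern START_TAG =
            Pattern.compile("<([A-Za-z][A-Za-z0-9._-]*)[^<>]*>");

    private final EntityManager entities;
    private final EncodingManager encoding;

    ToyParser(EntityManager entities, EncodingManager encoding) {
        this.entities = entities;
        this.encoding = encoding;
    }

    /** Fetches the entity, decodes it, and reports each start tag to the application. */
    void parse(String systemId, Application app) {
        String text = encoding.decode(entities.fetch(systemId));
        Matcher m = START_TAG.matcher(text);
        while (m.find()) {
            app.startTag(m.group(1));
        }
    }
}

final class ComponentsDemo {
    /** Wires the four components together and returns the element names seen. */
    static List<String> run() {
        // Hypothetical in-memory store standing in for local storage or the web.
        Map<String, byte[]> store = new HashMap<>();
        store.put("mem:doc", "<doc><p>hi</p></doc>".getBytes(StandardCharsets.UTF_8));

        EntityManager entities = store::get;
        Charset utf8 = StandardCharsets.UTF_8;
        EncodingManager encoding = raw -> new String(raw, utf8);

        List<String> seen = new ArrayList<>();
        new ToyParser(entities, encoding).parse("mem:doc", seen::add);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Keeping the entity manager and encoding manager as separate interfaces means either can be swapped (a network fetcher, a different character encoding) without touching the parser or the application.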
As far as I can see, we have the following scenarios:

* [Browser] If you're building a web browser, you want the network interrupt to be in charge. That is, when a packet's worth of document/DTD/whatever data comes in from the net, the parser should use that to parse as much of the document as it can, and pass as many events on to the application as possible. This gives optimal user response, provided you don't need the whole document to start displaying it. The external entity manager would have a callback for requesting additional external entities, that would add the request to an internal queue and return immediately to the parser.

In this architecture, the user would create a parser object by specifying an external entity manager callback, a set of parser options (grove plan, validate or not, etc.), and an XMLEventHandler like the one shown above. Then your external entity manager would send a message to the parser object giving it a buffer full of bytes and an indication of which entity they belong to.

* [YACC] You may want the parser to be in charge, like YACC. In this case you would call the parser, specifying the external entity manager object (written using the Strategy pattern), list of options, and an XMLEventHandler object (which corresponds to the Builder pattern).

* [XMLEventStream] You want some part or another of your application to be in charge, and you want a stream of XMLEvent objects. In this case, you create a parser object (XMLEventStream), specifying an external entity manager object, a start document, and a list of options. You send a message to this object whenever you want another event from the stream.

* [Grove] You want to access nodes in a grove. So, you pass in your start document and your options, and you get a root node back. The parser might construct the whole grove, or do it lazily when you ask for a property that hasn't been computed yet.
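
The difference between the [XMLEventStream] and [YACC] scenarios above is who drives: in the first the application pulls events, in the second the parser pushes them. A push interface can be layered over a pull one with a short loop, as the following Java sketch suggests. The types here are hypothetical and deliberately minimal; the XMLEventHandler below is not Gavin's interface, and the "return true to stop" convention is borrowed from the description of Lark earlier in the thread. The event source is a canned list, not a real parser.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** A parser event; the two fields are a deliberately minimal assumption. */
final class XMLEvent {
    final String kind;
    final String data;

    XMLEvent(String kind, String data) {
        this.kind = kind;
        this.data = data;
    }
}

/** Pull style ([XMLEventStream]): the caller asks for the next event. */
interface XMLEventStream {
    XMLEvent next(); // returns null when the document is exhausted
}

/** Push style ([YACC]): the driver calls the application. */
interface XMLEventHandler {
    boolean onEvent(XMLEvent event); // true asks the driver to stop
}

final class PushFromPull {
    /** Drives a pull stream into a push handler; returns the number of events delivered. */
    static int drive(XMLEventStream stream, XMLEventHandler handler) {
        int delivered = 0;
        for (XMLEvent e = stream.next(); e != null; e = stream.next()) {
            delivered++;
            if (handler.onEvent(e)) {
                break;
            }
        }
        return delivered;
    }

    /** A canned stream standing in for a real parser. */
    static XMLEventStream canned(List<XMLEvent> events) {
        Iterator<XMLEvent> it = events.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        List<XMLEvent> events = Arrays.asList(
                new XMLEvent("start", "doc"),
                new XMLEvent("text", "hi"),
                new XMLEvent("end", "doc"));
        int n = drive(canned(events), e -> {
            System.out.println(e.kind + " " + e.data);
            return false; // never stop early
        });
        System.out.println(n + " events delivered");
    }
}
```

Going the other way, from push to pull, is the hard direction without coroutines: the parser would have to run on its own thread or buffer every event before the application could ask for the first one.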
These scenarios assume that the document(s) are stored in ordinary files or on the web. As Peter Newcombe pointed out, another scenario is when the document is stored in a database, possibly in grove form. In this case being able to specify an entity manager probably isn't desirable, and the [Browser] scenario probably doesn't fit at all.

So, which of these scenarios do we want to specify for an XML API? Should all of them be? Should [Browser] be one of the ones included?

[Browser] gives the most complicated parser, since it has to asynchronously handle information from several different documents. [YACC] is the easiest to write, but it's less flexible. Given [Browser], it's easy to write [YACC]. (Given [XMLEventStream] you can also derive [YACC], but with greater overhead.) [XMLEventStream] and [Grove] give you the most flexibility with respect to the grove plan.

Hope this helps to clarify the issues a little. I've been thinking about this for a while, in the context of reusable parser components for programming languages, but the only firm conclusion I've come to is that I really wish I could use coroutines.

-Peter-

From bosak at atlantic-83.Eng.Sun.COM Thu Mar 6 05:02:26 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <199703060310.EAA06638@hampton.bsn.com> (edz@bsn.com)
Message-ID: <199703060501.VAA00624@boethius.eng.sun.com>

[Edward C.
Zimmermann:]

| The problem with using .dsl as the map from the URL .extension is that
| if one has a class of documents built around a DTD and that has a
| DSSSL "style sheet" then one will either need to have a front-end
| server to manage this whole bit (why) or fill the place with symbolic
| links.. The problem with both are that proxy caches will get filled
| with redundant bits... Why not use the DTD URL as base?

Although we're in the habit of thinking this way, there is in fact no one-to-one correspondence between stylesheets and DTDs. It is possible to write a catch-all stylesheet that will work with documents written to a number of DTDs, and conversely it's not only possible but often desirable to create a number of stylesheets that are all designed to work with documents written to a single DTD.

And then there's the fact that DTDs are optional in XML...

Jon

From bosak at atlantic-83.Eng.Sun.COM Thu Mar 6 05:21:05 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970306032436.00aa635c@jclark.com> (message from James Clark on Thu, 06 Mar 1997 10:24:36 +0700)
Message-ID: <199703060519.VAA00633@boethius.eng.sun.com>

[James Clark:]

| >This is appealingly straightforward, but I wonder how well it
| >accommodates multiple stylesheets
|
| A DSSSL specification document can contain any number of distinct style
| specifications: it can also contain links to other DSSSL specification
| documents.

That's what I thought, but I haven't tried it yet.

| >and stylesheets that use other
| >notations (CSS, for example).
|
| Use another extension.

Yes, but then how do you determine precedence if both foo.dsl and foo.css are found?
That's why I said that a way out (admittedly not a very good one) would be to default to DSSSL for the time being.

| Another possibility I forgot to mention is to have a parameter on the
| Content-Type header field:
|
| Content-Type: text/xml; stylesheet=foo.dsl
|
| This is only going to work in the context of HTTP.

Well, HTTP delivery is what I'm trying to get set up.

| The type of the stylesheet could be indicated by its Content-Type, and
| the client could use content-type negotiation to ensure it gets the
| kind of stylesheet it can handle. Somebody should probably register a
| MIME content-type for DSSSL style sheets as well.

Both the xml and dsssl content-type registrations are hanging right now for lack of time to deal with the IANA paperwork. I'd be glad to hand these off to any individual or ad hoc working group with experience in MIME type registration.

Jon

From jjc at jclark.com Thu Mar 6 06:28:34 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306061830.00a94b70@jclark.com>

At 21:19 05/03/97 -0800, Jon Bosak wrote:
>| >and stylesheets that use other
>| >notations (CSS, for example).
>|
>| Use another extension.
>
>Yes, but then how do you determine precedence if both foo.dsl and
>foo.css are found?

That's only a problem if there's both DSSSL and CSS style sheets available and the client can handle both DSSSL and CSS. In that case, I would leave it up to the client to choose which it prefers.
The content provider isn't really in a position to make that decision: if the client has a very complete CSS implementation but only a rather limited DSSSL implementation, the CSS style sheet may be preferable; but if the client has a complete DSSSL implementation, the DSSSL style sheet may be preferable. You're going to have this issue whatever mechanism you use for associating style sheets.

James

From nmikula at edu.uni-klu.ac.at Thu Mar 6 08:03:20 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:30 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
References: <199703060017.QAA00505@boethius.eng.sun.com>
Message-ID: <331EF2DA.4165@edu.uni-klu.ac.at>

If we talk about the problems of associations, we should also talk about the problem of what style, out of a list of style-specs., a DSSSL engine is supposed to pick out. DSSSL supports the definition of multiple style-sheets in one style-document. I do believe that this is a very important feature and we need to discuss how to employ it to the fullest.

If I remember correctly, Jade picks out the first it found, and, I have to admit I don't have the latest Jade version installed, doesn't do anything with the others. My DSSSL engine, YADE, reads all the specs. and then picks out the first one that it has read and does the first rendering with it. All the other styles are put in a list which the user, via a menu, can choose from.

I see the following problem. I am working on a DSSSL document that has one style for hardcopy and a variety of styles for online rendering. How are we going to tell the DSSSL engine what style to start with?
The document/style author does not know what kind of DSSSL engine is going to work with his document. Thus the stylespec. itself must provide means to communicate to the DSSSL engine what kind of DSSSL engine i.e. online, hardcopy etc. a style is suitable for. Any ideas ? What about adding an attribute with a list of catagories of DSSSL engines as possible attribute values. For instance : output (hardcopy|online|....?) hardcopy -- Best regards, Norbert H. Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Thu Mar 6 08:03:36 1997 From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula) Date: Mon Jun 7 16:57:30 2004 Subject: Associating DSSSL style sheets with documents References: <199703060017.QAA00505@boethius.eng.sun.com> Message-ID: <331EED8D.7312@edu.uni-klu.ac.at> Jon Bosak wrote: > One possible method suggested by James Clark (thank you, James) is to > adopt the convention used by Jade in the absence of the -d option: > replace the extension of the document entity's URL or file name with > .dsl and fetch that. Thus, if a browser fetches > > http://docs.sun.com/foo/bar.html > > then it should also look for > > http://docs.sun.com/foo/bar.dsl > > and apply it to bar.html if found. That is the way I have been doing it with my Cappuccino/Yade/PSC_EDB system. My NXP/Yade/PSC_EDB system, at least for now, will also do it like this. If you have only a few document instances it works fine. 
If you have hundreds of them you will probably run into trouble (for the reasons already pointed out by others). > James lists some other possibilities: > > | - a processing instruction somewhere in the prolog I think, at least for XML, we don't want to use PIs too often. > | - a catalog entry that says unconditionally to use some DSSSL style > | sheet Maybe as a fallback if other association mechanisms fail.... > | - a catalog entry that associates a DSSSL style sheet with the public > | identifier of a DTD Hmm, I think I would like that. > | - make the document serve also as a style sheet by making it conform > | to the DSSSL architecture (this will work with Jade too) Do I understand correctly that you want to include the style in the actual document instance? It would work of course, but I guess only if we assume that DSSSL is the only style-spec. mechanism. Of course I would support that ;-) No, to be serious, I think we should separate style and instance as much as we can. ------ I do also think the idea with the MIME header is worth a try. -- Best regards, Norbert H. Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== From tms at ansa.co.uk Thu Mar 6 11:00:56 1997 From: tms at ansa.co.uk (Toby Speight) Date: Mon Jun 7 16:57:30 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: bosak@atlantic-83.Eng.Sun.COM's message of Wed, 5 Mar 1997 16:17:45 -0800 References: <199703060017.QAA00505@boethius.eng.sun.com> Message-ID: A non-text attachment was scrubbed...
Name: not available Type: text/plain (pgp signed) Size: 2191 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19970306/0b73020c/attachment.bin From tms at ansa.co.uk Thu Mar 6 11:20:40 1997 From: tms at ansa.co.uk (Toby Speight) Date: Mon Jun 7 16:57:30 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: Toby Speight's message of 06 Mar 1997 10:59:48 +0000 References: <199703060017.QAA00505@boethius.eng.sun.com> Message-ID: A non-text attachment was scrubbed... Name: not available Type: text/plain (pgp signed) Size: 956 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19970306/5622930a/attachment.bin From jjc at jclark.com Thu Mar 6 12:38:09 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:30 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970306122748.00ad4c24@jclark.com> At 08:37 06/03/97 -0800, Norbert H. Mikula wrote: >DSSSL supports the definition of multiple style-sheets in >one style-document. I do believe that this is a very important >feature and we need to discuss how to employ it to the fullest. > >If I remember correctly, Jade picks out the first it found, >and, I have to admit I don't have the latest Jade version >installed, doesn't do anything with the others. Actually if you use -d style.dsl#hardcopy, it will use the spec called hardcopy in the style.dsl document. (I forgot to mention this in the docs.) >My DSSSL engine, YADE, reads all the specs. and then picks >out the first one that is has read and does the first >rendering with it. All the other styles are put in a >list which the user, via a menu, can choose from. > >I see the following problem. I am working on a DSSSL document >that has one style for hardcopy and a variety of styles for >online rendering. How are we going to tell the DSSSL engine >what style to start with. 
Where the content provider knows which style they want, they can use a fragment spec in the URL to pick out a particular spec from the document. >The document/style author does not know what kind of DSSSL engine >is going to work with his document. Thus the stylespec. itself >must provide means to communicate to the DSSSL engine >what kind of DSSSL engine i.e. online, hardcopy etc. a style >is suitable for. > >Any ideas ? > >What about adding an attribute with a list of catagories >of DSSSL engines as possible attribute values. > >For instance : output (hardcopy|online|....?) hardcopy What other categories are there? If a style sheet is for online use, then it has to use the scroll flow object, which means it ought to list the online feature in the features element type form. Maybe a DSSSL engine could use this, or maybe it could look to see which flow object classes the spec uses. James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jjc at jclark.com Thu Mar 6 12:39:05 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:30 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970306122841.009c9094@jclark.com> At 08:15 06/03/97 -0800, Norbert H. Mikula wrote: >> | - a processing instruction somewhere in the prolog > >I think, at least for XML, we don't want to use PIs too often. In general I agree. But when you specify a style sheet, you surely are giving an instruction about the processing of the document. If PIs aren't appropriate for this, what are they good for? Since we have them, why not use them? If we allow the PI to occur anywhere in the prolog, then if a user has a style sheet for some DTD, they can simply add this PI to the DTD, and all documents conforming to the DTD will automatically use the style sheet. 
If you have an SGML system that supports FSIs, you could even have something like: PUBLIC "-//...//DTD Docbook//EN" "docbook.dtd" and associate the DTD with the stylesheet without changing the DTD. Other advantages: - it doesn't require a catalog, so you don't have the problem of finding that; - it works even for dynamic XML (eg generated on the fly in response to a query); - it works both locally and over HTTP. >> | - a catalog entry that associates a DSSSL style sheet with the public >> | identifier of a DTD > >Hmm, I think I would like that. I think this is handy for simple not very customizable DTDs (eg HTML). But I don't think it's enough just to key of the public id in the doctype declaration, because sometimes you need to add declarations *after the DTD*, which means you have to reference the DTD with an entity declaration rather than in the doctype declaration, eg %dtd; ]> I think you need to have a scheme that considers the public ids of all external parameter entities referenced in the DTD. I don't think any single method is adequate by itself. I think we need two or three methods that complement each other. I would also like to have something that will work equally for SGML and XML. James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From housel at ms7.hinet.net Thu Mar 6 12:50:32 1997 From: housel at ms7.hinet.net (Peter S. Housel) Date: Mon Jun 7 16:57:30 2004 Subject: Simple approaches to XML implementation Message-ID: <199703061242.UAA29908@ms7.hinet.net> One more scenario: * [YACC+] is like [YACC], except that it returns control before parsing the whole instance. For instance, it might parse an element and then pass on events for element-start and all of the attributes, then return control. NXP uses the [YACC] scenario. 
Lark is a flexible version of [YACC+] that allows the handlers to determine on an event-by-event basis when the parser should return. [Grove] and [XMLEventStream] could be built in Lark, as could [YACC], but [Browser] could not. -Peter S. Housel- housel@ms7.hinet.net xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From gtn at ebt.com Thu Mar 6 13:39:37 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:30 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <331E3AAC.1A9F@hiwaay.net> (message from len bullard on Wed, 05 Mar 1997 21:31:56 -0600) Message-ID: <199703061336.IAA06872@nathaniel.ebt> >> Yes. Multiple stylesheet could be easier than styles qualified by >> context in some cases. It really amounts to the same thing though >> the binding mechnism is different.... > >Yes and no. The problem with the FOSI was that even though it >worked, it was hard to specify style on elements in context >when the contexts were complex. We combine context and >local stylesheets. So, a parentage can be used, but a local >stylesheet can introduce a new one, so the complexity is >localized as well. Conservation of complexity: we have >more stylesheets to manage per instance. Ahh. Trading complexity in specification for complexity in management. 
xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From gtn at ebt.com Thu Mar 6 13:42:34 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:30 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <2.2.32.19970306032432.00a75998@jclark.com> (message from James Clark on Thu, 06 Mar 1997 10:24:32 +0700) Message-ID: <199703061339.IAA06874@nathaniel.ebt> >Of course this doesn't completely solve the problem: we now have to figure >out how to associate a catalog with an SGML document. The latest SGML Open >draft requires (amongst other things) trying the URL/filename of the >document with any extension replaced by .soc. Reverse the problem as the MIME-SGML proposal did: send the catalog first, and all is well (we have this same discussion on XML-WG, and I agree that catalogs are insufficient in the general sense, but in the simple cases we're likely to see initially, they're fine). xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From gtn at ebt.com Thu Mar 6 13:44:03 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:31 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <2.2.32.19970306032432.00a75998@jclark.com> (message from James Clark on Thu, 06 Mar 1997 10:24:32 +0700) Message-ID: <199703061341.IAA06880@nathaniel.ebt> >So if we are going to use a catalog entry (and I'm not yet convinced >this is the best solution) I would suggest having a simple DSSSL entry >which looks like: > >DSSSL spec.dsl The problem with this is that it is application specific. How do I tell a browser that DSSSL === DSSSL Stylesheet? 
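The two conventions quoted above (deriving the catalog location by swapping the document's extension for .soc, and a one-line "DSSSL spec.dsl" catalog entry) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, and the catalog is treated as one whitespace-separated entry per line, which ignores comments, quoted identifiers containing spaces, and the rest of the real catalog syntax.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Jade or any catalog library. */
final class CatalogSketch {
    private CatalogSketch() {}

    /** Replaces the extension of the last path segment, e.g. bar.html -> bar.soc. */
    static String replaceExtension(String url, String newExt) {
        int slash = url.lastIndexOf('/');
        int dot = url.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is in the last segment.
        String base = (dot > slash) ? url.substring(0, dot) : url;
        return base + "." + newExt;
    }

    /** Returns the system identifier of the first "DSSSL <sysid>" entry, if present. */
    static Optional<String> findDssslEntry(String catalogText) {
        for (String line : catalogText.split("\\R")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length >= 2 && tokens[0].equals("DSSSL")) {
                return Optional.of(stripQuotes(tokens[1]));
            }
        }
        return Optional.empty();
    }

    /** Removes one pair of surrounding double quotes, if any. */
    private static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }
}
```

A client following this sketch would compute replaceExtension(documentUrl, "soc"), fetch that resource, and pass its text to findDssslEntry. Whether the browser should give the DSSSL keyword any meaning at all is exactly the question raised above.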
From gtn at ebt.com Thu Mar 6 13:54:54 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:31 2004 Subject: Simple approaches to XML implementation In-Reply-To: <199703060427.MAA29885@ms7.hinet.net> (housel@ms7.hinet.net) Message-ID: <199703061352.IAA06888@nathaniel.ebt> >>class XMLParser { >>... >>parser(XMLEventHandler handler); >>... >>} > >That's one way of doing things. The main problem I see with this interface >is that there are quite a few possible methods (I count 71 classdefs in >the SGML property set, though of course not all of those are applicable to >XML), and it becomes difficult to expand the set of events. I use about 8 event handlers for most of my APIs... >As much as possible, a good reusable component should not force the >user's hand when choosing what node to grab onto. As an example, >YACC is pretty bad about this. You supply it with a lexer (with a >fixed name) and a set of handlers to be called when productions are >reduced. The YACC-generated parser insists on being in charge. Sure. The important thing is that if you want to query into a document, you have to have parsed at least as far as the nodes you want to access, and that having a tree representation for such cases makes it a *lot* easier. For cases where you "want to be in control", I would have the event handler be a grove constructor, and have the application work upon the grove. Note that accessing a grove, or querying a document, is *different* from *parsing* a document. >1. An external entity manager, responsible for obtaining document > instances (the "start" document and others), DTD's, etc. from > local storage, the web, some database, etc. This should probably > be user-customizable. I'm not sure about this.
In some ways, I cannot see the reason for *exposing* an entity manager, but then again, I cannot imagine an implementation without one either.... >2. An encoding manager, responsible for mapping one of the possible > XML document encodings (Latin-n, UTF-7, UTF-8, UCS-2, UTF-16, whatever) > onto ISO10646 characters. Streams... >3. The parser itself, responsible for turning characters into XML events, > and possibly into grove structures. Push grove building off to later stages. >[Browser] gives the most complicated parser, since it has to asynchronously >handle information from several different documents. > >[YACC] is the easiest to write, but it's less flexible. Given [Browser], >it's easy to write [YACC]. (Given [XMLEventStream] you can also derive >[YACC], but with greater overhead.) > >[XMLEventStream] and [Grove] give you the most flexibility with respect to >the grove plan. I think these conflate many different processing layers. >languages, but the only firm conclusion I've come to is that I really wish >I could use coroutines. Amen to that sentiment. From gtn at ebt.com Thu Mar 6 14:00:48 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:31 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <199703060519.VAA00633@boethius.eng.sun.com> (bosak@atlantic-83.Eng.Sun.COM) Message-ID: <199703061358.IAA06890@nathaniel.ebt> | Content-Type: text/xml; stylesheet=foo.dsl | | This is only going to work in the context of HTTP. This is not going to work if you have multiple stylesheets associated with a document unless you use multipart MIME bodies, and then we're right back to MIME-SGML. I favor Don's proposal because it had the right semantics for both HTTP and email (i.e. it could pull or push).
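For reference, reading the proposed parameter on the client side might look roughly like the sketch below. It assumes the `stylesheet` parameter exactly as written in the quoted header (a proposal in this thread, not a registered parameter), and the class and method names are hypothetical. The parsing is deliberately naive: it splits on ';' and so does not handle quoted values that themselves contain semicolons.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration only. */
final class ContentTypeSketch {
    private ContentTypeSketch() {}

    /** Returns the media type part, e.g. "text/xml", lower-cased. */
    static String mediaType(String headerValue) {
        return headerValue.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
    }

    /** Returns the parameters after the media type, keyed by lower-cased name. */
    static Map<String, String> parameters(String headerValue) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = headerValue.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            int eq = part.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed or empty segments
            }
            String name = part.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = part.substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            params.put(name, value);
        }
        return params;
    }
}
```

Because a map holds one value per parameter name, the sketch also makes the objection above concrete: a single `stylesheet` parameter can name only one style sheet per response.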
| The type of the stylesheet could be indicated by its Content-Type, and | the client could use content-type negotiation to ensure it gets the | kind of stylesheet it can handle. Somebody should probably register a | MIME content-type for DSSSL style sheets as well. Content negotiation only takes you so far, and is HTTP specific *and* spottily implemented. It also only allows you to negotiate on the *type* of the resource, and a few other things. It does not help if you have multiple resources each of which are of equivalent quality *and* that the user may like to choose between. xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From gtn at ebt.com Thu Mar 6 14:09:34 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:31 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents In-Reply-To: <331EF2DA.4165@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at) Message-ID: <199703061406.JAA06898@nathaniel.ebt> >I see the following problem. I am working on a DSSSL document >that has one style for hardcopy and a variety of styles for >online rendering. How are we going to tell the DSSSL engine >what style to start with. Precisely what the SEMANTIC proposal for catalogs was supposed to take care of. We use this to good effect in DynaText... 
xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From gtn at ebt.com Thu Mar 6 14:10:49 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:31 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <331EED8D.7312@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at) Message-ID: <199703061408.JAA06900@nathaniel.ebt> >I do also think the idea with the mime header is worth a try. MIME is overkill for this, and in the context of HTTP, it is severely limited. Forget it. xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From cbullard at hiwaay.net Thu Mar 6 15:21:34 1997 From: cbullard at hiwaay.net (Len Bullard) Date: Mon Jun 7 16:57:31 2004 Subject: Associating DSSSL style sheets with documents References: <199703061336.IAA06872@nathaniel.ebt> Message-ID: <331EDE4D.166@hiwaay.net> Gavin Nicol wrote: > > >Yes and no. The problem with the FOSI was that even though it > >worked, it was hard to specify style on elements in context > >when the contexts were complex. We combine context and > >local stylesheets. So, a parentage can be used, but a local > >stylesheet can introduce a new one, so the complexity is > >localized as well. Conservation of complexity: we have > >more stylesheets to manage per instance. > > Ahh. Trading complexity in specification for complexity in management. Yes. Optional realities based on what requirement is most compelling in a given production management scenario. One can write a stylesheet for a complex DTD and get a complex stylesheet. One can write a stylesheet for a set of related DTDs and only have to write some exceptions. 
One can write multiple stylesheets that are called at different parts of the document as we do in IDE/AS and IADS and the styles are flexible. The compromise is keeping and managing multiple stylesheets per document class if one is smart enough to use DTDs for systems that don't require them. We've been through the "well-formed" approach down here. Good for light stuff, but for classes of documents used over lifecycles, nyet. But I'm content to let others bump their heads against that problem until they understand it. Back to stylesheets: as in any compound class (one stylesheet, several DTDs) that varies like this, if it is also dynamic (that is, some part of the DTD is always changing), then that complex/compound stylesheet can become a real bear. This is particularly true in systems where one must share the stylesheet across organizational boundaries. The longer one looks at this, the more one starts to favor delivery of encapsulated objects and view the separation of process and data as a weirdly religious approach favored by those who do not manage large dynamic document collections for distributed presentation. This is why dual path, lobster-trap delivery systems are preferred by some organizations. For example, SGML for archival, PDF for presentation. So where we specify association of processing specifications to the document instance, if we make the programmer's job easy, we may make the information manager's job hard. Guess who buys the system? Encapsulated objects give them both what they need on the front end. The price is paid ten years later when one has to rehost, recover, or repurpose. OTOH, for information that does not live long, the encapsulated object is a good idea, which is why I thought it strange that HTML is SGML. Since this is a production management problem, it can be solved by a production approach (e.g., the lobster trap). Look for the middle way. Catalogs and menu selections look promising.
len xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Thu Mar 6 15:54:32 1997 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:57:31 2004 Subject: BNF Message-ID: <1440.199703061554@grogan.cogsci.ed.ac.uk> Has anyone extracted the XML BNF into a usable form? ht xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From tbray at textuality.com Thu Mar 6 16:13:47 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:57:31 2004 Subject: BNF Message-ID: <3.0.32.19970306081217.00980960@pop.intergate.bc.ca> At 03:54 PM 3/6/97 GMT, Henry S. Thompson wrote: >Has anyone extracted the XML BNF into a usable form? Should take about 45 seconds, working from the XML source for the spec... there are a few pointed-out-but-as-yet-unfixed holes though. -T. xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Thu Mar 6 16:33:03 1997 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:57:31 2004 Subject: BNF In-Reply-To: Tim Bray's message of Thu, 06 Mar 1997 08:12:26 -0800 References: <3.0.32.19970306081217.00980960@pop.intergate.bc.ca> Message-ID: <1457.199703061632@grogan.cogsci.ed.ac.uk> Tim writes: > >Has anyone extracted the XML BNF into a usable form? > > Should take about 45 seconds, working from the XML source for the > spec... there are a few pointed-out-but-as-yet-unfixed holes though. -T. Is that an offer? Or should I know where the XML source is? 
ht From dgd at cs.bu.edu Thu Mar 6 16:57:29 1997 From: dgd at cs.bu.edu (David Durand) Date: Mon Jun 7 16:57:31 2004 Subject: API thoughts... In-Reply-To: Message-ID: At 2:14 PM -0800 3/5/97, Bill Smith wrote: >> Well, the Lark event-stream API sure looks & feels like a bunch of >> callbacks. You make a Lark object, call its readXML() method with one >> argument being a Handler object; Handler being a data-less class that >> just has a bunch of methods called things like doPI() and doStartTag() and >> doEntityReference() and doText() and so on; you'd normally subclass Handler >> replacing the methods for the events you wanted to see, and pass in >> that kind of object. Lark calls these upon recognizing >> the constructs in the input stream, passing the byte offset info, the >> element & entity stack (*if* you're treebuilding), and other currently >> relevant info. These methods are all booleans; if any returns true, >> Lark stops and returns control to whoever called readXML(). I like this boolean approach -- it lets the Handler object take back the flow of control pretty easily. If Lark could break PCDATA up when the buffer stalls, you could easily implement a Browser-style application. >Another way to do this is to have the Lark object (or interface) define the >event methods rather than have a separate Handler object. When it's time to >parse something, create a subclass that overrides the (standard) event methods >for the Lark object. I don't like this quite as well for a generic API, as I can see the use of Handler objects that don't know how to parse -- they can be glued to other event sources to run off of DB engines -- or even broken across a network to provide an XML event-stream mechanism...
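The design being discussed (boolean-returning callbacks that let the handler take back control, plus handlers that can be driven by something other than a parser) can be sketched as below. This is not Lark's actual API: the interface, the replaying event source, and the reduced set of three events are all hypothetical, chosen only to show the control flow. The interface uses default methods so that a handler overrides only the events it cares about.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handler interface; each callback returns true to make the source stop. */
interface EventHandler {
    default boolean doStartTag(String name) { return false; }
    default boolean doText(String text) { return false; }
    default boolean doEndTag(String name) { return false; }
}

/** A non-parsing event source: replays pre-recorded {kind, payload} events. */
final class ReplaySource {
    private final String[][] events;
    private int next = 0;

    ReplaySource(String[][] events) { this.events = events; }

    boolean hasMore() { return next < events.length; }

    /** Delivers events until they run out or a callback returns true. */
    void read(EventHandler handler) {
        while (next < events.length) {
            String[] event = events[next++];
            boolean stop;
            switch (event[0]) {
                case "start": stop = handler.doStartTag(event[1]); break;
                case "text":  stop = handler.doText(event[1]); break;
                case "end":   stop = handler.doEndTag(event[1]); break;
                default: throw new IllegalArgumentException("unknown event kind: " + event[0]);
            }
            if (stop) {
                return; // hand control back to the caller; a later read() resumes here
            }
        }
    }
}

/** Example handler: records each text chunk and asks the source to pause after it. */
final class TextPauser implements EventHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public boolean doText(String text) {
        seen.add(text);
        return true;
    }
}
```

A caller would loop on read() while hasMore() is true, doing its own work between calls. The same TextPauser could be handed to a real parser, a database cursor, or a network reader, provided each implements the same dispatch loop.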
>A possible advantage to this method is that it makes clear the inheritance >relationship between the "standard" parser and something more specific. It >is also "easier" to create a more specific parser from an existing parser >object - simply subclass the existing parser and override the methods >required to provide the desired new functionality. If the methods in the standard parser don't do something you are interested in, you still have to override them all -- and I don't see what default behavior would make sense other than "do nothing". It seems that you could get the same benefits simply by supplying a predefined Handler object that has null implementations for its methods. >The Lark model "hides" the inheritance relationship in the Handler object >making it necessary to look inside a Lark object to determine the type of >a given parser (something you might need to do when debugging). An >alternative is to create a new parser object that contains a subclassed event >handler. This makes it possible to distinguish the type of parser at the >"outer" level but requires two new objects instead of one to perform the >subclass. The debugging issue is certainly a bit inconvenient. If we use interfaces rather than classes for the API (almost certainly a good idea), then we can certainly create a Parser that implements Handler. >I'm not a parser expert so the subclass model may not make any sense but this >is a mechanism I have successfully used building other object-oriented >(including GUI-based) systems. I have also used callbacks but find them most >useful when forced to use C or other non-object-based languages. I think that subclassing here just means that I might be forced to pull in parser baggage (or null methods) when I want to implement a parser-free event handler or event generator. -- David _________________________________________ David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com Boston University Computer Science \ Sr.
Analyst http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams --------------------------------------------\ http://dynamicDiagrams.com/ MAPA: mapping for the WWW \__________________________ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jjc at jclark.com Fri Mar 7 05:10:44 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:31 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970307050036.00a8a6f8@jclark.com> At 08:41 06/03/97 -0500, Gavin Nicol wrote: >>So if we are going to use a catalog entry (and I'm not yet convinced >>this is the best solution) I would suggest having a simple DSSSL entry >>which looks like: >> >>DSSSL spec.dsl > >The problem with this is that it is application specific. How do I >tell a browser that DSSSL === DSSSL Stylesheet? I don't see your point at all. Why do you need to tell the browser? The browser knows that the DSSSL keyword in the catalog designates a DSSSL specification (it could include transformation specs as well as stylesheets) the same way it knows that the SGMLDECL entry designates an SGML specification, and the same way it would know that a type of dsssl-specification in a SEMANTICS entry SEMANTICS "dsssl-specification" "Stylesheet title" foo.dsl means DSSSL specification. James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Fri Mar 7 08:05:36 1997 From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula) Date: Mon Jun 7 16:57:31 2004 Subject: Which style first ? 
Re: Associating DSSSL style sheets with documents References: <2.2.32.19970306122748.00ad4c24@jclark.com> Message-ID: <33204BB2.C96@edu.uni-klu.ac.at> James Clark wrote: > Where the content provider knows which style they want, they can use a > fragment spec in the URL to pick out a particular spec from the document. The scenario I have in mind is putting documents onto a WWW server. In this case the content provider can not know whether the user (-agent) downloads for online-rendering, hardcopy or a mixture thereof. > >What about adding an attribute with a list of catagories > >of DSSSL engines as possible attribute values. > > > >For instance : output (hardcopy|online|....?) hardcopy > > What other categories are there? If a style sheet is for online use, then > it has to use the scroll flow object, which means it ought to list the > online feature in the features element type form. Yes, that is an interesting approach. However, if we have multiple style-specs. in one document, for instance hardcopy and on-line. Would it not be the case, that you would find the online feature in the features element type form regardless of what stylespec. the user agent is really interested in ? I don't have the DSSSL specs at hand right now. So I hope it makes sense what I am saying. If not -> (element my_comment (empty-sosofo)) ;-) > Maybe a DSSSL engine > could use this, or maybe it could look to see which flow object classes the > spec uses. I don't think that this is a practical approach. If a style-spec. is supposed to be used for online-rendering, is it really a must to use scroll-fo ? What about simple-page-seq. ? I know that this is not something somebody would intuitively do, but why not. I could envision a browser that uses simple-page-seq. For large document instances the browser takes advantage of the explicit information in the document instances combined with the style-spec. In other words, chapter starts a new page. 
The browser would have to render only the active page. I guess it might also foster more reusable style modules. It remains to be discussed how the document should be divided into more Internet-suitable sizes, how the user agent would/should deal with this, and to what extent it has/should have/must have an influence on the design of stylesheets. I especially have in mind the case of XML entity treatment. Does that make sense? However, I can see no point in scanning the whole document just to deduce somehow what class of style-spec it belongs to. It sounds to me like trying to figure out the semantics of a paragraph without using markup. -- Best regards, Norbert H. Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== From ksaito at flab.fujitsu.co.jp Fri Mar 7 11:26:37 1997 From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp) Date: Mon Jun 7 16:57:31 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents In-Reply-To: <33204BB2.C96@edu.uni-klu.ac.at> Message-ID: <9703071123.AA01303@sanma.flab.fujitsu.co.jp> We have an idea about association too. We are now implementing it in our DSSSL system and testing it. Please comment on this idea if you can. A description follows. -- Part.1 Basic concept The idea is to declare the style-sheet as an external entity and never reference it in the SGML document, and to give that entity a notation which indicates that it is a style-sheet. ex) In SGML document prolog.
In this example, the style-sheet is declared with the DSSSL notation and is identified by the system identifier "style-sheet.dsl". (This DSSSL notation identifier is hypothetical.) The application recognizes the style-sheet by the following steps. 1) It checks the declared entities. 2) It checks the notation of these external entities. 3) If some entities have a notation which means DSSSL style-sheet, then the application uses these external entities as style-sheets. The SGML document does not refer to the style-sheet external entities. Therefore, an old application never expands the content of the style-sheet in the document. I think this way will not violate the SGML standard. -- Part.2 Associating multiple style-sheets. To relate more than one style-sheet to a single SGML document, prepare each of these style-sheets as an external entity, and declare these entities as in the following example. ex.1) For DSSSL, to decide which style-sheet is used, the application refers to the "desc" attribute of the transformation-specification or style-specification element. Otherwise an entity attribute is useful for selecting the style-sheet. ex.2) Since DSSSL can hold multiple style-sheets in a single file, DSSSL does not need this method. But this is useful for other style-sheet languages. -- Part.3 Associating style-sheets in different languages. To relate more than one style-sheet, written in different notations, to a single document, prepare each of these style-sheets as an external entity, and declare them as in the following example. ex) In this example, to decide which style-sheet is used, the application refers to the entities' notations. If the application supports DSSSL only, it uses "style-sheet.dsl"; if it supports CSS only, it uses "style-sheet.css". The application can select a style-sheet by its notation. --- KAZUMI Saito Fujitsu Laboratories Ltd. Information Service Architecture Lab.
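The recognition steps described above (check the declared entities, check their notations, use the ones whose notation means "style-sheet") can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Fujitsu implementation: the `ExternalEntity` record, the `select_style_sheets` helper, the entity names and the notation names "DSSSL", "CSS" and "GIF" are all made up for the example, since the declarations in the original post did not survive archiving. Only the system identifiers "style-sheet.dsl" and "style-sheet.css" come from the message.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ExternalEntity:
    """Hypothetical record for one declared external entity."""
    name: str
    system_id: str
    notation: Optional[str]  # None if the entity was declared without a notation


def select_style_sheets(entities: Iterable[ExternalEntity],
                        supported_notations: Set[str]) -> List[ExternalEntity]:
    """Return, in declaration order, the entities whose notation the
    application treats as a style-sheet notation it supports."""
    return [e for e in entities if e.notation in supported_notations]


# Assumed declarations: two style-sheets in different notations plus an
# unrelated data entity that must not be picked up.
entities = [
    ExternalEntity("style1", "style-sheet.dsl", "DSSSL"),
    ExternalEntity("style2", "style-sheet.css", "CSS"),
    ExternalEntity("figure1", "figure.gif", "GIF"),
]

# An application that supports DSSSL only.
print([e.system_id for e in select_style_sheets(entities, {"DSSSL"})])
# prints ['style-sheet.dsl']
```

Selection here depends only on the notation, which is the property the proposal relies on; an application that supports several languages would pass a larger set and then choose among the results by some preference of its own.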
From gtn at ebt.com Fri Mar 7 13:52:03 1997 From: gtn at ebt.com (Gavin Nicol) Date: Mon Jun 7 16:57:31 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <2.2.32.19970307050036.00a8a6f8@jclark.com> (message from James Clark on Fri, 07 Mar 1997 12:00:36 +0700) Message-ID: <199703071349.IAA07589@nathaniel.ebt> >>>So if we are going to use a catalog entry (and I'm not yet convinced >>>this is the best solution) I would suggest having a simple DSSSL entry >>>which looks like: >>> >>>DSSSL spec.dsl >> >>The problem with this is that it is application specific. How do I >>tell a browser that DSSSL === DSSSL Stylesheet? > >I don't see your point at all. Why do you need to tell the browser? The >browser knows that the DSSSL keyword in the catalog designates a DSSSL >specification (it could include transformation specs as well as stylesheets) >the same way it knows that the SGMLDECL entry designates an SGML >specification, and the same way it would know that a type of >dsssl-specification in a SEMANTICS entry My point is that you are proposing adding an extension that is not standardised or generalised. Given that we all agree that a DSSSL keyword should be supported, there isn't much of a problem, though I would prefer adding some descriptive label to it. SEMANTICS is a more general solution: you could use it for CSS or whatever else you wanted/supported. From ht at cogsci.ed.ac.uk Fri Mar 7 16:41:05 1997 From: ht at cogsci.ed.ac.uk (Henry S.
Thompson) Date: Mon Jun 7 16:57:32 2004 Subject: XML BNF Message-ID: <1670.199703071640@grogan.cogsci.ed.ac.uk> This was taken semi-automatically from the XML source for the spec document treated as SGML. Tim's version thereof had some bogus PC = codes in it, which I THINK I've removed. There is at least one real (i.e. in the original) bug I've noticed so far -- there should be a * after the second-last ) in the DOCTYPE disjunct of the Markup production. Use at your own risk. ht ------------- S ::= (#x0020 | #x000a | #x000d | #x0009 | #x3000)+ Character ::= #x09 | #x0A | #x0D | [#x20-#xFFFD] | [#x00010000-#x7FFFFFFF] /* any ISO 10646 31-bit code, FFFE and FFFF excluded */ BaseChar ::= [#x41-#x5A] | [#x61-#x7A] /* Latin 1 upper and lowercase */ | #xAA | #xB5 | #xBA | [#xC0-#xD6] | [#xD8-#xF6] /* Latin 1 supplementary */ | [#xF8-#xFF] /* Latin 1 supplementary */ | [#x0100-#x017F] /* Extended Latin-A */ | [#x0180-#x01F5] | [#x01FA-#x0217] /* Extended Latin-B */ | [#x0250-#x02A8] /* IPA Extensions */ | [#x02B0-#x02B8] | [#x02BB-#x02C1] | [#x02E0-#x02E4] /* Spacing Modifiers */ | #x037A | #x0386 | [#x0388-#x038A] | #x038C | [#x038E-#x03A1] | [#x03A3-#x03CE] | [#x03D0-#x03D6] | #x03DA | #x03DC | #x03DE | #x03E0 | [#x03E2-#x03F3] /* Greek and Coptic */ | [#x0401-#x040C] | [#x040E-#x044F] | [#x0451-#x045C] | [#x045E-#x0481] | [#x0490-#x04C4] | [#x04C7-#x04C8] | [#x04CB-#x04CC] | [#x04D0-#x04EB] | [#x04EE-#x04F5] | [#x04F8-#x04F9] /* Cyrillic */ | [#x0531-#x0556] | [#x0559-#x055A] | [#x0561-#x0587] /* Armenian */ | [#x05D0-#x05EA] | [#x05F0-#x05F2] /* Hebrew */ | [#x0621-#x063A] | [#x0641-#x064A] | [#x0671-#x06B7] | [#x06BA-#x06BE] | [#x06C0-#x06CE] | [#x06D0-#x06D3] | [#x06D5-#x06D6] | [#x06E5-#x06E6] /* Arabic */ | [#x0905-#x0939] | #x093D | [#x0958-#x0961] /* Devanagari */ | #x0981 | [#x0985-#x098C] | [#x098F-#x0990] | [#x0993-#x09A8] | [#x09AA-#x09B0] | #x09B2 | [#x09B6-#x09B9] | [#x09DC-#x09DD] | [#x09DF-#x09E1] | [#x09F0-#x09F1] /* Bengali */ | [#x0A05-#x0A0A] | 
[#x0A0F-#x0A10] | [#x0A13-#x0A28] | [#x0A2A-#x0A30] | [#x0A32-#x0A33] | [#x0A35-#x0A36] | [#x0A38-#x0A39] /* Gurmukhi */ | [#x0A8F-#x0A91] | [#x0A93-#x0AA8] | [#x0AAA-#x0AB0] | [#x0AB2-#x0AB3] | [#x0AB5-#x0AB9] | #x0AE0 /* Gujarati */ | [#x0B05-#x0B0C] | [#x0B0F-#x0B10] | [#x0B13-#x0B28] | [#x0B2A-#x0B30] | [#x0B32-#x0B33] | [#x0B36-#x0B39] | #x0B3D | [#x0B5C-#x0B5D] | [#x0B5F-#x0B61] /* Oriya */ | [#x0B85-#x0B8A] | [#x0B8E-#x0B90] | [#x0B92-#x0B95] | [#x0B99-#x0B9A] | #x0B9C | [#x0B9E-#x0B9F] | [#x0BA3-#x0BA4] | [#x0BA8-#x0BAA] | [#x0BAE-#x0BB5] | [#x0BB7-#x0BB9] /* Tamil */ | [#x0C05-#x0C0C] | [#x0C0E-#x0C10] | [#x0C12-#x0C28] | [#x0C2A-#x0C33] | [#x0C35-#x0C39] | [#x0C60-#x0C61] /* Telugu */ | [#x0C85-#x0C8C] | [#x0C8E-#x0C90] | [#x0C92-#x0CA8] | [#x0CAA-#x0CB3] | [#x0CB5-#x0CB9] | #x0CDE | [#x0CE0-#x0CE1] /* Kannada */ | [#x0D05-#x0D0C] | [#x0D0E-#x0D10] | [#x0D12-#x0D28] | [#x0D2A-#x0D39] | [#x0D60-#x0D61] /* Malayalam */ | [#x0E01-#x0E2E] | #x0E30 | [#x0E32-#x0E33] | [#x0E40-#x0E45] /* Thai */ | [#x0E81-#x0E82] | #x0E84 | [#x0E87-#x0E88] | #x0E8A | #x0E8D | [#x0E94-#x0E97] | [#x0E99-#x0E9F] | [#x0EA1-#x0EA3] | #x0EA5 | #x0EA7 | [#x0EAA-#x0EAB] | [#x0EAD-#x0EAE] | #x0EB0 | [#x0EB2-#x0EB3] | #x0EBD | [#x0EC0-#x0EC4] | [#x0EDC-#x0EDD] /* Lao */ | [#x0F40-#x0F47] | [#x0F49-#x0F69] /* Tibetan */ | [#x10A0-#x10C5] | [#x10D0-#x10F6] /* Georgian */ | [#x1100-#x1159] | [#x115F-#x11A2] | [#x11A8-#x11F9] /* Hangul Jamo */ | [#x1E00-#x1E9B] | [#x1EA0-#x1EF9] /* Add'l Extended Latin */ | [#x1F00-#x1F15] | [#x1F18-#x1F1D] | [#x1F20-#x1F45] | [#x1F48-#x1F4D] | [#x1F50-#x1F57] | #x1F59 | #x1F5B | #x1F5D | [#x1F5F-#x1F7D] | [#x1F80-#x1FB4] | [#x1FB6-#x1FBC] | #x1FBE | [#x1FC2-#x1FC4] | [#x1FC6-#x1FCC] | [#x1FD0-#x1FD3] | [#x1FD6-#x1FDB] | [#x1FE0-#x1FEC] | [#x1FF2-#x1FF4] | [#x1FF6-#x1FFC] /* Greek Extensions */ | #x207F /* Super-, subscripts */ | #x2102 | #x2107 | [#x210A-#x2113] | #x2115 | [#x2118-#x211D] | #x2124 | #x2126 | #x2128 | [#x212A-#x212D] | [#x212F-#x2131] | 
[#x2133-#x2138] /* Letterlike Symbols */ | [#x2160-#x2182] /* Number forms */ | [#x3041-#x3094] /* Hiragana */ | [#x30A1-#x30FA] /* Katakana */ | [#x3105-#x312C] /* Bopomofo */ | [#x3131-#x318E] /* Hangul Jamo */ | [#xAC00-#xD7A3] | [#xFB00-#xFB06] | [#xFB13-#xFB17] | [#xFB1F-#xFB28] | [#xFB2A-#xFB36] | [#xFB38-#xFB3C] | #xFB3E | [#xFB40-#xFB41] | [#xFB43-#xFB44] | [#xFB46-#xFB4F] /* Alphabetic presentation forms */ | [#xFB50-#xFBB1] | [#xFBD3-#xFD3D] | [#xFD50-#xFD8F] | [#xFD92-#xFDC7] | [#xFDF0-#xFDF8] | [#xFE70-#xFE72] | #xFE74 | [#xFE76-#xFEFC] /* Arabic presentation forms */ | [#xFF21-#xFF3A] | [#xFF41-#xFF5A] | [#xFF66-#xFF6F] | [#xFE71-#xFF9D] | [#xFFA0-#xFFBE] | [#xFFC2-#xFFC7] | [#xFFCA-#xFFCF] | [#xFFD2-#xFFD7] | [#xFFDA-#xFFDC] /* Half- and fullwidth forms */ Ideographic ::= [#x4E00-#x9FA5] | [#xF900-#xFA2D] | #x3007 | [#x3021-#x3029] CombiningChar ::= [#x0300-#x0361] | [#x0483-#x0486] | [#x0591-#x05C4] | [#x064B-#x0652] | #x0670 | [#x06D7-#x06DC] | [#x06DD-#x06DF] | [#x06E0-#x06E4] | [#x06E7-#x06E8] | [#x06EA-#x06ED] | [#x0901-#x0903] | [#x093E-#x094C] | #x094D | [#x0951-#x0954] | [#x0962-#x0963] | [#x0981-#x0983] | #x09BC | #x09BE | #x09BF | [#x09C0-#x09C4] | [#x09C7-#x09C8] | [#x09CB-#x09CD] | #x09D7 | [#x09E2-#x09E3] | #x0A02 | #x0A3C | #x0A3E | #x0A3F | [#x0A40-#x0A42] | [#x0A47-#x0A48] | [#x0A4B-#x0A4D] | [#x0A70-#x0A71] | [#x0A81-#x0A83] | #x0ABC | [#x0ABE-#x0AC5] | [#x0AC7-#x0AC9] | #x0ACB | #x0ACC | [#x0B01-#x0B03] | #x0B3C | [#x0B3E-#x0B43] | [#x0B47-#x0B48] | [#x0B4B-#x0B4C] | [#x0B56-#x0B57] | [#x0B82-#x0B83] | [#x0BBE-#x0BC2] | [#x0BC6-#x0BC8] | [#x0BCA-#x0BCC] | #x0BD7 | [#x0C01-#x0C03] | [#x0C3E-#x0C44] | [#x0C46-#x0C48] | [#x0C4A-#x0C4D] | [#x0C55-#x0C56] | [#x0C82-#x0C83] | [#x0CBE-#x0CC4] | [#x0CC6-#x0CC8] | [#x0CCA-#x0CCC] | [#x0CD5-#x0CD6] | [#x0D02-#x0D03] | [#x0D3E-#x0D43] | [#x0D46-! 
#x0D48] | [#x0D4A-#x0D4C] | #x0D57 | #x0E31 | [#x0E34-#x0E3A] | [#x0E47-#x0E4E] | #x0EB1 | [#x0EB4-#x0EB9] | [#x0EBB-#x0EBC] | [#x0EC8-#x0ECD] | [#x0F18-#x0F19] | #x0F35 | #x0F37 | #x0F39 | #x0F3E | #x0F3F | [#x0F71-#x0F84] | [#x0F86-#x0F8B] | [#x0F90-#x0F95] | #x0F97 | [#x0F99-#x0FAD] | [#x0FB1-#x0FB7] | #x0FB9 | [#x20D0-#x20DC] | #x20E1 | [#x302A-#x302F] | #x3099 | #x309A | #xFB1E | [#xFE20-#xFE23] Letter ::= (BaseChar CombiningChar*) | Ideographic Digit ::= [#x0030-#x0039] /* ISO 646 digits */ | [#x0660-#x0669] /* Arabic-Indic digits */ | [#x06F0-#x06F9] /* Eastern Arabic-Indic digits */ | [#x0966-#x096F] /* Devanagari digits */ | [#x09E6-#x09EF] /* Bengali digits */ | [#x0A66-#x0A6F] /* Gurmukhi digits */ | [#x0AE6-#x0AEF] /* Gujarati digits */ | [#x0B66-#x0B6F] /* Oriya digits */ | [#x0BE7-#x0BEF] /* Tamil digits (no zero) */ | [#x0C66-#x0C6F] /* Telugu digits */ | [#x0CE6-#x0CEF] /* Kannada digits */ | [#x0D66-#x0D6F] /* Malayalam digits */ | [#x0E50-#x0E59] /* Thai digits */ | [#x0ED0-#x0ED9] /* Lao digits */ | [#x0F20-#x0F29] /* Tibetan digits */ | [#xFF10-#xFF19] /* Fullwidth digits */ Ignorable ::= [#x200C-#x200F] /* zw layout */ | [#x202A-#x202E] /* bidi formatting */ | [#x206A-#x206F] /* alt formatting */ | #xFEFF /* zw nonbreak space */ Extender ::= #x00B7 | #x02D0 | #x02D1 | #x0387 | #x0640 | #x0E46 | #x0EC6 | #x3005 | [#x3031-#x3035] | [#x309B-#x309E] | [#x30FC-#x30FE] | #xFF70 | #xFF9E | #xFF9F MiscName ::= '.' | Ignorable | Extender NameChar ::= Letter | Digit | MiscName Name ::= (Letter | '-') (NameChar)* Nmtoken ::= (NameChar)+ Nmtokens ::= Nmtoken (S Nmtoken)* Literal ::= '"' ([^"] | PEReference | CharRef)* '"' | "'" ([^'] | PEReference | CharRef)* "'" QuotedCData ::= '"' ([^"<] | Reference)* '"' | "'" ([^'<] | Reference)* "'" Trivial ::= (PCData | Markup)* Eq ::= S? '=' S? Markup ::= '<' Name (S Name Eq QuotedCData)* S? '>' /* start-tags */ | '' /* end-tags */ | '<' Name (S Name Eq QuotedCData)* S? 
'/>' /* empty elements */ | '&' Name ';' /* entity references */ | '&#' [0-9]+ ';' /* character references */ | '&u-' Hex4 ';' /* character references */ | '' /* comments */ | '' /* CDATA sections */ | '') | ('"' [^"]* '"') | ("'" [^']* "'") | conditionalSect | [^]]* ) ']')? '>' /* doc type declaration */ | ']+)* '?>' /* processing instructions */ PCData ::= [^<&]* Comment ::= '' PI ::= ']+)* '?>' CDSect ::= CDStart CData CDEnd CDStart ::= '])) [^]]*)* CDEnd ::= ']]>' document ::= Prolog element Misc* Prolog ::= XMLDecl Misc* doctypedecl? Misc* XMLDecl ::= '' VersionInfo ::= S 'version' Eq ('"1.0"' | "'1.0'") Misc ::= Comment | PI | S doctypedecl ::= '' internalsubset ::= elementdecl | AttlistDecl | EntityDecl | NotationDecl | PEReference | conditionalSect | PI | S | Comment RMDecl ::= 'RMD' Eq ('NONE' | 'INTERNAL' | 'ALL') STag ::= '<' Name (S Attribute)* S? '>' Attribute ::= Name Eq QuotedCData ETag ::= '' EmptyElement ::= '<' Name (S Attribute)* S? '/>'; content ::= (element | PCData | Reference | CDSect | PI | Comment)* element ::= EmptyElement /* empty elements */ | STag content ETag elementdecl ::= '' Mixed ::= '(' S? '#PCDATA' ( S? '|' S? Name )* S? ')*' | '(' S? '#PCDATA' S? ')' elements ::= (choice | seq) ('?' | '*' | '+')? cp ::= (Name | choice | seq) ('?' | '*' | '+')? cps ::= S? cp S? choice ::= '(' cps ('|' cps)+ ')' seq ::= '(' cps (',' cps)* ')' AttlistDecl ::= '' AttDef ::= S Name S AttType S Default AttType ::= StringType | TokenizedType | EnumeratedType StringType ::= 'CDATA' TokenizedType ::= 'ID' EnumeratedType ::= NotationType | Enumeration NotationType ::= 'NOTATION' S '(' S? Name (S? '|' S? Name)* S? ')' Enumeration ::= '(' S? Nmtoken (S? '|' S? Nmtoken)* S? ')' Default ::= '#REQUIRED' | '#IMPLIED' | ('#FIXED'? 
QuotedCData) conditionalSect ::= '' CSKey ::= PEReference | 'INCLUDE' | 'IGNORE' csdata ::= internalsubset Hex ::= [0-9a-fA-F] Hex4 ::= Hex Hex Hex Hex CharRef ::= '&#' [0-9]+ ';' | '&u-' Hex4 ';' Reference ::= EntityRef | CharRef EntityRef ::= '&' Name ';' PEReference ::= '%' Name ';' EntityDecl ::= '' /* General entities */ | '' /* Parameter entities */ EntityDef ::= Literal | ExternalDef; ExternalDef ::= ExternalID NDataDecl? ExternalID ::= 'SYSTEM' S SystemLiteral SystemLiteral ::= '"' [^"]* '"' | "'" [^']* "'" NDataDecl ::= S 'NDATA' S Name EncodingDecl ::= S 'encoding' Eq QEncoding EncodingPI ::= '' QEncoding ::= '"' Encoding '"' | "'" Encoding "'" Encoding ::= LatinName LatinName ::= [A-Za-z] ([A-Za-z0-9] | '-' | '.')* /* Name comprising only Latin characters */ NotationDecl ::= '' From tbray at textuality.com Fri Mar 7 17:23:20 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:57:32 2004 Subject: XML BNF Message-ID: <3.0.32.19970307092116.0098bd70@pop.intergate.bc.ca> At 04:40 PM 3/7/97 GMT, Henry S. Thompson wrote: >This was taken semi-automatically from the XML source for the spec >document treated as SGML. Thanks Henry - should be useful. Don't throw those scripts away, because I expect this current WG8 process (ANSI is meeting as I type this) to add to the already significant list of changes built up for a new rev. What Henry called "bogus PC codes" are perfectly legit ISO nonbreaking spaces I believe, put in by Jon when he was wrestling with Jade to pretty up the productions. Anyhow, they'll go in the next rev.
-Tim From bosak at atlantic-83.Eng.Sun.COM Sat Mar 8 04:28:29 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:32 2004 Subject: WWW6 demo change Message-ID: <199703080427.UAA02628@boethius.eng.sun.com> In light of the fact that there will be an all-day Developer's Day session on XML/SGML/DSSSL at the World Wide Web Conference, Tim Bray and I (who are in effect coordinating this event) have decided to move the demo session formerly scheduled for Thursday evening, April 10, to Friday afternoon, April 11. The subject of the demos fits in very nicely with the rest of the session and will give the people presenting demos a larger audience. As before, I encourage anyone who has a Web-related XML or DSSSL tool to demonstrate to get in touch with me if you haven't done so already. Jon ---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 2550 Garcia Ave., MPK17-101, Mountain View, California 94043 Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML If a man look sharply and attentively, he shall see Fortune; for though she be blind, yet she is not invisible.
-- Francis Bacon ---------------------------------------------------------------------- From nmikula at edu.uni-klu.ac.at Sat Mar 8 08:57:15 1997 From: nmikula at edu.uni-klu.ac.at (Norbert Mikula) Date: Mon Jun 7 16:57:32 2004 Subject: NXP : New Beta (public identifiers, catalog, more parameter entities) Message-ID: To all (potential) users of NXP I have put a new beta release onto my WWW server. Please have a look at : http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta Release Notes : * Includes Public Identifiers * Includes catalogs (incl. DELEGATE and CATALOG) (see http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta/catalog.html) * Parameter Entities There are more places where parameter entities can be used in the internal subset (please look at entities.xml and entities.dtd in the same directory). I'm still working on this, but *please* do some torture testing with it and send me the results. I am especially interested in the cases that I think should work, but where one additional whitespace or so makes the parser fail. * Name conflicts Conflicts between SGML keywords and SGML names should all be resolved now. I guess I will rewrite the handling of all this by introducing more lexical states. * Attribute defaults should be handled correctly now Have fun :-) !!!!! Special thanks to all of you who have been testing and contributing to NXP so far. !!!!!! FYI: To re-compile you need the latest JavaCC. Could someone take on the burden and do some testing on the validation feature ( -v ). Please .... (see also : http://www.edu.uni-klu.ac.at/~nmikula/NXP/README.html) Best regards, Norbert H.
Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== From Peter at ursus.demon.co.uk Mon Mar 10 01:19:13 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:32 2004 Subject: NXP : New Beta (public identifiers, catalog, more parameter entities) Message-ID: <4452@ursus.demon.co.uk> In message Norbert Mikula writes: > To all (potential) users of NXP > > I have put a new beta release onto my WWW server. > > Please have a look at : > > http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta > > Release Notes : > > * Includes Public Identifiers > > * Includes catalogs (incl. DELEGATE and CATALOG) > (see http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta/catalog.html) > > * Parameter Entities > > More places where parameter entities can be used in the > internal subset (please look at entities.xml and entities.dtd > in the same directory). I'm still working on this, but *please* > do some torture testing with it and send me the results. I am Norbert, I have been trying to get HTML2.0 to 'compile' under NXP. I have changed the required things: - -- -- to --* *-- - %bar; This appears to be OK under sgmls, but in NXP I think I have to write (But I confess this syntax confuses me terribly and I may have got this wrong :-) I am sorry I can't send you examples, but the basic problem is to get HTML2.0 to work with as little editing as possible. [...] > Could someone take the burden and do some testing > on the validation feature ( -v ). Please ....
I am aiming to do this as soon as the DTDs read in OK :-) [...] -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ From Peter at ursus.demon.co.uk Mon Mar 10 01:51:48 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:32 2004 Subject: XML QuotedCData question Message-ID: <4455@ursus.demon.co.uk> This article was posted on comp.text.sgml. I have replied there in general terms, but think there is sufficient ambiguity in the draft that discussion here may be useful. (I am struggling over parameter substitution at present). P. In article <3322BBE5.1E5C@east.sun.com> eric.baatz@east.sun.com "Eric Baatz - Sun Microsystems Labs BOS" writes: > I've been looking over the 14-Nov-96 XML working draft and > > 1. I don't understand why XML's QuotedCData seems to allow > constructs that look like references but aren't. (I am > assuming that such constructs would make life difficult > for parsers.) For example: > > "&fooref;" > > seems to be legal by applying the [^"<] part of the following > eight times. > > QuotedCData := '"' ([^"<] | Reference)* '"' ... > > Is the XML draft not stating some restriction, such as "if it > looks like a reference, it must be a reference"? > > 2. Is there a better place (perhaps more specific to XML) for me > to post XML queries such as #1? > > Thank you in advance for any help.
> > > Eric Baatz > Sun Microsystems Laboratories > 2 Elizabeth Drive, MS UCHL03-207 (508) 442-0257 > Chelmsford, MA 01824 fax: (508) 250-5067 > USA Internet: eric.baatz@east.sun.com > -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ From jjc at jclark.com Mon Mar 10 03:22:58 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:32 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970310031233.00a601c0@jclark.com> At 20:23 07/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote: > > We have an idea about association too. Now we are implementing >it in our DSSSL System and testing. > > Please suggest about this idea if you can. Following is >description. > > >-- Part.1 Basic concept > > This idea is declare style-sheet as external entity and never >refer it in SGML document. And gives that entity notation which >indicate style-sheet. > >ex) In SGML document prolog. > > "ISO 10179-1996//NOTATION > Document Style Semantics and Specification Language//EN"> > > In this example, style-sheet is described with DSSSL notation >and that is identified by system identifier "style-sheet.dsl". >( this DSSSL notation identifier is virtual). > > Application recognizes style-sheet by following steps. > 1) checks declared entities. > 2) checks notation of these external entities. > 3) if some entities have notation which means DSSSL style-sheet, > then that application uses these external entity as style-sheet. I don't think it's safe to assume that an entity is intended to be used as a style sheet for some document simply because it is declared in the document with a style sheet notation.
Suppose, for example, somebody was writing a book about DSSSL: they might declare each of their example style sheets as being entities with the DSSSL notation. Another possibility is that they might be declaring the entity with DSSSL notation so that they could specify it as the style sheet to be used for some other document that they refer to. James From lee at sq.com Mon Mar 10 03:30:14 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:32 2004 Subject: XML QuotedCData question Message-ID: <9703100330.AA04839@sqrex.sq.com> The question about how to expand entities may arise, I think, because XML, like SGML, is not layered. Most programming languages talk explicitly about tokenisation, or tokenization if you prefer :-), and in doing so explain how the sequence of tokens that a compiler (say) sees is derived from an input stream. Usually, comments are stripped at this stage, and in languages such as C or SGML that have (in effect) macros, the macros are expanded at input time. I'd personally like to see a version of the XML spec in which there was no S production, but rather a list of things that are self-delimiting (such as <) and don't require whitespace; the explanation about entities would then be clearer. SGML entities can't all be expanded at input time, since some of them are of differing types (e.g. external files) and must be treated differently. I'm not sure whether this applies to XML general entities or not, but it probably does -- do we have NDATA entities? Maybe when the syntax finally settles down I'll do that.
Lee From lee at sq.com Mon Mar 10 03:38:39 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:32 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents Message-ID: <9703100338.AA04880@sqrex.sq.com> James wrote: > I don't think it's safe to assume that an entity is intended to be used > as a style sheet for some document simply because it is declared in the > document with a style sheet notation. Suppose for example, somebody was > writing a book about DSSSL: they might declare each of their example style > sheets as being entities with the DSSSL notation. [...] This highlights a weakness, I think, present in XML; it's also present in a different way in the WWW. We don't clearly distinguish between the type of an object (e.g. its data format, as determined by NOTATION or MIME media type) and our intended use of it. In HTML the link context -- IMG, A, META, LINK -- determines to some extent how the "remote" resource is to be used, but if a browser follows an A-style link, the resulting action is determined almost entirely by the MIME media type that's discovered. I say almost, since Netscape Navigator has the target=(window/frame name) mechanism to give some attempt at control. Presumably (I admit I'm not up to date on the latest link draft!) in XML the same sort of situation applies. If so, one way to deal with it is to declare multiple NOTATIONs, one for each action you want to use. There ought to be a #DEFAULT notation for representing content negotiation -- what if a particular link might return a JPEG or GIF or PNG image or even descriptive text, but you can't in advance tell which? Sorry, a longish message for a simple point.
Lee -- Liam Quin, lee@sq.com | lq-text freely available Unix text retrieval Senior Technical Consultant | FAQs: Metafont fonts, OPEN LOOK UI, OpenWindows SoftQuad Inc. +1 416 544-9000 | xfonttool (Unix xfontsel in XView) http://www.softquad.com/ | the barefoot programmer From ksaito at flab.fujitsu.co.jp Mon Mar 10 05:00:15 1997 From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp) Date: Mon Jun 7 16:57:32 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents In-Reply-To: <2.2.32.19970310031233.00a601c0@jclark.com> Message-ID: <9703100458.AA01313@sanma.flab.fujitsu.co.jp> James Clark wrote... >I don't think it's safe to assume that an entity is intended to be used >as a style sheet for some document simply because it is declared in the >document with a style sheet notation. Suppose for example, somebody was >writing a book about DSSSL: they might declare each of their example style >sheets as being entities with the DSSSL notation. Thank you for your suggestion. In your first example, the DSSSL book author does not use that example as a DSSSL style-sheet. What he wants to say with the NOTATION is "This is a DSSSL file". I think that, in our idea, the style-sheet entity's notation should mean "THIS IS A STYLE-SHEET which is described in DSSSL". This idea needs a STYLE-SHEET notation. About this, lee@sq.com wrote (thank you for your suggestion), >We don't clearly distinguish between the type of an object (e.g. >its data format, as determined by NOTATION or Mime Media Type) and >our intended use of it. Is using NOTATION to express the intended use of an entity not so good in SGML? I recognize our idea's weak points from your suggestions. What I like about my idea is that no SGML extensions are needed. All of this idea is an application matter.
From lee at sq.com Mon Mar 10 05:37:56 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:32 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents Message-ID: <9703100537.AA05622@sqrex.sq.com> > lee@sq.com wrote ( thank you for your suggestion), > >We don't clearly distinguish between the type of an object (e.g. > >its data format, as determined by NOTATION or Mime Media Type) and > >our intended use of it. > > Using NOTATION as intention for use of entity is not so good in SGML? I did not mean to imply that NOTATION was not a good thing to use in conjunction with an entity in SGML. I wanted to point out that it is used for two different purposes. First, it is used to identify the type of the entity. Second, it is often used to specify how to process the entity. This assumes that all objects of the same type are always processed in the same way. But this is not the case. In any case, I would not expect an arbitrary style sheet to be referred to as an SGML entity unless, as in James' example, you want to refer to it explicitly in a document. A style sheet is not normally part of the actual document itself. If you want to link from a document to a style sheet, you could reasonably (I think) use processing instructions. I don't yet know how XML will decide to do this. Message-ID: <9703100716.AA01315@sanma.flab.fujitsu.co.jp> lee@sq.com wrote... >I did not mean to imply that NOTATION was not a good thing to use >in conjunction with an entity in SGML. I'm sorry, I misunderstood your previous mail.
>I wanted to point out that it is used for two different purposes.
>First, it is used to identify the type of the entity.
>Second, it is often used to specify how to process the entity.

My idea uses NOTATION for the second purpose.

>In any case, I would not expect an arbitrary style sheet to be
>referred to as an SGML entity unless, as in James' example, you
>want to refer to it explicitly in a document.

No, no, I don't want to refer to it in the document. I want to refer to it from the DTD. Many DSSSL style sheets will depend on the DTD, I think. I want to put the style-sheet entity declaration in the DTD.

>A style sheet is not normally part of the actual document itself.

For a well-formed XML document that has no DTD, it is not so bad for the document to carry a style sheet, or a pointer to a style sheet, for portability, I think.

>If yuo want to link from a document to style sheet, you could
>reasonably (I think) use processing instructions.

I think a PI is not a good solution. When I write the style sheet as a PI, I can use only one style-sheet type. And since a PI has no attribute or notation, the application can't recognize the notation of the PI.

---------------------------------------------
KAZUMI Saito
Fujitsu Laboratories Ltd. ISA Lab.

From jjc at jclark.com Mon Mar 10 07:40:31 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:32 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970310072958.00addf08@jclark.com>

At 16:16 10/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote:
>I think PI is not good solution. When I write style-sheet as PI,
>I can only one style-sheet type. And since PI has no attribute or
>notation, application can't recognize notaion of PI.

It depends on how the PI is designed.
We could have a PI that looked like this: James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From ksaito at flab.fujitsu.co.jp Mon Mar 10 07:55:54 1997 From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp) Date: Mon Jun 7 16:57:32 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <2.2.32.19970310072958.00addf08@jclark.com> Message-ID: <9703100754.AA01316@sanma.flab.fujitsu.co.jp> James Clark wrote... >It depends on how the PI is designed. We could have a PI that looked like this: > > > If my understanding is correct "type" in above example is not SGML attribute and SGML parser will not recognize it as attribute. Is this correct? --------------------------------------------- KAZUMI Saito Fujitsu Labotatories Ltd. ISA Lab. xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jjc at jclark.com Mon Mar 10 08:28:10 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:32 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970310081739.00abd628@jclark.com> At 16:54 10/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote: >James Clark wrote... >>It depends on how the PI is designed. We could have a PI that looked like this: >> >> >> > >If my understanding is correct "type" in above example is not >SGML attribute and SGML parser will not recognize it as attribute. >Is this correct? Right. Just like "version" in the XML PI that starts every SGML document. 
James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From ksaito at flab.fujitsu.co.jp Mon Mar 10 08:58:28 1997 From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp) Date: Mon Jun 7 16:57:32 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <2.2.32.19970310081739.00abd628@jclark.com> Message-ID: <9703100856.AA01317@sanma.flab.fujitsu.co.jp> James Clark wrote... >>If my understanding is correct "type" in above example is not >>SGML attribute and SGML parser will not recognize it as attribute. >>Is this correct? > >Right. Just like "version" in the XML PI that starts every SGML document. Then, we need a PI parser in addtion to a SGML parser, don't we ? --------------------------------------------- KAZUMI Saito Fujitsu Labotatories Ltd. ISA Lab. xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Mon Mar 10 10:19:17 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:32 2004 Subject: XML QuotedCData question Message-ID: <4473@ursus.demon.co.uk> It seems that there is enough ambiguity or possible misinterpretation that this is a problem unless tackled. If WG or ERB members are reading this then they might wish to take note. In message <9703100330.AA04839@sqrex.sq.com> lee@sq.com writes: > The question about how to expand entities may arise, I think, because > XML, like SGML, is not layered. > > Most programming languages talk explicitly about tokenisation, > or tokenization if you prefer :-), and in doing so explain how > the sequence of tokens that a compiler (say) sees is derived from > an input stream. 
Usually, comments are stripped at this stage, > and in languages such as C or SGML that have (in effect) macros, > the macros are expanded at input time. Agreed. And having come from C I think in those terms. > > I'd personally like to see a version of the XML spec in which there > was no S production, but rather a list of things that are self-delimiting > (such as <) and don't require whitespace; the explanation about > entities would then be clearer. I hadn't realised this (S) was the problem :-) > > SGML entities can't all be expanded at input time, since some > of them are of differing types (e.g. external files) and must be > treated differently. I'm not sure whether this applies to XML > general entities or not, but it probably does -- do we have > NDATA entities? Entity substitution is very briefly defined in the draft. I don't know what it's like in 8879 (and I'm not going to find out!). I see the following problems: - it is *possible* (though I think unlikely) that not everyone on the ERB agrees as to what is meant to happen during substitution - parser implementers may: * find the spec not well-enough defined * interpret it in different ways - DTD implementers (i.e. those using PEs) may: * find the spec not well-enough defined * interpret it in 'incorrect' ways I have found 'programming' in SGML one of the most tedious and counter-intuitive things I have had to do. The primary problem has been entities, though RE hasn't helped. I had only two ways of proceeding: - if it failed with sgmls it was my fault - Joe English helped a great deal by answering 'simple' questions over e-mails. I finally ended up with a complex, hairy, and totally non-intuitive way (to non-SGML folk) set of DTDs and 'include' files. sgmls was the only way that I could tell whether it was 'right'. 
The only way that we can expect people to develop applications for XML using entities is: - be absolutely clear what we are doing - be as consistent as possible with past practice in SGML and provide guidance on conversion - have 100% accurate parsers - have very clear examples and torture tests - have tutorials My starting point would be to take HTML2.0 (or 3.2 or whatever), and make sure that the spec is capable of 100% accuracy in deciding what should happen. If not it needs revising. At present the immediate problem arises for Norbert (since his is the only validating parser we are working with) and those who are working with it. However PEs are used for other things than validation - I used them to 'add directory names' to a 'list of files' (i.e. manipulation of the location of general entities). Above all, of course, the XML documents must be valid SGML documents and they must give the same 'result' as when processed by sgmls. P. > > Maybe when the syntax settles down finally I'll do that. In a sense this is mainly the interpretation of the syntax and therefore the documentation rather than the productions (have I got that right?) P. > > Lee > > > xml-dev: A list for W3C XML Developers > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To unsubscribe, send to majordomo@ic.ac.uk the following message; > unsubscribe xml-dev > List coordinator, Henry Rzepa (rzepa@ic.ac.uk) > > -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From richard at light.demon.co.uk Mon Mar 10 13:47:37 1997 From: richard at light.demon.co.uk (Richard Light) Date: Mon Jun 7 16:57:33 2004 Subject: Which style first ? 
Re: Associating DSSSL style sheets with documents In-Reply-To: <9703100537.AA05622@sqrex.sq.com> Message-ID: In message <9703100537.AA05622@sqrex.sq.com>, lee@sq.com writes > >I wanted to point out that it is used for two different purposes. >First, it is used to identify the type of the entity. >Second, it is often used to specify how to process the entity. > >This assumes that all objects of the same type are always >processed in the same way. But this is not the case. > >If yuo want to link from a document to style sheet, you could >reasonably (I think) use processing instructions. >I don't yet know how XML will decide to do this. Just a thought. XML is planning to give give us a more powerful [hyper]linking mechanism, with semantics on the links (yes?). Could we use that mechanism to differentiate the differing roles of external entities? Richard Light SGML and Museum Information Consultancy richard@light.demon.co.uk 3 Midfields Walk Burgess Hill West Sussex RH15 8JA U.K. tel. (44) 1444 232067 xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jjc at jclark.com Mon Mar 10 14:33:36 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:33 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970310142302.00a767b8@jclark.com> At 17:56 10/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote: >James Clark wrote... >>>If my understanding is correct "type" in above example is not >>>SGML attribute and SGML parser will not recognize it as attribute. >>>Is this correct? >> >>Right. Just like "version" in the XML PI that starts every SGML document. > >Then, we need a PI parser in addtion to a SGML parser, don't we ? Any use of PIs requires that the application interpret the contents of the PI. 
If you make the syntax of the PI the same as XML start-tags, then you may be able to use your XML parser to parse them. (It would be handy if XML parsers provided a function that parsed a string like the contents of a start-tag.)

James

From jjc at jclark.com Mon Mar 10 14:33:49 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:33 2004
Subject: XML QuotedCData question
Message-ID: <2.2.32.19970310142309.00ae0294@jclark.com>

At 09:24 10/03/97 GMT, Peter Murray-Rust wrote:
>Above all, of course, the XML documents must be valid SGML documents and
>they must give the same 'result' as when processed by sgmls.

You won't in general be able to parse valid XML documents with sgmls. Two obvious reasons are that it doesn't support Unicode and it doesn't allow you to change delimiters.

James

From bosak at atlantic-83.Eng.Sun.COM Mon Mar 10 15:22:11 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:33 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
In-Reply-To: <9703100716.AA01315@sanma.flab.fujitsu.co.jp> (ksaito@flab.fujitsu.co.jp)
Message-ID: <199703101520.HAA04606@boethius.eng.sun.com>

[Kazumi Saito:]

| Many of DSSSL style-sheet will be depend on DTD, I think. I want to
| put style-sheet entity declaration on DTD.

I think that it's a mistake to think of stylesheets as having a one-to-one relationship to DTDs.
It is possible to make one stylesheet work with many DTDs (though this is probably not a good practice), and it is not only possible but also very useful to make a number of different stylesheets work with one DTD. | About well-formed XML document which has no DTD, it is not so bad that | such document has style-sheet or pointer to style-sheet for | portability, I think. For XML this will be the typical case. Typical XML documents transmitted over the Web will not have DTDs and will have to point to (or include) one or more stylesheets that are intended to be used with them. Jon xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Mon Mar 10 15:28:57 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:33 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <2.2.32.19970310072958.00addf08@jclark.com> (message from James Clark on Mon, 10 Mar 1997 14:29:58 +0700) Message-ID: <199703101525.HAA04608@boethius.eng.sun.com> [James Clark:] | >I think PI is not good solution. When I write style-sheet as PI, | >I can only one style-sheet type. And since PI has no attribute or | >notation, application can't recognize notaion of PI. | | It depends on how the PI is designed. We could have a PI that looked | like this: | | | As the naive content producer, I like this approach. (I assume that the value of the href can be any URL and that a browser that understood this syntax would cache the stylesheet just as it would any other recently retrieved resource.) 
Jon xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Mon Mar 10 15:30:31 1997 From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula) Date: Mon Jun 7 16:57:33 2004 Subject: XML QuotedCData question References: <9703100330.AA04839@sqrex.sq.com> Message-ID: <33249550.2CA9@edu.uni-klu.ac.at> lee@sq.com wrote: > Most programming languages talk explicitly about tokenisation, > or tokenization if you prefer :-), and in doing so explain how > the sequence of tokens that a compiler (say) sees is derived from > an input stream. Usually, comments are stripped at this stage, > and in languages such as C or SGML that have (in effect) macros, > the macros are expanded at input time. I don't think that C and SGML/XML use or rather can use the same principle of includes/macros. C uses a pre-processor that resolves includes. Then the actual compiler gets started without having to worry about includes anymore. (To my understanding of things..) For practical reasons, at least for XML processors for online browsers, I think, we don't want to first do the include and then do the parsing, keeping all that stuff in memory while we do so. Furthermore I see problems arise if we have the following scenario : Too much to do for a pre-processor, I guess, it can, or at least should, include the appropriate external entity only after it has parsed and resolved the content of %Dos and %Unix. I am not sure whether I have addressed what you had in mind, but I do believe that XML is too smart for a pre-processor, thus we need other ways to look at PE resolving. -- Best regards, Norbert H. 
Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From tallen at sonic.net Mon Mar 10 15:40:43 1997 From: tallen at sonic.net (Terry Allen) Date: Mon Jun 7 16:57:33 2004 Subject: Style and read-only [was: Which style first?] Message-ID: <199703101540.HAA19301@bolt.sonic.net> Jon responds to Kazumi Saito: | | About well-formed XML document which has no DTD, it is not so bad that | | such document has style-sheet or pointer to style-sheet for | | portability, I think. | | For XML this will be the typical case. Typical XML documents | transmitted over the Web will not have DTDs and will have to point to | (or include) one or more stylesheets that are intended to be used with | them. My question is perhaps off-topic here on xml-dev, and I know everyone is busy preparing for WWW6, but I ask you all to reflect on it as an issue that needs resolution later on: What do I do to associate a style sheet with a read-only document, e.g., one that resides on some other server than my own, or that has been digitally signed? (And assume that this document has a doctype declaration already.) 
Regards, Terry Allen Electronic Publishing Consultant tallen[at]sonic.net specializing in Web publishing, SGML, and the DocBook DTD http://www.sonic.net/~tallen/ A Davenport Group Sponsor: http://www.ora.com/davenport/index.html xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From cbullard at hiwaay.net Mon Mar 10 16:02:05 1997 From: cbullard at hiwaay.net (Len Bullard) Date: Mon Jun 7 16:57:33 2004 Subject: Which style first ? Re: Associating DSSSL style sheets with documents References: <199703101520.HAA04606@boethius.eng.sun.com> Message-ID: <33242DA5.7B78@hiwaay.net> Jon Bosak wrote: > > [Kazumi Saito:] > > | Many of DSSSL style-sheet will be depend on DTD, I think. I want to > | put style-sheet entity declearation on DTD. > > I think that it's a mistake to think of stylesheets as having a > one-to-one relationship to DTDs. It is possible to make one > stylesheet work with many DTDs (though this is probably not a good > practice), I don't see why this is the case. In practice, where stylesheets are in use, the opposite has been the case. Many DTDs share a lot of structures and can in fact, share stylesheets. It is an issue of the degree of overlap. Where content tagging is practiced, it is often convenient to take another format-oriented stylesheet such as one might provide for HTML and use the style information with different GIs. Where DTDs vary only in degree, the same stylesheet is used. One very useful side effect is organizations quickly realize how many non-useful variations they have in their document structures and start looking for ways to winnow these out of their practices. One way to do that cheaply is to parse against a DTD the organization provides. The DTD becomes a corporate policy and a repository of corporate memory. 
This is useful when attempting large rehosting or conversion projects. The last thing I want when converting documents is a large collection of well-formed but inscrutable markup. > | About well-formed XML document which has no DTD, it is not so bad that > | such document has style-sheet or pointer to style-sheet for > | portability, I think. > > For XML this will be the typical case. Typical XML documents > transmitted over the Web will not have DTDs and will have to point to > (or include) one or more stylesheets that are intended to be used with > them. I'm not sure the common HTML web experience to date will be the most predictive model for sound practice with stylesheets or DTDs. The one example we have, HTML does have a DTD for whatever use is made of it. Our experience with DTDless processing was that people quickly found it necessary or convenient to create them although they don't transmit them often as you point out. As people who did not formerly practice SGML will learn, unvalidated markup is a nuisance, having a DTD is the best way to find out what was intended by the originator of a marked up instance, and is a rigorous expression of policy. len bullard xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Mon Mar 10 16:57:50 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:33 2004 Subject: XML QuotedCData question Message-ID: <4497@ursus.demon.co.uk> Thanks James, In message <2.2.32.19970310142309.00ae0294@jclark.com> James Clark writes: > At 09:24 10/03/97 GMT, Peter Murray-Rust wrote: > > >Above all, of course, the XML documents must be valid SGML documents and > >they must give the same 'result' as when processed by sgmls. > > You won't in general be able to parse valid XML documents with sgmls. 
Two > obvious reasons are that it doesn't support Unicode and it doesn't allow you > to change delimiters. Understood. Is this also the same with NSGMLS? (I haven't moved on to these simply for technical porting reasons - my UNIX machine is too old). ? In which case if I rephrase the question to '... by SP or NSGMLS ' is this true? P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Mon Mar 10 16:57:57 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:33 2004 Subject: Transmission of multiple documents Message-ID: <4499@ursus.demon.co.uk> In message <199703101540.HAA19301@bolt.sonic.net> Terry Allen writes: [...] > > My question is perhaps off-topic here on xml-dev, and I know everyone I think this is on-topic and is a more general problem. Please excuse me if I'm wrong :-) > is busy preparing for WWW6, but I ask you all to reflect on it as > an issue that needs resolution later on: What do I do to associate > a style sheet with a read-only document, e.g., one that resides on > some other server than my own, or that has been digitally signed? > (And assume that this document has a doctype declaration already.) I interpret this to be a specific case of a more general problem - how do we associate multiple documents delivered over the WWW? (Among the documents are DTDs, entities of many sorts, transcluded documents, style sheets, and methods (e.g. Java classes). I am no expert here, but at least one person has suggested JAR files. Is this a generic solution for this type of problem (I agree it's probably only one of several solutions?) P. 
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

From lee at sq.com Mon Mar 10 17:18:21 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:33 2004
Subject: XML QuotedCData question
Message-ID: <9703101717.AA18213@sqrex.sq.com>

Thanks for replying, Norbert.

You are taking me a little more literally than I meant -- you're right that macros in C are a cleaner design than the SGML botch, and can be implemented in a separate pass more easily.

However,
>
>
> %DosSpecifics;
> ]>

is very like

#define DOS 1
#ifdef DOS
# include DosSpecifics
#endif

except that CPP allows general expressions there.

It turns out that more robust programming languages avoid macros altogether (e.g. C++) because there is insufficient compile-time checking, but that doesn't really affect XML!

When I've looked at this in the past for SGML, it has seemed to me that one could only do partial expansion with a pre-processor. But really I was thinking of a conceptually separate pass rather than a completely separate one -- you'd need to have some feedback and a shared symbol table.

It may also be appropriate to treat parameter entities and text entities quite differently -- I'm not sure.
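
The "conceptually separate pass with feedback and a shared symbol table" can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any parser discussed on this list works: the function name expand_parameter_entities and the expansion limit are hypothetical, only internal parameter entities with quoted literal values are handled, and references inside a literal are expanded when the entity is used rather than when it is declared.

```python
import re

# One token is either an internal parameter-entity declaration or a %name; reference.
_TOKEN = re.compile(
    r"""<!ENTITY\s+%\s+(?P<dname>[\w.-]+)\s+(?:"(?P<dq>[^"]*)"|'(?P<sq>[^']*)')\s*>|%(?P<rname>[\w.-]+);"""
)

def expand_parameter_entities(text, max_expansions=1000):
    """Expand %name; references in a single left-to-right pass.

    Replacement text is pushed back onto the input, so declarations that
    appear inside it reach the shared symbol table (the "feedback").
    """
    symbols = {}          # shared symbol table: entity name -> replacement text
    out = []
    pending = text
    while True:
        m = _TOKEN.search(pending)
        if m is None:
            out.append(pending)
            break
        out.append(pending[:m.start()])
        rest = pending[m.end():]
        if m.group("dname") is not None:
            value = m.group("dq") if m.group("dq") is not None else m.group("sq")
            # The first declaration of a name is the one that counts.
            symbols.setdefault(m.group("dname"), value)
            out.append(m.group(0))
            pending = rest
        else:
            name = m.group("rname")
            if name not in symbols:
                raise KeyError("undeclared parameter entity: " + name)
            max_expansions -= 1
            if max_expansions < 0:
                raise RecursionError("too many expansions (recursive entity?)")
            # Feedback: rescan the replacement text together with the rest.
            pending = symbols[name] + rest
    return "".join(out), symbols

if __name__ == "__main__":
    dtd = """<!ENTITY % a "<!ENTITY % b 'X'>"> %a; %b;"""
    expanded, table = expand_parameter_entities(dtd)
    print(expanded)
    print(table)
```

In the sample input, %b; is only declared once %a; has been expanded, so a purely textual pre-pass that collected declarations up front would not know about it. That is the kind of dependency that makes a fully separate pre-processor awkward.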
Lee

From lee at sq.com Mon Mar 10 17:25:17 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:33 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <9703101724.AA18420@sqrex.sq.com>

Jon wrote:

> [James Clark:]
> |
> |
>
> As the naive content producer, I like this approach.

Yes. When I mentioned using a PI in my reply to the person from Japan (I'm sorry, I don't have your name!), this was exactly the sort of thing I had in mind.

But then, this is more or less what Panorama does.

Of course, it'd have to be for XML, no?

> (I assume that the value of the href can be any URL and that a browser
> that understood this syntax would cache the stylesheet just as it
> would any other recently retrieved resource.)

Yes, most browsers cache all remote resources that they fetch through URLs.

I would expect the href to be a relative/partial URL, as per James' example, so treated as relative to the document containing the PI -- normally either the DTD or the actual body. For Panorama, I think the PI has to be in the DTD or subset, although it might be OK if it's before the DOCTYPE line too.

Note that this is exactly the same problem as finding the DTD, and the same mechanisms ought to apply. Ideally, one would be able to fetch the first/main style sheet and the DTD at the same time, for the earliest possible display; since the DTD is optional, clearly the style sheet code can work without it.
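
A minimal sketch of what an application might do with such a PI, assuming the PI data uses start-tag-like name="value" pairs as James suggested. The pseudo-attribute names href and type come from this thread; the function names, the sample URL and the sample type value are hypothetical, and the PI target itself is left out because nothing has been agreed.

```python
import re
from urllib.parse import urljoin

# name="value" or name='value', the same shape as attributes in a start-tag.
_PSEUDO_ATTR = re.compile(
    r"""([A-Za-z_][\w.\-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')"""
)

def parse_pseudo_attributes(pi_data):
    """Return the name/value pairs found in the data part of a PI."""
    attrs = {}
    for m in _PSEUDO_ATTR.finditer(pi_data):
        name, dq, sq = m.group(1), m.group(2), m.group(3)
        attrs[name] = dq if dq is not None else sq
    return attrs

def resolve_stylesheet(pi_data, containing_entity_url):
    """Resolve href against the URL of the entity that contains the PI."""
    attrs = parse_pseudo_attributes(pi_data)
    href = attrs.get("href")
    if href is None:
        return None
    return attrs.get("type"), urljoin(containing_entity_url, href)

if __name__ == "__main__":
    # Hypothetical PI data and document URL, for illustration only.
    print(resolve_stylesheet('href="style/book.dsl" type="text/dsssl"',
                             "http://example.org/docs/book.xml"))
```

The parser itself never sees these pairs as attributes; the application does the interpretation, and the base for a relative href is the entity holding the PI, which may be the DTD rather than the document body.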
Lee xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Mon Mar 10 19:46:54 1997 From: nmikula at edu.uni-klu.ac.at (Norbert Mikula) Date: Mon Jun 7 16:57:33 2004 Subject: XML Parser API : complete grove 0.1 Message-ID: I have, rather quick-hack like, put together a Java interface for an API (to NXP) that should be able to communicate a complete grove to an application. In other words, an application, after parsing, should be able to store back the document to the form it loaded it. Please do comment on it ! It is just another piece to something that hopfully one day will be a complete reference API to (Java based) XML parsers. Please have a look at : http://www.edu.uni-klu.ac.at/~nmikula/NXP/NXP.doc/ Please note that other classes that can be accessed from there have a lot of public methods that later on will be "privatised". (They have not been cleaned up either....) Best regards, Norbert H. 
Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jjc at jclark.com Tue Mar 11 04:08:07 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:33 2004 Subject: XML QuotedCData question Message-ID: <2.2.32.19970311035738.00676784@jclark.com> At 16:22 10/03/97 GMT, Peter Murray-Rust wrote: >Thanks James, > >In message <2.2.32.19970310142309.00ae0294@jclark.com> James Clark writes: >> At 09:24 10/03/97 GMT, Peter Murray-Rust wrote: >> >> >Above all, of course, the XML documents must be valid SGML documents and >> >they must give the same 'result' as when processed by sgmls. >> >> You won't in general be able to parse valid XML documents with sgmls. Two >> obvious reasons are that it doesn't support Unicode and it doesn't allow you >> to change delimiters. > >Understood. Is this also the same with NSGMLS? No. > (I haven't moved on to these >simply for technical porting reasons - my UNIX machine is too old). ? In which >case if I rephrase the question to '... by SP or NSGMLS ' is this true? Yes, with an appropriate SGML declaration. However you won't be able to parse well-formed but invalid XML documents, and you won't be able to validate XML documents for constraints that are in XML but not in SGML. 
James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jjc at jclark.com Tue Mar 11 04:08:31 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:33 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970311035745.00aecf3c@jclark.com> At 12:24 10/03/97 EST, lee@sq.com wrote: >Jon wrote: > >> [James Clark:] >> | >> | >> >> As the naive content producer, I like this approach. >Yes. When I mentioned using a PI in my reply to the person from >Japan (I'm sorry, I don't have your name!), this was exactly the >sort of thing I had in mind. > >But then, this is more or less what Panorama does. > >Of course, it'd have to be > >for XML, no? Well, this is something that is applicable to SGML in general not just to XML. Since >I would expect the href to be a relative/parital URL, as per James' >example, so treated as relative to the document containing the PI -- >normally either the DTD or the actual body. Right. It would be relative to the entity containing the PI. James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Tue Mar 11 09:10:22 1997 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:57:33 2004 Subject: Style and read-only [was: Which style first?] 
In-Reply-To: Terry Allen's message of Mon, 10 Mar 1997 07:40:21 -0800 References: <199703101540.HAA19301@bolt.sonic.net> Message-ID: <2068.199703110910@grogan.cogsci.ed.ac.uk> > My question is perhaps off-topic here on xml-dev, and I know everyone > is busy preparing for WWW6, but I ask you all to reflect on it as > an issue that needs resolution later on: What do I do to associate > a style sheet with a read-only document, e.g., one that resides on > some other server than my own, or that has been digitally signed? > (And assume that this document has a doctype declaration already.) Create a stub document with the SAME DTD which has a single top-level element which replaces itself (using XML-LINK) with the document you care about. Or if you don't like links, like this ]> &rod; and in either case associate the style sheet with your stub in whatever way we end up agreeing on. ht xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From dgd at cs.bu.edu Tue Mar 11 15:46:45 1997 From: dgd at cs.bu.edu (David Durand) Date: Mon Jun 7 16:57:33 2004 Subject: Style and read-only [was: Which style first?] In-Reply-To: <2068.199703110910@grogan.cogsci.ed.ac.uk> References: Terry Allen's message of Mon, 10 Mar 1997 07:40:21 -0800 <199703101540.HAA19301@bolt.sonic.net> Message-ID: At 9:10 AM +0000 3/11/97, Henry S. Thompson wrote: And Terry Allen wrote: >> My question is perhaps off-topic here on xml-dev, and I know everyone >> is busy preparing for WWW6, but I ask you all to reflect on it as >> an issue that needs resolution later on: What do I do to associate >> a style sheet with a read-only document, e.g., one that resides on >> some other server than my own, or that has been digitally signed? >> (And assume that this document has a doctype declaration already.) 
First, I want to observe that Terry's point is very important... So we really need to address it. It cuts to the core of why stylesheet information needs to be loosely bound to a document. While I think that binding style information into documents at all is a short-sighted practice, what is more important is the ability to bind _new_ style information onto the document _later_. If you have that you can always ignore old, useless, or unwanted styles that are packaged with a document.

>Create a stub document with the SAME DTD which has a single top-level
>element which replaces itself (using XML-LINK) with the document you
>care about.

This should not work, as linking should cause stylesheet replacement -- and adding stylesheet semantics to links is worse.

>Or if you don't like links, like this
>
>
>
>]>
>
>&rod;
>

This doesn't work when &rod; contains a doctype declaration -- which was exactly Terry's point. I think that CATALOG-based proposals may be the best way to accommodate such needs. Everything proposed for the style PI could fit as easily into the catalog, and be more general, and less tightly bound to the document.

>and in either case associate the style sheet with your stub in
>whatever way we end up agreeing on.

The problem is that you may not be able to create such a stub.

Here's the (practical) stylesheet problem that really bothers me: HTTP 1.0 uses single connections per resource, and even HTTP 1.1 sends resources serially down the wire, although it can re-use the connection. This means that it will be hard to do incremental display of XML documents unless we can get the stylesheet coming down the wire _before_ the document itself. This seems problematic on several counts. Since HTTP 1.1 is meant to make multiple connections to the same server unnecessary, the easy fix is ruled out by good network citizenship. This is another place where getting a CATALOG could tell you quickly what resources need to be fetched, and would let you get them in the right order.
I know that we hope that many stylesheets will be cached at the client, but we can't count on that, especially from what I think I remember about cache coherence on the Web. -- David -- David _________________________________________ David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com Boston University Computer Science \ Sr. Analyst http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams --------------------------------------------\ http://dynamicDiagrams.com/ MAPA: mapping for the WWW \__________________________ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From tallen at sonic.net Tue Mar 11 15:48:14 1997 From: tallen at sonic.net (Terry Allen) Date: Mon Jun 7 16:57:33 2004 Subject: Style and read-only [was: Which style first?] Message-ID: <199703111548.HAA13791@bolt.sonic.net> Henry Thompson writes in response to me: | > My question is perhaps off-topic here on xml-dev, and I know everyone | > is busy preparing for WWW6, but I ask you all to reflect on it as | > an issue that needs resolution later on: What do I do to associate | > a style sheet with a read-only document, e.g., one that resides on | > some other server than my own, or that has been digitally signed? | > (And assume that this document has a doctype declaration already.) | | Create a stub document with the SAME DTD which has a single top-level | element which replaces itself (using XML-LINK) with the document you | care about. Then the top-level element has to be a linking element, which is not true of most DTDs. But creating your own document is necessary, I think; it may have to be an instance of a DTD that defines the relations among the things pointed to. The other approach I can think of is a MIME type constructed for the purpose. 
| Or if you don't like links, like this | | | | ]> | | &rod; | | | and in either case associate the style sheet with your stub in | whatever way we end up agreeing on. That won't work if the read-only document has a doctype declaration, unless XML allows multiple doctype declarations (or I'm missing something). Regards, Terry Allen Electronic Publishing Consultant tallen[at]sonic.net specializing in Web publishing, SGML, and the DocBook DTD http://www.sonic.net/~tallen/ A Davenport Group Sponsor: http://www.ora.com/davenport/index.html xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Tue Mar 11 17:22:47 1997 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 16:57:34 2004 Subject: XML QuotedCData question Message-ID: <199703111722.RAA22137@deacon.cogsci.ed.ac.uk> I have another couple of questions about quoted cdata. (1) How should &foo!bar; be interpreted? According to the BNF it is completely valid, and not a reference. This seems undesirable from the point of view of human readability. (2) Why is left angle bracket excluded from quoted cdata? (3) Is the answer to (1) and (2) that it really should be ampersand that is excluded? -- Richard xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Fri Mar 14 07:17:54 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:34 2004 Subject: XML parsers hit the big time Message-ID: <199703140716.XAA07782@boethius.eng.sun.com> Congratulations, XML implementors! You've just become strategic to a big industry initiative! 
From Microsoft's press release announcing the Channel Definition Format (http://www.microsoft.com/corpinfo/press/1997/Mar97/Cdfrpr.htm):

CDF will be easy for Web developers to adopt because it is based on XML, which has support among many third parties. XML has public domain software written in Java and other languages available now that can be used to parse CDF files. The CDF specification submission extends XML and Web Collections work that the W3C has in progress. These efforts will allow for open, HTML-based Web broadcasting based on standards-based technologies that are expected to have strong support among W3C members. Microsoft looks forward to other leading Web developers joining in support of this open standards effort.

Jon

From richard at cogsci.ed.ac.uk Fri Mar 14 18:05:00 1997
From: richard at cogsci.ed.ac.uk (richard@cogsci.ed.ac.uk)
Date: Mon Jun 7 16:57:34 2004
Subject: References in default attribute values
Message-ID: <29787.199703141804@pitcairn.cogsci.ed.ac.uk>

If a default for an attribute value contains an entity reference, must the entity be declared before attribute list declaration? I cannot see such a requirement, and I find this surprising since there *is* such a requirement for (parameter) references in entity declarations.

-- Richard

From elm at arbortext.com Fri Mar 14 18:14:28 1997
From: elm at arbortext.com (Eve L.
Maler) Date: Mon Jun 7 16:57:34 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <3.0.16.19970314131001.35af3f68@village.doctools.com> At 10:57 AM 3/11/97 +0700, James Clark wrote: >At 12:24 10/03/97 EST, lee@sq.com wrote: >>Of course, it'd have to be >> >>for XML, no? > >Well, this is something that is applicable to SGML in general not just to >XML. Since would rather use simply browser should probably make the keyword user configurable. This is interesting: Should an XML effort determine a PI that should be usable in general by SGML documents? I tend to think that the "authority" that invents/maintains the format of the PI should be identified, and "XML" sort of fits the bill, similarly to . This way, " Message-ID: <3329A2AF.2906@hiwaay.net> Eve L. Maler wrote: > > I've also been beating the drum on the WG list about how our PIs should > have "GIs" as well as "attribute specs," so I'd prefer to see stylesheet att1="val1" att2="val2"... ?>. This way, " so that it will be processed by an XML-aware processor, and the rest > identifies the semantics of the instruction. This looks weirdly like DTD/instance built into the XML instance. So, XML then defines an application inside the instance? I understand it because this is how IADS and IDE/AS did links originally. However, it created interoperability problems and does to this day. What is the difference between this and a tag bag of empty elements included at the top of a DTD? 
len xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jenglish at crl.com Fri Mar 14 19:49:56 1997 From: jenglish at crl.com (Joe English) Date: Mon Jun 7 16:57:34 2004 Subject: References in default attribute values In-Reply-To: <29787.199703141804@pitcairn.cogsci.ed.ac.uk> References: <29787.199703141804@pitcairn.cogsci.ed.ac.uk> Message-ID: <199703141948.AA01527@mail.crl.com> richard@cogsci.ed.ac.uk wrote: > If a default for an attribute value contains an entity reference, must > the entity be declared before attribute list declaration? I cannot > see such a requirement, and I find this surprising since there *is* > such a requirement for (parameter) references in entity declarations. I'm not positive about the rules in XML, but in SGML it _is_ necessary for the general entity declaration to appear first, as near as I can tell. (SGMLS agrees) By productions [143], [147], [33], and [34], the default value in an attribute definition is parsed as replaceable character data, which means that general entity references are recognized and replaced, and the rule that entities must be declared before they are referenced applies. [ Another, erm, interesting fact is that parameter entity references are _not_ replaced in attribute value literals in ATTLIST declarations. E.g.: A1 CDATA #FIXED %e1; -- and here -- A2 CDATA #FIXED "%e1;" -- but not here! -- > I've been fooled by this more than once... 
] --Joe English jenglish@crl.com [143] attribute definition (11.3, 421:1) = ( attribute name [144], +ps [65], declared value [145], +ps [65], default value [147] ) [147] default value (11.3.4, 425:1) = ( ( ?( rni ("#"), "FIXED", +ps [65] ), attribute value specification [33] ) | ( rni ("#"), ( "REQUIRED" | "CURRENT" | "CONREF" | "IMPLIED" ) ) ) [33] attribute value specification (7.9.3, 331:1) = ( attribute value [35] | attribute value literal [34] ) [34] attribute value literal (7.9.3, 331:4) = ( ( lit ("\""), replaceable character data [46], lit ("\"") ) | ( lita ("'"), replaceable character data [46], lita ("'") ) ) xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From elm at arbortext.com Fri Mar 14 20:21:59 1997 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 16:57:34 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <3.0.16.19970314151234.1c5f0bce@village.doctools.com> At 01:10 PM 3/14/97 -0600, Len Bullard wrote: >Eve L. Maler wrote: >> >> I've also been beating the drum on the WG list about how our PIs should >> have "GIs" as well as "attribute specs," so I'd prefer to see > stylesheet att1="val1" att2="val2"... ?>. This way, "> so that it will be processed by an XML-aware processor, and the rest >> identifies the semantics of the instruction. > >This looks weirdly like DTD/instance built into the XML instance. >So, XML then defines an application inside the instance? > >I understand it because this is how IADS and IDE/AS did links >originally. However, it created interoperability problems >and does to this day. What is the difference between this >and a tag bag of empty elements included at the top of a DTD? 
> >len The difference is that, by convention, you're making PI markup available that's available to every document and to every *location* in a document if necessary, no matter what its DTD (and no matter whether it even has one). It just happens to look suspiciously like a start-tag, which may be helpful to any software that has to parse the PI string. I don't think links in general should be done this way, but I do believe in PIs being used for, uh, instructions to processors. (In other words, I'm not 100% against PIs, as some people are.) In particular, I'm starting to get very fond of PIs for anything that has to be specified per entity. Eve xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Fri Mar 14 20:32:52 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:34 2004 Subject: Revised workshop: "XML: Where do we go from here?" Message-ID: <199703142031.MAA08159@boethius.eng.sun.com> The WWW6 workshop formerly titled "Delivery of Structured Documents over the Web" has been reorganized in light of recent announcements. Already driven by a rapidly growing set of implementations developed by individual experimenters, XML reached critical mass with the announcement this week of industry initiatives using XML as an enabling technology [1,2,3]. Now that XML seems assured a place in the pantheon of Internet standards, the question is, where do we go from here? This workshop will explore a variety of topics based on the interests of people actively working with XML. Representative topics include: APIs for XML parsers The role of Java in XML Is the grove concept helpful? Enabling a new authoring experience XML and Web objects XML stylesheets: CSS, DSSSL, or both? 
XML/HTML integration The workshop format will be a series of short presentations, one per participant, with a period of discussion following each presentation. The workshop will begin with a review of recent developments and an orientation to the larger picture that includes XML syntax, XML linking, scripting languages, and stylesheets. The purpose of this workshop is to explore future directions and offer XML experimenters an opportunity to exchange ideas and experiences. There are still places available in this workshop for qualified participants. Note that you must register for the workshop separately from the rest of the conference; for details, see http://www6conf.slac.stanford.edu/ If you are interested in participating in the XML workshop, please send a 1-2 paragraph summary of a topic that you would like to present to jon.bosak@sun.com The workshop materials are due immediately, so a response by Monday morning, March 16, is required for presentations that will be archived on the conference CD. [1] http://www.w3.org/pub/WWW/Submission/1997/2/Overview.html [2] http://www.w3.org/pub/WWW/Submission/1997/3/Overview.html [3] http://www.microsoft.com/corpinfo/press/1997/Mar97/Cdfrpr.htm ---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 2550 Garcia Ave., MPK17-101, Mountain View, California 94043 Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML If a man look sharply and attentively, he shall see Fortune; for though she be blind, yet she is not invisible. 
-- Francis Bacon
----------------------------------------------------------------------

From cbullard at hiwaay.net Fri Mar 14 21:05:37 1997
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
References: <3.0.16.19970314151234.1c5f0bce@village.doctools.com>
Message-ID: <3329BAE0.3CC1@hiwaay.net>

Eve L. Maler wrote:
>
>
> The difference is that, by convention, you're making PI markup available
> that's available to every document and to every *location* in a document if
> necessary, no matter what its DTD (and no matter whether it even has one).
> It just happens to look suspiciously like a start-tag, which may be helpful
> to any software that has to parse the PI string.

By convention? You mean, by application. An inclusion on root makes an empty element available to every location. A PI is something every document has to have. That isn't an improvement. If you use a DOCTYPE and know the DTD, don't you get the same effect? XML goes out of its way to load up an instance just to get around a DTD. I question the utility of that. We tell them they are being freed of fixed markup, then add a question mark and say, oh, that's OK, that's XML.

> I don't think links in general should be done this way, but I do believe in
> PIs being used for, uh, instructions to processors.

Ummm... sure. Sort of what links are.

> (In other words, I'm
> not 100% against PIs, as some people are.) In particular, I'm starting to
> get very fond of PIs for anything that has to be specified per entity.

No doubt.
len xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From elm at arbortext.com Fri Mar 14 22:39:24 1997 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 16:57:34 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <3.0.16.19970314173449.1c972e2e@village.doctools.com> At 02:53 PM 3/14/97 -0600, Len Bullard wrote: >Eve L. Maler wrote: >> >> >> The difference is that, by convention, you're making PI markup available >> that's available to every document and to every *location* in a document if >> necessary, no matter what its DTD (and no matter whether it even has one). >> It just happens to look suspiciously like a start-tag, which may be helpful >> to any software that has to parse the PI string. > >By convention? You mean, by application. I'm not sure I catch your distinction. If we agree on a meaning and a syntax for it, we've made a convention. (Like when everyone asks "How are you?" and expects a short, positive answer. :-) Applications can now predictably act on the usage of the convention. (Like when someone starts to walk away after a moment, safe -- usually! -- in the assumption that the other person just answered "I'm fine.") >An inclusion on root makes an empty element available >to every location. A PI is something every document has to have. >That isn't an improvement. If you use a DOCTYPE and know the DTD, >don't >you get the same effect? XML goes out it's way to load up an >instance just to get around a DTD. I question the utility of that. >We tell them they are being freed of fixed markup, then add a >question mark and say, oh, that's OK, that's XML. But XML doesn't have inclusions, and any one document may not even have DTDs. So your "ifs" sometimes don't come true. 
I agree that we don't want to push legitimate DTD functions into PIs, which give you a lot less validation power. But processing instructions (in the regular English sense) don't belong in the normal markup scheme most of the time. >> I don't think links in general should be done this way, but I do believe in >> PIs being used for, uh, instructions to processors. > >Ummm... sure. Sort of what links are. Well, a reference to a stylesheet is surely a link, but not all links are references to stylesheets. Also, not all processing instructions are links to something. Do you think PIs are never appropriate? >> (In other words, I'm >> not 100% against PIs, as some people are.) In particular, I'm starting to >> get very fond of PIs for anything that has to be specified per entity. > >No doubt. > >len Eve xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Mar 14 23:26:21 1997 From: cbullard at hiwaay.net (Len Bullard) Date: Mon Jun 7 16:57:34 2004 Subject: Associating DSSSL style sheets with documents References: <3.0.16.19970314173449.1c972e2e@village.doctools.com> Message-ID: <3329DBE3.787B@hiwaay.net> Eve L. Maler wrote: > > At 02:53 PM 3/14/97 -0600, Len Bullard wrote: > >Eve L. Maler wrote: > >> > >> > >> The difference is that, by convention, you're making PI markup available > >> that's available to every document and to every *location* in a document if > >> necessary, no matter what its DTD (and no matter whether it even has one). > >> It just happens to look suspiciously like a start-tag, which may be helpful > >> to any software that has to parse the PI string. > > > >By convention? You mean, by application. > > I'm not sure I catch your distinction. If we agree on a meaning and a > syntax for it, we've made a convention. 
If we agree on a convention, one of us can break it at any time without a serious penalty. If we make a contract, either can enforce it. The PI is a contract. So is the DTD we're trying to avoid with a hack. > But XML doesn't have inclusions, and any one document may not even have > DTDs. So your "ifs" sometimes don't come true. I agree that we don't want > to push legitimate DTD functions into PIs, which give you a lot less > validation power. But processing instructions (in the regular English > sense) don't belong in the normal markup scheme most of the time. Then why are they in the data? Why were they deprecated? What is in the SGML Way that is being overlooked here? Why is it being overlooked? Which is wrong: the SGML Way or the use of PIs? IOW, what the PIs you suggest do is put metainformation inside an instance. Why? What is it they will convey that an XML engine will not already know by reading the specification or could know by reading a DTD? Is the DTD not there simply because members of the Working Group don't want them to be but now can't find a way to get around the functionality they provided? > Well, a reference to a stylesheet is surely a link, but not all links are > references to stylesheets. Also, not all processing instructions are links > to something. Do you think PIs are never appropriate? I didn't say that. I'm wondering why they are suddenly a preferred practice when they were formerly a deprecated practice? What is worse, a DTD I send once and might be very small, or PIs I send every time? 
len xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From tbray at textuality.com Fri Mar 14 23:36:28 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:57:34 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <3.0.32.19970314153455.009a2ae0@pop.intergate.bc.ca> At 05:14 PM 3/14/97 -0600, Len Bullard wrote: >IOW, what the PIs you suggest do is put metainformation inside >an instance. Why? What is it they will convey that an XML engine >will not already know by reading the specification or could know >by reading a DTD? Well, a DTD, considered as metadata, is pretty thin. It doesn't contain any semantic information, nor much in the way of strong data typing. I can't think of much that is useful for downstream processing that would naturally live inside a DTD. The problem of packaging, of tying the things that you *do* need (stylesheets, topical metadata, typing rules) to documents is a real one and worth spending time on. But there is no reason to believe that a DTD is a very important part of such a solution. Secondly, the distinction between data and metadata is, at a deep level, bogus; totally in the eye of the beholder. For this reason, it is always good and never bad to make what the author may consider metadata available along with what the author considers data. Because the author is usually wrong. >[re PI's:] I'm wondering why they are suddenly a preferred >practice when they were formerly a deprecated practice? What is >worse, a DTD I send once and might be very small, or PIs I send >every time? Reasonable people may disagree. I have no trouble in saying that I think that PIs are a useful thing, and a necessary part of real-world document processing. 
Thus, yes (gasp) I disagree with the language in the SGML standard deprecating PIs, and I see no reason for us to consider ourselves bound by it. As for once vs. many, I think that it is in general A Good Thing for documents on the web to be self-contained whenever possible. And while I think it is indeed smart to try to avoid retransmitting fixed ancillary files (metadata, stylesheets, whatever), I don't think that this class of files includes DTDs that often for the downstream processing tasks I've seen. - Tim xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Fri Mar 14 23:47:10 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:34 2004 Subject: XML parsers hit the big time Message-ID: <4640@ursus.demon.co.uk> In message <199703140716.XAA07782@boethius.eng.sun.com> bosak@atlantic-83.Eng.Sun.COM (Jon Bosak) writes: > Congratulations, XML implementors! You've just become strategic to a > big industry initiative! [...notice of CDF release...] I'd like to welcome the involvement of commercial developers and extend a special welcome to any newcomers to contribute to the list. [We appreciate that involvement in an XML-project may be sensitive and that you may not wish to publicise this.] I think it's particularly important that fuzzy areas of the spec are discussed, because none of us want 'very slightly different versions of XML'. P. 
-- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Fri Mar 14 23:47:13 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:34 2004 Subject: Indexing of XML documents Message-ID: <4641@ursus.demon.co.uk> I hope I can express this problem clearly - I'm sure that you are familiar with it. When we need to resolve a TEI pointer like (id a23) we may have to scan the whole document. In general we will wish to cache (index) IDs since we don't wish to rescan for another search. One obvious place to do this is when the document is first read in (admittedly there may never be a need to scan the whole document). When validating a document the IDs, GIs and ATTNAMEs all have to be scanned since they occur in VC's. Presumably as a by-product of validation we can at least expect a hashtable of IDs (and possibly GIs). The question is, should we do both of these by default (or even others that I haven't thought of)? Or should we do none and leave it to the app? Or should the parser have a switch? P. [BTW a WF document can have multiple identical IDs, OK? Presumably the behaviour of an app that has to reference them is 'undefined'?] 
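One way to picture the "parser switch" option is a handler that collects IDs in the same pass that reads the document, and that records repeated IDs instead of rejecting them. What follows is only a minimal sketch in Python using the standard `xml.sax` module. The class name `IdIndexingHandler`, the `build_index` flag, and the choice to treat any attribute literally named `id` as an ID are all assumptions made for illustration; without a DTD a parser cannot know which attributes are declared as type ID.

```python
import xml.sax
from collections import defaultdict


class IdIndexingHandler(xml.sax.ContentHandler):
    """Collects ID values during a single parsing pass (hypothetical helper)."""

    def __init__(self, build_index=True, id_attribute="id"):
        super().__init__()
        self.build_index = build_index    # the "switch": index or leave it to the app
        self.id_attribute = id_attribute  # assumption: IDs are recognised by name only
        self.index = defaultdict(list)    # ID value -> [(GI, element position), ...]
        self._position = 0

    def startElement(self, name, attrs):
        # Count every start-tag so each entry records where the element occurred.
        self._position += 1
        if self.build_index and self.id_attribute in attrs:
            self.index[attrs[self.id_attribute]].append((name, self._position))

    def duplicates(self):
        """Return only the IDs that occur on more than one element."""
        return {key: hits for key, hits in self.index.items() if len(hits) > 1}


def index_ids(text, build_index=True):
    """Parse a well-formed document and return the populated handler."""
    handler = IdIndexingHandler(build_index=build_index)
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler
```

Because the table maps each ID to a list, an application that meets a repeated ID in a well-formed document can at least detect the situation and decide for itself what to do.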
-- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From lee at sq.com Sat Mar 15 02:24:16 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:34 2004 Subject: Indexing of XML documents Message-ID: <9703150224.AA05729@sqrex.sq.com> > When we need to resolve a TEI pointer like (id a23) we may have to scan > the whole document. This all depends on who "we" is taken to be. A web indexing robot doesn't need to resolve tei pointers at all, except to identify the remote document -- it then indexes the whole thing. > In general we will wish to cache (index) IDs since > we don't wish to rescan for another search. I don't follow this. Under what circumstances is searching a document for an ID much more painful than using a cache? Is this for 100 MByte documents? (which do exist, by the way, droves. No, like elephants, in herds) > When validating a document the IDs, GIs and ATTNAMEs all have to be scanned > since they occur in VC's. Not sure what a VC is (validatable context??) but yes, they all have to be validated. > Presumably as a by-product of validation we can > at least expect a hashtable of IDs (and possibly GIs). I think that should be application-specific. You might provide a hash table interface to make it easier, though. 
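To make the trade-off concrete, here is a rough sketch of both approaches side by side: a plain scan for one ID, and an application-side hash table that is only built the first time somebody asks for an ID. It uses Python's standard `xml.etree.ElementTree`. The names `find_by_scan` and `LazyIdTable` are invented for this example, and treating an attribute named `id` as the ID is an assumption, since nothing here reads a DTD.

```python
import xml.etree.ElementTree as ET


def find_by_scan(root, id_value, id_attribute="id"):
    """Linear search: walk the tree until the first matching ID turns up."""
    for element in root.iter():
        if element.get(id_attribute) == id_value:
            return element
    return None


class LazyIdTable:
    """Hash-table view of a parsed document, built only on first lookup."""

    def __init__(self, root, id_attribute="id"):
        self._root = root
        self._id_attribute = id_attribute
        self._table = None  # nothing is indexed until lookup() is called

    def lookup(self, id_value):
        if self._table is None:
            self._table = {}
            for element in self._root.iter():
                value = element.get(self._id_attribute)
                if value is not None:
                    # Keep the first occurrence, matching find_by_scan.
                    self._table.setdefault(value, element)
        return self._table.get(id_value)
```

For a small document the scan is probably all anyone needs; the table only starts to pay for itself when the same large document is searched many times.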
Lee

From cbullard at hiwaay.net Sat Mar 15 04:11:02 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
References: <3.0.32.19970314153455.009a2ae0@pop.intergate.bc.ca>
Message-ID: <332A2143.7AE3@hiwaay.net>

Tim Bray wrote:
>
> Well, a DTD, considered as metadata, is pretty thin. It doesn't
> contain any semantic information, nor much in the way of strong
> data typing. I can't think of much that is useful for downstream
> processing that would naturally live inside a DTD.

You must not do much conversion work. It is awfully handy the first time one sees the document instance. It sure is the cheap way (say, non-programming) to figure out what the intended structure is.

> The problem of
> packaging, of tying the things that you *do* need (stylesheets,
> topical metadata, typing rules) to documents is a real one and worth
> spending time on. But there is no reason to believe that a DTD
> is a very important part of such a solution.

Tying stylesheets, no, you are right. Topical metadata, typing rules, I'm not so sure. Most of SGML practice to date works something like that.

> Secondly, the distinction between data and metadata is, at a deep
> level, bogus; totally in the eye of the beholder.

To some extent, that is true.

> For this reason,
> it is always good and never bad to make what the author may
> consider metadata available along with what the author
> considers data. Because the author is usually wrong.

If the source system/author is wrong, this whole XML thing is suddenly bogus. It is a matter of how one wants to package the data. I think using #FIXED attributes works pretty well. There are some awfully good HyTime browsers out there.
Ask Fujitsu about the one they have. The CaPH folks and Biezunski might disagree as well.

> >[re PI's:] I'm wondering why they are suddenly a preferred
> >practice when they were formerly a deprecated practice? What is
> >worse, a DTD I send once and might be very small, or PIs I send
> >every time?
>
> Reasonable people may disagree.

We are all reasonable people. That doesn't answer the second question. Why send PIs every time if what I need to know is in a public specification, e.g., a DTD?

> I have no trouble in saying that
> I think that PIs are a useful thing, and a necessary part of real-world
> document processing. Thus, yes (gasp) I disagree with the language
> in the SGML standard deprecating PIs, and I see no reason for us
> to consider ourselves bound by it.

Oh, that part I agree with. We've used PIs quite a bit even before they were cool. So does Arbortext. Only the religious among us don't.

> As for once vs. many, I think that it is in general A Good Thing
> for documents on the web to be self-contained whenever possible.

Sure. And when not possible, it's nice to validate.

> And while I think it is indeed smart to try to avoid retransmitting
> fixed ancillary files (metadata, stylesheets, whatever), I don't
> think that this class of files includes DTDs that often for the
> downstream processing tasks I've seen. - Tim

I like to have a generalized editor that works first time and every time. I hate having to keep fifteen of them for chores that overlap. They are easier to write than parsers, even XML parsers. Anyway, the DTD is easier to explain than the PIs, and I get to build them as I need when I need. A set of XML PIs and a set of ArborText PIs work out to be the same thing: fixed process flags. or are both value-pair lists. I don't see the difference except that for I can always build turn on a free parser and find out what I have without hiring a CS grad to find out for me. When the data is ten years old, that has advantages...
waaaaaay downstream len xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Sat Mar 15 04:49:26 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:35 2004 Subject: XML demos for Developer's Day Message-ID: <199703150447.UAA08810@boethius.eng.sun.com> Here (in no particular order) is the list of demos I have lined up for the implementor's session in the XML track on Developer's Day at the World Wide Web conference. Some of these are tentative, depending on whether the project in question is actually running by Developer's Day (Friday, April 11). Please let me know if I've gotten anything wrong or left anyone out.

Sun Microsystems        XML Web site
ICL                     XML server
ArborText               XML editor
Inso                    XML converter, XML Web server, XML local browser
RivCom                  XML Netscape plug-in
Univ. of Edinburgh      XML tools, DSSSL syntax checker
Open Molecule Fndtn.    XML processor/renderer
Fujitsu Laboratories    XML/DSSSL browser
Kevin Grimes            XML processor
Tim Bray                XML parser
Norbert Mikula          XML parser, DSSSL engine

---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 2550 Garcia Ave., MPK17-101, Mountain View, California 94043 Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML If a man look sharply and attentively, he shall see Fortune; for though she be blind, yet she is not invisible.
-- Francis Bacon ---------------------------------------------------------------------- xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From jjc at jclark.com Sat Mar 15 08:51:39 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <2.2.32.19970315084050.0105f4f0@jclark.com> At 13:10 14/03/97 -0500, Eve L. Maler wrote: >At 10:57 AM 3/11/97 +0700, James Clark wrote: >>At 12:24 10/03/97 EST, lee@sq.com wrote: >>>Of course, it'd have to be >>> >>>for XML, no? >> >>Well, this is something that is applicable to SGML in general not just to >>XML. Since >would rather use simply >browser should probably make the keyword user configurable. > >This is interesting: Should an XML effort determine a PI that should be >usable in general by SGML documents? I wasn't proposing that *XML* define such a PI. All I was just suggesting was that people who have DSSSL engines implement it (preferably making the name of the PI configurable). >I tend to think that the "authority" >that invents/maintains the format of the PI should be identified, and "XML" >sort of fits the bill, similarly to token in a PI functions as a sort of notation. It would be weird for an >XML spec to specify >I've also been beating the drum on the WG list about how our PIs should >have "GIs" as well as "attribute specs," so I'd prefer to see stylesheet att1="val1" att2="val2"... ?>. This way, "so that it will be processed by an XML-aware processor, and the rest >identifies the semantics of the instruction. I disagree. XML requires that all PIs start with a name, and says that this name is normally the name of a declared notation. So I think PIs should look like (Note that the currently-defined XML PI fits this pattern not the one you suggest.) 
The authority should come from the public identifier on the notation declaration for name. Since XML reserves all names beginning with XML-, I would think that an XML-defined PI should look like: James xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Sat Mar 15 18:32:18 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:35 2004 Subject: Indexing of XML documents Message-ID: <4670@ursus.demon.co.uk> In message <9703150224.AA05729@sqrex.sq.com> lee@sq.com writes: > > When we need to resolve a TEI pointer like (id a23) we may have to scan > > the whole document. > > This all depends on who "we" is taken to be. > > A web indexing robot doesn't need to resolve tei pointers at all, > except to identify the remote document -- it then indexes the whole thing. I am guilty of imprecision ( sorry :-) I meant an internal indexing of the document tree, not an index to locate the document. > > > In general we will wish to cache (index) IDs since > > we don't wish to rescan for another search. > I don't follow this. Under what circumstances is searching a document for > an ID much more painful than using a cache? Is this for 100 MByte documents? > (which do exist, by the way, droves. No, like elephants, in herds) Yes - I was thinking of exactly that. Particularly if the document contains thousands of elements (e.g. large chunks of HTML-like material). > > > When validating a document the IDs, GIs and ATTNAMEs all have to be scanned > > since they occur in VC's. > Not sure what a VC is (validatable context??) but yes, they all have to > be validated. VC = 'validity constraint' - see XML-draft 1.4 and abbreviated as this in later places. The point is that (say) in production 52 all IDs have to be scanned for uniqueness. 
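The uniqueness scan described here can double as the index, so that a later pointer such as (id a23) becomes a single lookup rather than a second pass over the tree. What follows is a minimal Python sketch of that idea, not code from any parser discussed on the list. It assumes the ID attribute is literally named "id" (a real processor would take the attribute names from the DTD's ID declarations), and build_id_index is a made-up name for illustration.

```python
import xml.etree.ElementTree as ET


def build_id_index(root, id_attr="id"):
    """Walk the tree once, checking ID uniqueness and recording each ID.

    Assumption: the ID-typed attribute is called `id_attr`; without a DTD
    there is no way to know which attributes are really of type ID.
    """
    index = {}
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if value in index:
            # The validity check and the index are built in the same pass.
            raise ValueError(f"duplicate ID: {value!r}")
        index[value] = elem
    return index


doc = ET.fromstring(
    '<doc><sec id="a22"/><sec id="a23"><p id="p1">text</p></sec></doc>'
)
ids = build_id_index(doc)
print(ids["a23"].tag)  # prints: sec
```

The dictionary costs one entry per ID, which is small next to the cost of rescanning a document with thousands of elements for every search.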
Therefore at this stage it could be useful to hash them so that they could be extracted rapidly if they form part of a later search, rather than going through the whole doc again. It's no big deal - but since I found myself doing it for various searches, it seemed worth thinking about in the API. P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From sgmlsh at CAM.ORG Sat Mar 15 18:41:05 1997 From: sgmlsh at CAM.ORG (Sam Hunting) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <332A2143.7AE3@hiwaay.net> Message-ID: > Because the author is usually wrong. All Cretans are liars? xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Sun Mar 16 02:35:49 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:35 2004 Subject: XML demos for Developer's Day In-Reply-To: <2.2.32.19970316010332.00734140@sover.net> (message from Liora Alschuler on Sat, 15 Mar 1997 20:03:32 -0500) Message-ID: <199703160234.SAA09440@boethius.eng.sun.com> [Liora Alschuler:] | I would like to include this list in my coverage of the xml conf in | terms of what was shown in San Diego and what will be in Santa | Clara. Anyone object to the mention? Jon? I would very much prefer not to see this publicized right now. As I said, the list is tentative. One of the reasons I posted it was to see where everyone is in their planning right now. 
Some of the experimenters won't know until Thursday night whether they will have something to show on Friday. I would hate to see us publicize a list and then have some of the promised demos not occur. I would prefer to just say that there will be a demo session and let the rest be a surprise. Jon xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Ingo.Macherius at tu-clausthal.de Sun Mar 16 03:44:16 1997 From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius) Date: Mon Jun 7 16:57:35 2004 Subject: MBone Message-ID: <199703160343.EAA06818@kneipfix.rz.tu-clausthal.de> This is a bit off topic, sorry. I don't have the opportunity to go to WWW6, but luckily I have an MBone connection and see there are four channels prepared. Unluckily I tested transmissions from the US and found the Germany-USA link insufficient to deliver understandable speech. So my question is whether the sessions are recorded and available for download. This would enable me to watch. Second question: What's the broadcast schedule for the XML sessions? Thanks in advance.
++im -- Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld Mail : Ingo.Macherius@tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/ Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa) xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Sun Mar 16 06:58:22 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:35 2004 Subject: MBone In-Reply-To: <199703160343.EAA06818@kneipfix.rz.tu-clausthal.de> (message from Ingo Macherius on Sun, 16 Mar 1997 04:43:58 +0100 (MET)) Message-ID: <199703160657.WAA16049@boethius.eng.sun.com> | I don't have the opportunity to go to WWW6, but luckily I have MBone | connection and see there are four channels prepared. Unluckily I | tested transmissions from the US and found the Germany-USA link | insufficient to deliver understandable speech. So my question is, | whether the sessions are recorded and avaliable for download. This | would enable me to watch. Second question: What's the broadcast | schedule for the XML sessions ? You will have to ask the conference organizers about this. http://www6conf.slac.stanford.edu/ Jon xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Sun Mar 16 10:07:00 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:35 2004 Subject: MBone Message-ID: <4715@ursus.demon.co.uk> In message <199703160343.EAA06818@kneipfix.rz.tu-clausthal.de> Ingo Macherius writes: > This is a bit off topic, sorry. 
Not to me, :-) Don't worry, I think that it could be useful in the future for xml-dev to consider virtual working of various sorts. I have been extremely impressed by the way that the discussion on the list has gone, but there are clearly areas where a more rapid feedback than e-mail would be useful. Perhaps we are a year or so away, but we shall start to see other methods like MBone being useful for bridging the Atlantic. > I don't have the opportunity to go to WWW6, but luckily I have MBone > connection and see there are four channels prepared. Unluckily I tested > transmissions from the US and found the Germany-USA link insufficient > to deliver understandable speech. So my question is, whether the sessions > are recorded and avaliable for download. This would enable me to watch. > Second question: What's the broadcast schedule for the XML sessions ? > Thanks in advance. If you get information, Ingo, it would be useful to post it here, along with any other necessary information for connection. Any feedback from the meeting would be useful. P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From h.rzepa at ic.ac.uk Sun Mar 16 10:15:15 1997 From: h.rzepa at ic.ac.uk (Rzepa, Henry) Date: Mon Jun 7 16:57:35 2004 Subject: MBone In-Reply-To: <4715@ursus.demon.co.uk> Message-ID: >> I don't have the opportunity to go to WWW6, but luckily I have MBone >> connection and see there are four channels prepared. Unluckily I tested >> transmissions from the US and found the Germany-USA link insufficient >> to deliver understandable speech. 
In my experience, MBone has never proved really useful (I have attended WWW2 and an IETF meeting where it was used, but it was only a token really). Henry Rzepa. +44 171 594 5774 (Office) +44 594 5804 (Fax) xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From elm at arbortext.com Mon Mar 17 19:52:16 1997 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <3.0.16.19970317144739.1dbfb74c@village.doctools.com> At 03:40 PM 3/15/97 +0700, James Clark wrote: >At 13:10 14/03/97 -0500, Eve L. Maler wrote: ... >>This is interesting: Should an XML effort determine a PI that should be >>usable in general by SGML documents? > >I wasn't proposing that *XML* define such a PI. All I was just suggesting >was that people who have DSSSL engines implement it (preferably making the >name of the PI configurable). Oh, I see. >>I tend to think that the "authority" >>that invents/maintains the format of the PI should be identified, and "XML" >>sort of fits the bill, similarly to >token in a PI functions as a sort of notation. It would be weird for an >>XML spec to specify > >>I've also been beating the drum on the WG list about how our PIs should >>have "GIs" as well as "attribute specs," so I'd prefer to see >stylesheet att1="val1" att2="val2"... ?>. This way, ">so that it will be processed by an XML-aware processor, and the rest >>identifies the semantics of the instruction. > >I disagree. XML requires that all PIs start with a name, and says that this >name is normally the name of a declared notation. So I think PIs should >look like > > I'm not sure how your second sentence follows. 
Why not have XML as the notation (that is, XML-handling processors should operate on this PI) and still have a "GI" that indicates the subclass of XML PI? (But see below also.) >(Note that the currently-defined XML PI fits this pattern not the one you >suggest.) The authority should come from the public identifier on the >notation declaration for name. Since XML reserves all names beginning with >XML-, I would think that an XML-defined PI should look like: > > This is a good point. In that case, then the XML PI at the top should start with ", then you can't easily distinguish among the PIs by type. Eve xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Tue Mar 18 00:50:58 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:35 2004 Subject: XML hot from the oven Message-ID: <199703180049.QAA28043@boethius.eng.sun.com> I am pleased to announce that we are now serving XML as an experimental alternative data format from our corporate document server, docs.sun.com. Documents in the SGML repository at docs.sun.com are autochunked and converted on the fly to XML. This simply mirrors the server's primary function of converting SGML on the fly to HTML; the main difference is that the job of converting to XML is currently almost an identity transformation and is therefore much easier. docs.sun.com(sm) is itself experimental and unpublicized, so this is an experiment running on top of an experiment, but we are proud to claim the honor of having the world's first publicly visible XML server. 
While the XML data stream is extremely raw in this first implementation, the document repository is not; docs.sun.com currently provides more than half of the total Solaris 2.5.1 manual set online, and all of it can now be accessed as an XML data stream. Kudos to the SunSoft AnswerBook team for making this service available on top of everything else they are doing to meet our Solaris release schedules.

HOW TO GET IT

The SGML-based AnswerBook2 (ab2) manuals on docs.sun.com are organized into several large categories (alluser, sysadmin, etc.) with a number of books in each category. Thus, the Solaris Advanced User's Guide is referred to in URLs as /ab2/alluser/ADVOSUG. Two forms of XML access are currently supported: TOCs and document chunks. TOCs are accessed via the @xmlToc template, and chunks are accessed via the @xmlChunk template. The @xmlToc template always shows a table of contents down to the chapter level, no matter what level it is invoked at. To see the XML server in action, telnet to docs.sun.com with the command

   telnet docs.sun.com 80

When connected, you can issue one of several kinds of GET command to cause an HTTP transfer. For example:

1. To get a chapter-level TOC of the entire contents of the server:
   get /ab2/@xmlToc http/1.0
2. To get a chapter-level TOC of the manuals in the alluser category:
   get /ab2/alluser/@xmlToc http/1.0
3. To get a chapter-level TOC of the Solaris Advanced User's Guide:
   get /ab2/alluser/ADVOSUG/@xmlToc http/1.0
4. To get a particular chapter from the manual (as listed in the TOC):
   get /ab2/alluser/ADVOSUG/@xmlChunk/113 http/1.0

Note that HTTP GET commands must always be terminated with TWO carriage returns before anything happens. Hint: you will find the output easier to handle if you do all this from within an emacs shell session.

Beyond its primary goal of giving us bragging rights, this service is intended to provide a large-scale test bed for XML experimenters.
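For anyone who would rather script the telnet session than type it, a minimal Python sketch of the same exchange follows. It only shows how the request is framed: the "TWO carriage returns" are the CRLF that ends the request line plus the empty line that ends the (empty) header block. The helper names build_get_request and fetch are invented for this sketch, the method and version are written in upper case as the HTTP/1.0 specification has them (the commands above use lower case), and nothing here assumes that docs.sun.com still answers these paths.

```python
import socket


def build_get_request(path, version="HTTP/1.0"):
    """Return the raw bytes of a bare GET request with no headers.

    The trailing CRLF CRLF is the request-line terminator followed by
    the blank line that tells the server the request is complete.
    """
    return f"GET {path} {version}\r\n\r\n".encode("ascii")


def fetch(host, path, port=80, timeout=10.0):
    """Send the request and read until the server closes the connection.

    Hypothetical helper: HTTP/1.0 servers close the socket after the
    response, so reading to end-of-stream collects the whole reply.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_get_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


request = build_get_request("/ab2/alluser/ADVOSUG/@xmlToc")
print(request)  # prints: b'GET /ab2/alluser/ADVOSUG/@xmlToc HTTP/1.0\r\n\r\n'
```

A call such as fetch("docs.sun.com", "/ab2/alluser/ADVOSUG/@xmlChunk/113") would return status line, headers and body together as bytes; separating them is left out to keep the sketch short.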
At the moment, all we can do is the simple identity transform from the DocBook-tagged source, but in a few days we will have permissions set up to go in and provide multiple alternative treatments in order to explore different kinds of delivery strategies (for example, the generation of SGML Open fragment wrappers vs. full server-side entity resolution, or embedded CSS style attributes vs. associated dsssl-o style sheets). We hope that this service will help to further the evolution of XML by giving all you developers a rich set of alternatives to play with. Have fun! Jon ---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 2550 Garcia Ave., MPK17-101, Mountain View, California 94043 Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML Here's a little game you can all join in with It's very simple and I hope it's new Make your own tags up if you want to Any old tags that you think will do ---------------------------------------------------------------------- xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From lee at sq.com Tue Mar 18 05:16:25 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <9703180516.AA16683@sqrex.sq.com> Jon Bosak wrote: > One possible method suggested by James Clark (thank you, James) is to > adopt the convention used by Jade in the absence of the -d option: > replace the extension of the document entity's URL or file name with > .dsl and fetch that. 
Thus, if a browser fetches > > http://docs.sun.com/foo/bar.html > > then it should also look for > > http://docs.sun.com/foo/bar.dsl > > and apply it to bar.html if found. Note that if you are generating the XML from a CGI script, a Java server plugin (e.g. Solaris 2.6's upcoming server) or otherwise, you probably need to make sure that clients don't try to look for files in the same "directory" as your SGML. E.g. http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dsl does a database query into presumably DynaBase (right, Jon?). In this case, you want a processing instruction (or some other markup) to say that * there is no catalog file http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/catalog * the dtd is not accessible at http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dtd (and _this_ is where it is...) * the style sheet isn't there either (and _this_ is where it is...) We had to do this for SoftQuad Panorama for exactly this reason. For example, John Price-Wilkin served up the Middle English Corpus in SGML using PAT, but couldn't easily cope with Panorama looking for CATALOG or catalog in the middle of a database query. In general, if you find yourself doing probes to see if files exist using http, you've probably made a design error somewhere, as this isn't a good use of http. So allow the processing instructions. Lee xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Tue Mar 18 16:35:09 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <9703180516.AA16683@sqrex.sq.com> (lee@sq.com) Message-ID: <199703181629.IAA28823@boethius.eng.sun.com> [Liam Quin:] | Note that if you are generating the XML from a CGI script, a Java | server plugin (e.g.
Solaris 2.6's upcoming server) or otherwise, | you probably need to make sure that clients don't try to look for | files in the same "directory" as your SGML. | | E.g. http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dsl | does a database query into presumably DynaBase (right, Jon?). No, DynaWeb. But your point is well taken. | So allow the processing instructions. When we start downloading a DSSSL stylesheet from the server, I think that this is probably the method we'll try first. Of all the alternatives, I like James Clark's last suggestion best for initial experimentation: Jon xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From lee at sq.com Tue Mar 18 18:48:30 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <9703181848.AA07514@sqrex.sq.com> > >E.g. http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dsl > >does a database query into presumably DynaBase (right, Jon?). > > NO! This is DynaWeb! Sorry for the error -- I meant DynaWeb. Honest. > >In this case, you want a processing instruction (or some other markup) > >to say that > >* there is no catalog file > > http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/catalog > > You *could* generate a catalog, which would point at the DTD and the > stylesheets. If DynaWeb had been less powerful, or you (or Jon in this case!) less familiar with it, that may not have been an option -- with some other SGML databases I've seen, it'd be quite hard. One way would be to have a shell script front end that special-cases all files called "catalog" and returns a hard-wired catalog file... but even that isn't always easy in this world of automatically-generated CGI programs with special hooks into the servers, so you can't simply unhook them a little. 
So believe me (please!), there will be people, perhaps not using DynaWeb, who can't or won't put a catalog file in there. > >* the dtd is not accessible at > > http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dtd > > (and _this_ is where it is...) > ... > >* the style sheet isn't there either > > (and _this_ is where it is...) > > [...] It would be quite possible to resolve all of the things you > outline above inside the configuration files (easy even). Now do it with Astoria, Documentum, Saros DM, Texel, etc., including handling a server login to fetch the catalog file, a server login to fetch the style sheet, a server login to fetch the DTD, and a bunch of impatient users. Yes, you could say the web front ends could cache recent login connections so they didn't log in again each time, but generally they don't seem to do that. Then deal with systems that can't deal with the DTD inside the database. (if DTDs were in SGML format... but that's another issue) > >In general, if you find yourself doing probes to see if files exist > >using http, you've probably made a design error somewhere, as this > >isn't a good use of http. > > Agreed! Heh! > >So allow the processing instructions. > > Or use catalogs. Well, I'm not saying forbid catalogs, although I can't abide the thought of mandating all that code for XML-compliant applications. I'm suggesting providing an alternative. Our experience with connecting Panorama with a wide range of databases has been that we needed to do this.
Maybe if all the databases had been built by Gavin :-) we'd have been able to stick with Catalogs, and we'd always have known where to look for catalog even with URLs like http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=113&f=7 where d=113 is the document chunk ID, get-doc is the program, 40197 is a PATH_INFO parameter used for versioning, and the URL for CATALOG is http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=491&f=7 and no, I'm not making this up (except I've changed the field names from those used in any one particular currently shipping commercial system). Panorama's default algorithm would look for http://www.xxx.zzz/bin/get-doc/catalog which obviously won't work in this case. So we need to say where to find the CATALOG file so we can find where to find the DTD. Or, we put an explicit URL to the DTD. There's somewhere to do that in SGML, but not for a style sheet or a navspec/table of contents definition file, nor any other ancillary non-SGML files. So we use processing instructions in those cases where it's necessary. Does that make a better case? If people end up saying no, it's clear that all the commercial applications will do this anyway, but each in their own incompatible way. I hereby volunteer us to be amongst the first :-) Lee -- Liam Quin, lee@sq.com | lq-text freely available Unix text retrieval Senior Technical Consultant | FAQs: Metafont fonts, OPEN LOOK UI, OpenWindows SoftQuad Inc.
+1 416 544-9000 | xfonttool (Unix xfontsel in XView) http://www.softquad.com/ | the barefoot programmer xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From lee at sq.com Tue Mar 18 20:24:37 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <9703182024.AA10994@sqrex.sq.com> > Hmm. How much is "all that code"? I got 1443 lines of code for a catalog > parser in C++, including comments. Remember our dirty perl hacker and the graduate student who is supposed to be able to write an XMLparser in a week? That was a big goal initially. > >Our experience with conneting Panorama with a wide range of databases has > >been that we needed to do this. Maybe if all the databases had been > >built by Gavin :-) we'd have been able to stick with Catalogs, and we'd > >always have known where to look for catalog even with URLs like > > http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=113&f=7 > >where d=113 is the document chunk ID, get-doc is the program, 40197 is a > >PATH_INFO parameter used for versioning, and the URL for CATALOG is > > http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=491&f=7 > >and no, I'm not making this up (except I've changed the field names from > >those used in any one particular currently shipping commercial system). > > Hmm. Looks very much like Astoria to me. No, actually. > >Panorama's default algorithm would look for > > http://www.xxx.zzz/bin/get-doc/catalog > >which obviously won't work in this case. > > Depends on how smart get-doc is. Suppose it's written in C and hard-linked into the web server. Suppose it was supplied by a commercial vendor, and changing or replacing it invalidates the support contract for a $500,000 installation... 
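The two name-based conventions under discussion (swap the extension for .dsl, or look for a file called catalog beside the document) both amount to rewriting the last segment of the URL path. The Python sketch below shows that rewriting and why it misfires on a generated URL. It is an illustration consistent with the behaviour described in this thread, not the actual code of Jade or Panorama, and the function names stylesheet_url and sibling_url are invented here.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def stylesheet_url(doc_url, ext=".dsl"):
    """Replace (or append) the extension on the last path segment."""
    parts = urlsplit(doc_url)
    stem, _old_ext = posixpath.splitext(parts.path)
    return urlunsplit((parts.scheme, parts.netloc, stem + ext, "", ""))


def sibling_url(doc_url, name="catalog"):
    """Replace the last path segment with `name`, keeping the 'directory'."""
    parts = urlsplit(doc_url)
    directory = posixpath.dirname(parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, posixpath.join(directory, name), "", "")
    )


# A file-like URL: the derived name is at least plausible.
print(stylesheet_url("http://docs.sun.com/foo/bar.html"))
# prints: http://docs.sun.com/foo/bar.dsl

# A generated URL: everything after get-doc/ is parameters, not a file
# name, so the derived "catalog" URL drops the version, login and chunk ID.
generated = (
    "http://www.xxx.zzz/bin/get-doc/"
    "40197&user=z305&pass=df4ec5c9&d=113&f=7"
)
print(sibling_url(generated))
# prints: http://www.xxx.zzz/bin/get-doc/catalog
```

The second result is a request the server-side program was never written to answer, which is the case for letting the document itself name its catalog, DTD or style sheet.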
> >So we need to say where to find the CATALOG file so we can find where to > >find the DTD. Or, we put an explicit URL to the DTD. There's somewhere > >to do that in SGML, but not for a style sheet or a navspec/table of contents > >definition file, nor any other ancillary non-SGML files. So we use > >processing instructions in those cases where it's necessary. > > If you can do it using PI's, you can do it using catalogs. I don't believe this dogma :-) > The only real points in favor of PI's are: > > 1) It does simplify clients *a little* (no need for catalog parsing, > though resolution is still required). > 2) They're simpler to hack into a server. > > neither of which carries much technical weight. I don't care. If it's the difference between "our integration team can do this" and "our product development or serious programming team could do this" it's the difference between succeed and fail. So allow both, OK? Are you really so set against PIs here? Is there a (non-religious) reason? They could be significant comments if you prefer -- like httpd server side includes/execs... (ugh) Lee xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From paul at arbortext.com Tue Mar 18 21:58:59 1997 From: paul at arbortext.com (Paul Grosso) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <9703182148.AA03271@atiaus.arbortext.com> > From: lee@sq.com > > Remember our dirty perl hacker and the graduate student who is supposed > to be able to write an XML parser in a week? That was a big goal initially. For what it's worth... The desperate perl hacker was someone trying to write a perl script to do some basic data massaging to some marked up XML. We never had as a goal that someone could write an XML parser in perl.
As far as the grad student, I believe we were giving them two weeks to write an XML parser. Finally, let's not die on our own sword here. The main goal is to have XML be widely accepted. A subgoal of that is to make it relatively easy to write an XML parser, but it still has to be worthwhile to write that parser in the first place, or we've lost the war. I'm not saying that catalogs are absolutely required for XML to work, but I do think we need to look at the big picture, not count lines of code, to determine the right answer. paul xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From dseibert at sqwest.bc.ca Tue Mar 18 22:23:54 1997 From: dseibert at sqwest.bc.ca (David Seibert) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <01BC33A7.DE903900@sqruffy.west.sq.com> "Design Principles for XML" actually says that "the holder of a CS bachelor's degree should be able to construct basic processing (parsing, if not validating) machinery in less than a week". Making that two weeks would be pretty significant slippage. More important: if you want XML to be widely accepted, you don't want to enforce complications that aren't necessary for everyone. Catalogs are useful, but they aren't so easy to implement, so a lot of people would prefer PIs as a less complicated alternative. James's suggestion for a PI form, is concise, has all of the necessary information, and is close to the HTML syntax to make the transition easier for HTML authors. I can't improve on that. 
David

From lee at sq.com Tue Mar 18 23:27:41 1997 From: lee at sq.com (lee@sq.com) Date: Mon Jun 7 16:57:35 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <9703182327.AA20037@sqrex.sq.com> > > Remember our dirty perl hacker and the graduate student who is supposed > > to be able to write an XMLparser in a week? That was a big goal initially.
> > The desperate perl hacker was someone trying to write a perl script to > do some basic data massaging to some marked up XML. We never had as a > goal that someone could write an XML parser in perl. I neither said that nor implied it. > As far as the grad student, I believe we were giving them two weeks to > write an XML parser. I think it varied -- the main point was that it wasn't 3 months, I think, and the language has to be straightforward, simple and self-contained enough that the grad student _wants_ to do the parser. > Finally, let's not die on our own sword here. The main goal is to have > XML be widely accepted. A subgoal of that is to make it relatively > easy to write an XML parser, but it still has to be worthwhile to > write that parser in the first place, or we've lost the war. Yes, that's true, I agree. Lee

From nmikula at edu.uni-klu.ac.at Wed Mar 19 07:59:52 1997 From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula) Date: Mon Jun 7 16:57:36 2004 Subject: Associating DSSSL style sheets with documents References: <199703181629.IAA28823@boethius.eng.sun.com> Message-ID: <33301646.76D2@edu.uni-klu.ac.at> Jon Bosak wrote: > When we start downloading a DSSSL stylesheet from the server, I think > that this is probably the method we'll try first. Of all the > alternatives, I like James Clark's last suggestion best for initial > experimentation: > > I think that's ok, but it also creates a pain in my stomach. Does it mean I have to fetch the stylesheet each time for each document instance? The user agent to my DSSSL engine supports caching (with a primitive caching heuristic, I have to admit).
Should I base the lookup on "href=" or could (should) we include a (F)PI so that it reads like : PS: In general, I am still not fond of PIs. But I have to admit that it is more a religious than a practical point of view. -- Best regards, Norbert H. Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Wed Mar 19 08:00:36 1997 From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula) Date: Mon Jun 7 16:57:36 2004 Subject: Associating DSSSL style sheets with documents References: <01BC33A7.DE903900@sqruffy.west.sq.com> Message-ID: <33300FC1.5C7C@edu.uni-klu.ac.at> David Seibert wrote: > More important: if you want XML to be widely accepted, you don't want > to enforce complications that aren't necessary for everyone. Catalogs are > useful, but they aren't so easy to implement, Compared to other problems that I was (am) having, catalogs are *straightforward* to implement. All in all it takes you about three days to implement. > so a lot of people would > prefer PIs as a less complicated alternative. James's suggestion for a PI > form, > > is concise, has all of the necessary information, and is close to the HTML > syntax to make the transition easier for HTML authors. I can't improve on > that. Catalogs are very important concepts for other things as well. If somebody doesn't want to use catalogs, he doesn't have to. Allowing for catalogs doesn't really complicate the specs of XML and doesn't make it more difficult to learn it. 
> As far as the grad student, I believe we were giving them two weeks to > write an XML parser. :-) Assuming that you know the tools and the programming language that you are using, two or rather three weeks is a fair estimation for a non-validating XML processor with no support for catalogs and public identifiers. Yet another requirement is that we get a revision of spec with all the missing productions (mostly S) included and some of the productions fixed and/or clearified. > Finally, let's not die on our own sword here. The main goal is to have > XML be widely accepted. A subgoal of that is to make it relatively > easy to write an XML parser, but it still has to be worthwhile to > write that parser in the first place, or we've lost the war. I'm > not saying that catalogs are absolutely required for XML to work, > but I do think we need to look at the big picture, not count lines > of code, to determine the right answer. * Strongly Agree * -- Best regards, Norbert H. Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From dseibert at sqwest.bc.ca Wed Mar 19 17:24:34 1997 From: dseibert at sqwest.bc.ca (David Seibert) Date: Mon Jun 7 16:57:36 2004 Subject: Associating DSSSL style sheets with documents Message-ID: <01BC3447.1DC4F260@sqruffy.west.sq.com> We agree. I am certainly in favor of allowing catalogs, as long as PIs are also allowed, I just don't want to force people to use them. 
As far as parsing time, I agree on the estimate of 2-3 weeks for a non-validating parser with no catalog or public identifier handling. However, I think 1 week will not be unreasonable for someone who has learned to use yacc and lex (or some equivalents), _after_ the grammar is specified properly. I am spending about half of my time cleaning up incorrect productions (fortunately I have Peter Sharpe here to clarify what content is supposed to be allowed). When I have time to get it into readable shape (I'm not using standard yacc and lex, so I should normalize it), I'll post my corrected grammar to xml-dev. Regards, David

From dgd at cs.bu.edu Wed Mar 19 18:19:19 1997 From: dgd at cs.bu.edu (David Durand) Date: Mon Jun 7 16:57:36 2004 Subject: CATALOGs and stylesheets Message-ID: There was a proposed CATALOG extension (even implemented, I think) for a DOCUMENT(?) keyword that stated the starting for which the catalog applies. With delegation this presents an alternative mechanism that will not be fooled by mystery URLs. Each document with "attachments" has a catalog that gives its URL and gives its DTD and stylesheet(s).
Delegation is used to make catalog management bearable for files that share public Identifiers, so that common stuff resides in a common catalog. Then the URL that you send is the CATALOG URL, not the document URL -- and you get a whole directory of the stuff you might need, with the potential for any mapping you want from URIs to URLs. I'm not a CATALOG zealot, but it's an approach that bears consideration. -- David _________________________________________ David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com Boston University Computer Science \ Sr. Analyst http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams --------------------------------------------\ http://dynamicDiagrams.com/ MAPA: mapping for the WWW \__________________________ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Wed Mar 19 19:03:22 1997 From: nmikula at edu.uni-klu.ac.at (Norbert Mikula) Date: Mon Jun 7 16:57:36 2004 Subject: CATALOGs and stylesheets In-Reply-To: Message-ID: On Wed, 19 Mar 1997, David Durand wrote: > There was a proposed CATALOG extension (even imnplemented, I think) for a > DOCUMENT(?) keyword that stated the starting for which the catalog applies. NXP supports Catalogs (including the DOCUMENT keyword). Best regards, Norbert H. 
Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From dseibert at sqwest.bc.ca Wed Mar 19 19:07:29 1997 From: dseibert at sqwest.bc.ca (David Seibert) Date: Mon Jun 7 16:57:36 2004 Subject: CATALOGs and stylesheets Message-ID: <01BC3455.B16BD570@sqruffy.west.sq.com> That sounds great, and mixes well with the PI approach. Authors who want a simple approach can just enclose the stylesheet URL in a PI inside the XML document. More sophisticated authors can do the same, and then label the entire document with a single catalog URL, and the separate chunks with different URLs. (I haven't read the CATALOG extension, so I hope that I am interpreting David's remarks correctly). The sophisticated server could give the full catalog entry to sophisticated clients, who would negotiate what to send; they would then presumably parse the PI (they need to do this to deal with simple servers) and realize that they already had the stylesheet. Unsophisticated clients could just get the XML text by default if they requested the base document, and then request the remaining chunks that they wanted. Thus, the presence of the catalog could be made transparent to unsophisticated clients. This separation of the PI and catalog mechanisms (keeping one internal to the XML document and the other external) allows simple clients and servers to peacefully coexist with sophisticated ones, with graceful degradation of functionality. 
It's probably more appropriate as well, since clients sophisticated enough to deal with the catalog should realize that the document is really the whole collection of files, not just the XML file. Is there a compelling argument to make catalogs visible to users? Regards, David Seibert

From bosak at atlantic-83.Eng.Sun.COM Wed Mar 19 20:40:18 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:36 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <33301646.76D2@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at) Message-ID: <199703192038.MAA29744@boethius.eng.sun.com> [Norbert Mikula:] | > | | I think that's ok, but it also creates a pain in | my stomache. Does it it mean I have to fetch the stylessheet | each time for each document instance ? I was assuming (naively?) that the target of the href would be cached just like the target of any other URL. Jon

From bosak at atlantic-83.Eng.Sun.COM Wed Mar 19 21:09:35 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:36 2004 Subject: XML hot from the oven Message-ID: <199703192108.NAA29825@boethius.eng.sun.com> I pointed to the GET method for accessing the XML on docs.sun.com because I was assuming that an experimenter would next want to implement some kind of client to handle the data stream.
If all you want to do is view the output, then you can just feed the equivalent URLs, e.g. http://docs.sun.com/ab2/@xmlToc http://docs.sun.com/ab2/alluser/ADVOSUG/@xmlChunk to any ordinary Web browser and download the results to a file. Jon ---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 2550 Garcia Ave., MPK17-101, Mountain View, California 94043 Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML Here's a little game you can all join in with It's very simple and I hope it's new Make your own tags up if you want to Any old tags that you think will do ---------------------------------------------------------------------- xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Thu Mar 20 07:32:32 1997 From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula) Date: Mon Jun 7 16:57:36 2004 Subject: Associating DSSSL style sheets with documents References: <199703192038.MAA29744@boethius.eng.sun.com> Message-ID: <33315F40.2AD6@edu.uni-klu.ac.at> Jon Bosak wrote: > | > > | > | I think that's ok, but it also creates a pain in > | my stomache. Does it it mean I have to fetch the stylessheet > | each time for each document instance ? > > I was assuming (naively?) that the target of the href would be cached > just like the target of any other URL. I think my suggestion with the (formal) public identifier is more general. Your suggestion would work of course, but if we have two URLs, for instance, http://www.jon.com/foo.dsl and http://www.norbert.com/foo.dsl, they could be the same stylesheet but they don't *have* to be. 
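A minimal sketch of the caching point, in Python and under assumptions that are not in the thread: `StylesheetCache`, `fetch` and the `public:`/`url:` key prefixes are made-up names, and no existing DSSSL engine or user agent is being described. The idea is only that a public identifier, when one is supplied, serves as the cache key, while a bare URL can stand for nothing but itself.

```python
from typing import Callable, Dict, Optional


class StylesheetCache:
    """Cache stylesheets by public identifier when one is given, else by URL."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch  # hypothetical downloader: URL -> stylesheet text
        self._store: Dict[str, str] = {}
        self.fetch_count = 0

    def get(self, href: str, public_id: Optional[str] = None) -> str:
        # Two different URLs may or may not name the same stylesheet, so a URL
        # is only a key for itself; a public identifier names the stylesheet.
        key = "public:" + public_id if public_id else "url:" + href
        if key not in self._store:
            self._store[key] = self._fetch(href)
            self.fetch_count += 1
        return self._store[key]


if __name__ == "__main__":
    cache = StylesheetCache(lambda url: "(stylesheet from %s)" % url)
    fpi = "-//NHM//FOO STYLE//EN"
    cache.get("http://www.jon.com/foo.dsl", fpi)
    cache.get("http://www.norbert.com/foo.dsl", fpi)  # same identifier: no second fetch
    cache.get("http://www.norbert.com/foo.dsl")       # URL only: fetched separately
    print(cache.fetch_count)
```

The sketch trusts whoever assigned the public identifier; if two different stylesheets were published under one identifier, the cache would return the wrong one.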
Also extracting the stylesheet name as such wouldn't be the best solution (foo.dsl) as there also might be ambiguities. Your foo.dsl is not necessarily my foo.dsl. However, if I cache "-//NHM//FOO STYLE//EN", then especially with formal public idents, I (normally) wouldn't have these problems. Right ? -- Best regards, Norbert H. Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Thu Mar 20 16:57:10 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:36 2004 Subject: Associating DSSSL style sheets with documents In-Reply-To: <33315F40.2AD6@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at) Message-ID: <199703201655.IAA12043@boethius.eng.sun.com> [Norbert Mikula:] | > | > | > | | > | I think that's ok, but it also creates a pain in | > | my stomache. Does it it mean I have to fetch the stylessheet | > | each time for each document instance ? | > | > I was assuming (naively?) that the target of the href would be cached | > just like the target of any other URL. | | I think my suggestion with the (formal) public identifier is | more general. Your suggestion would work of course, but | if we have two URLs, for instance, http://www.jon.com/foo.dsl and | http://www.norbert.com/foo.dsl, they could be the same | stylesheet but they don't *have* to be. Remember, all I was asking for in the first place was input into how some of us could start doing this on an experimental basis. 
It isn't up to this group to develop a standard solution; that's the job of the W3C SGML working group. I think that the form above will work for an initial experiment, and unless someone sees a basic problem with it, that's what I'm going to try. Jon

From ht at cogsci.ed.ac.uk Fri Mar 21 12:01:00 1997 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:57:36 2004 Subject: Two more points for cleanup in existing draft Message-ID: <3711.199703211200@grogan.cogsci.ed.ac.uk>

1a) Shouldn't the two occurrences of '<' in production 16 (the definition of QuotedCData) be replaced with '&', and if not, why not?

1b) Shouldn't production 15 (the definition of Literal) prohibit '&' and '%' as well as the relevant quote character, for consistency with [16]?

2) 4.3, the discussion of entity treatment, is somewhat unsatisfactory. '[P]arsed character data' is misleading, since by the syntax PCData cannot contain references! If it means 'content and QuotedCData' (which are the places entity references are allowed), it should say so. Also, parameter entity processing is not discussed at all. 4.3.6 also needs careful attention, since as it stands it doesn't give enough weight to the consequences of 2.1, and might lead the naive to suppose that ". . .three companies: L&M; B&W; Imperial Tobacco" is invalid, presuming M and W are not themselves defined as entities. Indeed taken literally 4.3.6 might lead one to suppose that ANY use of & is illegal, since PCData may not contain &, and 4.3.6 says "processing this replacement data (which may contain both text and markup) . . ." This needs to be clarified, in my view.
Here's a candidate redraft of the relevant bits: -------------- 4.3 XML allows character or general entity references in two places, namely in Element content ([39]) or Quoted character data ([16]). The names of external binary entities may also appear as/in the value of an ENTITY or ENTITIES attribute. On encountering one of these references, an XML processor shall: . . . 2. For both character and entity references, the processor must not pass the reference itself to the application. 3. For character references, the processor must pass the indicated ISO 10646 bit pattern to the application in place of the reference. . . . 6. For an internal (text) entity, the processor should process the defined content of the reference on the same basis (i.e. as content or QuotedCData) that licensed the reference in the first place, with due regard to section 2.1 above, and pass the result to the application in place of the reference, EXCEPT that the content of references processed as QuotedCData MAY include single or double quotes ad lib., or may consist of a single '&' character. Similarly, the content of references processed as 'content' MAY consist of either a single '<' character or a single '&' character. . . . If the processor includes an external text entity under clauses (7) or (8) above, the results shall be as for internal (text) entities as defined in (6). . . . XML allows parameter entity references in three places, namely in literals ([15]), the internal declaration subset ([33]) or the key of a conditional section ([58]). Processing in this case is parallel to that for internal (text) entities as defined in clause (6) above, with the obvious extension to allow content consisting of a single '%' character. --------------- Note the use of the label 'content' for production [39] is extremely infelicitous. 
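One possible reading of clauses 2, 3 and 6 of the redraft, sketched in Python. Everything named here (`expand`, the `entities` dictionary, the error messages) is an assumption made for illustration, and the sketch covers only decimal character references and internal general entities; it is not taken from NXP, Lark or the draft itself.

```python
import re
from typing import Dict, Tuple

# Matches "&name;" (general entity) or "&#123;" (decimal character reference).
_REFERENCE = re.compile(r"&(#[0-9]+|[A-Za-z_][A-Za-z0-9._-]*);")


def expand(text: str, entities: Dict[str, str], _open: Tuple[str, ...] = ()) -> str:
    """Return text with character and internal entity references replaced."""

    def replace(match: "re.Match[str]") -> str:
        ref = match.group(1)
        if ref.startswith("#"):
            # Character reference: hand over the character itself; it is not
            # scanned again for further references.
            return chr(int(ref[1:]))
        if ref in _open:
            raise ValueError("entity %r refers to itself" % ref)
        if ref not in entities:
            # Sketch choice: an undeclared name is treated as an error.
            raise ValueError("entity %r is not declared" % ref)
        # The replacement text is processed on the same basis as the text that
        # contained the reference, so nested references are expanded too.
        return expand(entities[ref], entities, _open + (ref,))

    return _REFERENCE.sub(replace, text)
```

With `entities = {"name": "Imperial Tobacco", "co": "&name; Ltd"}`, `expand("by &co;&#33;", entities)` gives `by Imperial Tobacco Ltd!`: the reference never reaches the application, and `&#33;` is handed over as `!`.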
The bit about parameter entity references is important, as it makes clear that the following is valid XML (as it is SGML): '> %yy; ]> a &g; b [nsgml says: (FOO -a f b )FOO C ] Hope this helps. ht xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From nmikula at edu.uni-klu.ac.at Fri Mar 21 14:44:08 1997 From: nmikula at edu.uni-klu.ac.at (Norbert Mikula) Date: Mon Jun 7 16:57:36 2004 Subject: Two more points for cleanup in existing draft In-Reply-To: <3711.199703211200@grogan.cogsci.ed.ac.uk> Message-ID: > The bit about parameter entity references is important, as it makes > clear that the following is valid XML (as it is SGML): > > > > '> > %yy; > ]> > a &g; b > > [nsgml says: > (FOO > -a f b > )FOO > C > ] I ran it with NXP. I had to make a few changes : 1.) ---> (The ERB hasn't decided yet on this subject, or has it ?) 2.) I had to change the position of yy and zz (I wasn't thinking about this problem of refering to an entity that was not yet declared. Now I need to check carefully when NXP resolves (is supposed to resolve) entity references) 3.) --> (Note the semicolon after zz !) After these changes I got the same results. FYI: I tested with latest release. It has not been published yet. Best regards, Norbert H. 
Mikula ===================================================== = SGML, DSSSL, Intra- & Internet, AI, Java ===================================================== = mailto:nmikula@edu.uni-klu.ac.at = http://www.edu.uni-klu.ac.at/~nmikula ===================================================== xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Fri Mar 21 23:06:03 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:36 2004 Subject: Uncertainties in implementing WD-xml.html Message-ID: <4909@ursus.demon.co.uk> In message <3332D971.68F5@utila.ifi.uni-klu.ac.at> "Norbert H. MIKULA" writes: > As far as I can remember the ERB has initially decided > to change the syntax for comments to > > <--* ..... *--> > > (posted to the XML-WG : Wed, 15 Jan) > > But the torture.xml file of cmsmcq uses > > also during the discussion about the appropriate > regexp people used both alternatives. > > What is the current state of things ? I share Norbert's concern about uncertainties in the XML draft and feel that a number of us are 'stalled' at present due to one or more uncertainties in the spec. (It may be that these are simple misconceptions, but they need tidying up.). We agree that the mythical grad student can hack a parser in the mythical two weeks, but only if they have a clear spec to write to. [My own position is that I want to extend JUMBO to read any WF XML file and intend to do this on top of another parser, and I'd like to do this before WWW6 - otherwise it can't be said to be an 'XML browser/editor'.] My understanding is that the productions (1-77) are consistent and can be used as the basis of a yacc-like approach (as NXP does, using JACC). 
So the first question is [see Norbert's query]:

(a) are we agreed that (1-77) in WD-xml-961114 are the current version and that none are under revision at present? (Until Norbert's question I had assumed that [21] (Comments) was correct).

(b) some parsing operations (e.g. entity replacement) are not described in the BNF and are sufficiently complex or insufficiently documented to give serious problems in implementation. It would be valuable for these to be listed and the operations clearly defined (e.g. are comments processed before entity replacement? are nested entities allowed? etc.)

(c) some ancillary constructs (e.g. CATALOG) are widely held to be part of XML (or likely to be part of XML). They are probably not too difficult to implement if certain processes (e.g. resolution of FPIs) are not exhaustively defined.

IMO it is more important to resolve this asap, than other aspects of developing a parser. The worst possible thing to happen at this stage is that developers have sufficient uncertainty in the spec that there are different interpretations.

Against normal practice I have crossposted this to xml-dev. If the ERB feel this is mainly a matter of clarification, then a reply to xml-dev would be adequate, but if (as I fear) some aspects of entity replacement are not universally agreed, then I think they need to be resolved here.
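As an illustration of point (c), a minimal Python sketch of resolving a formal public identifier through a catalog. It assumes only entries of the form PUBLIC "public-id" "system-id" with both fields in double quotes, ignores every other catalog keyword (including delegation), and lets the first entry for an identifier win; `parse_catalog`, `resolve` and `prefer_public` are hypothetical names, not part of any parser discussed here.

```python
import re
from typing import Dict, Optional

# Assumed entry shape: PUBLIC "public identifier" "system identifier"
_PUBLIC_ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')


def _normalise(public_id: str) -> str:
    # Collapse runs of white space so that equivalent identifiers compare equal.
    return " ".join(public_id.split())


def parse_catalog(text: str) -> Dict[str, str]:
    """Map public identifiers to system identifiers from PUBLIC entries."""
    mapping: Dict[str, str] = {}
    for public_id, system_id in _PUBLIC_ENTRY.findall(text):
        # Sketch assumption: the first entry for an identifier wins.
        mapping.setdefault(_normalise(public_id), system_id)
    return mapping


def resolve(public_id: Optional[str], system_id: Optional[str],
            catalog: Dict[str, str], prefer_public: bool = True) -> Optional[str]:
    """Pick the location to fetch; prefer_public is a policy knob of this sketch."""
    mapped = catalog.get(_normalise(public_id)) if public_id else None
    if prefer_public:
        return mapped or system_id
    return system_id or mapped
```

Whether the catalog mapping or an explicit system identifier should take priority is exactly the kind of detail that would need to be pinned down, which is why it is left as a parameter here.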
P -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Fri Mar 21 23:53:47 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:36 2004 Subject: XML hot from the oven Message-ID: <4912@ursus.demon.co.uk> In message <199703192108.NAA29825@boethius.eng.sun.com> bosak@atlantic-83.Eng.Sun.COM (Jon Bosak) writes: > I pointed to the GET method for accessing the XML on docs.sun.com > because I was assuming that an experimenter would next want to > implement some kind of client to handle the data stream. If all you > want to do is view the output, then you can just feed the equivalent > URLs, e.g. > > http://docs.sun.com/ab2/@xmlToc > http://docs.sun.com/ab2/alluser/ADVOSUG/@xmlChunk > > to any ordinary Web browser and download the results to a file. Or you could use an XML tool as a helper application for a browser. This could be done in a .mailcap file or by configuring the browser. For JUMBO the .mailcap file looks like text/xml; java pmr.chemime.ChemTree %s chemical/x-cml and this should be able to deal with Jon's Shakespeare. (Unfortunately JUMBO doesn't deal with all XML constructs yet). P. Note, of course, that the default view of any XML browser may not be very informative. PLAY comes out very well in JUMBO, but the Solaris docs would need subclasses written for several of the elements. P. 
-- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Sat Mar 22 01:50:56 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:36 2004 Subject: docs.sun: changes to XML output Message-ID: <199703220149.RAA14332@boethius.eng.sun.com> We've been making changes to the TOC output from docs.sun.com. A bug in container closing has been fixed (thanks, Norbert), and we've adopted a convention for properly structuring TOC output so that the TOC always forms a single tree. Thus, for example, http://docs.sun.com/ab2/alluser/ADVOSUG/@xmlToc will give you ... ... ... http://docs.sun.com/ab2/alluser/@xmlToc will give you ... ... ... and http://docs.sun.com/ab2/@xmlToc will give you ... ... ... ... ... ... ... ... ... ... ... ... That's the idea, anyway. By the way, if you start wondering what Sun thinks this server is for, try http://docs.sun.com all by itself. Jon xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Sat Mar 22 10:30:45 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:36 2004 Subject: Lark Message-ID: <4927@ursus.demon.co.uk> Tim, and xml-dev'ers I am using the Jan 3 version of Lark. Is there a later version? If so, the rest of this posting may be ignored at present. 
[If the errors are due to my incompetence, please be gentle :-)]

I have some problems at the start of the document:

If I include the magic incantation:

then doDoctype(Entity e, String rootID, String externalSubsetID) sets rootID to VERSION="1.0". If I comment out the statement, then it performs as I would expect.

If I run Lark on a file with no SYSTEM or PUBLIC in the DOCTYPE it throws an error.

----------------------------------------------------------------

Please don't take anything here as a criticism of Lark... (or NXP, or any other parser that might appear shortly). I think it's very important that by WWW6 NXP and Lark are able to read a wide range of examples without errors. The primary task is that we make sure that we all agree on how to read well-formed files. If someone writes a DTDless 'XML' file and brings it to WWW6 then either:
- it should parse without errors on all parsers
- all parsers should inform of at least one error
(I assume that parser developers are *allowed* to stop at the first error, however inconvenient this might be.)

Ideally we should be able to read torture files uniformly, though I suspect that certain bizarre constructs can still be created which throw most parsers.

P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

From jjc at jclark.com Sat Mar 22 13:09:24 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:37 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970322125754.0076a558@jclark.com>

At 08:29 18/03/97 -0800, Jon Bosak wrote:
>| So allow the processing instructions.
>
>When we start downloading a DSSSL stylesheet from the server, I think
>that this is probably the method we'll try first.
>Of all the alternatives, I like James Clark's last suggestion best for
>initial experimentation:
>
>

I've just implemented this in Jade. For the benefit of others implementing DSSSL or XML here are the details:

- I recognize the PI anywhere in the prolog (so you can put it in an external DTD).
- When there are multiple such PIs, I give the first precedence.
- I allow any of text/dsssl, text/x-dsssl, application/dsssl and application/x-dsssl for the type. The type is case insensitive.
- I recognize

I also plan to implement something to allow catalogs to be used as an alternative to PIs, but I haven't decided what yet.

James

From Peter at ursus.demon.co.uk Sat Mar 22 19:46:38 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
Message-ID: <4939@ursus.demon.co.uk>

JUMBO is a prototype browser/editor/search/transformation tool for XML documents. I have now managed to bolt in both Lark and NXP instead of my parser (which was crude and did not support some of the XML constructs). The bolting-in is still rather crude and concentrates my mind on the need for a simple API at this level. Here are some comments which may be useful.

NXP.
----
NXP has an interface Esis, with functions such as open_tag, close_tag, process_instruction, etc. [I think they would be more properly called start_element??]. JUMBO uses this to build up a Vector representing the ESIS event stream, something like:

    "_START_TAG" "CML" AttributeList "_START_TAG" "MOL" ... "_END_TAG" "MOL"...

JUMBO then builds a tree out of this, adding attributes, etc.

NXP has a class XML which is built by JACC. This contains inter alia an Esis_Stdout object (implements Esis).
There are several objects in XML which are private and therefore not easily accessed - I think they should have accessors, but at present I have subclassed it to PMRXML, which has the requisite accessors.

My test program then creates a PMRXML object, and extracts the event stream which is then passed to JUMBO's existing tree object:

    NXP.PMRXML xml = new PMRXML(NXP.Streams.load_File(file, true));
    pmr.chemime.ChemTree chemTree = new ChemTree(xml.getStreamVector());
    pmr.sgml.GeneralTOC toc = chemTree.createGeneralTOC(3);

Comments: I have still to work out what whitespace NXP creates - there seems to be a lot of content which is simply white. Maybe we have to address COLLAPSE and KEEP at this stage? Also it isn't easy to extract certain info - for example I had to hack XML.java to get the doctype - this isn't a good idea and we need an accessor. I am also still not clear how NXP does (or should) behave with:

and

(the default on the latter is to try to validate, I think, even if validate is set to false. I'd prefer to be able to turn off validation, but I may have missed something).

In general I'd like to be able to treat NXP as a black box, and subclass my Esis object. That could mean passing it as an argument to XML, e.g.:

    public class PMREsis implements Esis {
        public void open_tag(String name) {
            ...
        }
    }

    PMREsis esis = new PMREsis();
    NXP.XML xml = new NXP.XML(esis, NXP.Streams.load_File(file, true))
    pmr.sgml.SGMLTree tree = new pmr.sgml.SGMLTree(xml);

and so on.

NXP is a validating parser, but my DTDs are still struggling with Parameter Entities so I have no experience here.

Lark
----
Lark creates a tree (called Lark) and provides a handler for the user to pick up a variety of events (e.g. doDoctype(), doPI()). The tree contains Elements ('Nodes') which have Attributes and a type (String). Rather than subclassing these elements, I process Lark by iterating through the Elements and creating a JUMBO SGMLTree (this can be delayed if required).
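
[Editorial sketch] The two ideas in this message, a handler object passed to the parser and a tree built from the resulting start/end events, might look roughly like the following in Java. This is only an illustration under stated assumptions: EventHandler, Node and TreeBuilder are hypothetical names, not part of NXP, Lark or JUMBO, and dropping whitespace-only text is one possible policy, not something either parser is known to do.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class EsisTreeSketch {

    /** Hypothetical callback interface, loosely modelled on the Esis idea. */
    interface EventHandler {
        void startElement(String name);
        void characters(String text);
        void endElement(String name);
    }

    /** A plain tree node: element name, text content, ordered children. */
    static final class Node {
        final String name;
        final StringBuilder text = new StringBuilder();
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Builds a tree from the event stream using a stack of open elements. */
    static final class TreeBuilder implements EventHandler {
        private final Deque<Node> open = new ArrayDeque<>();
        private Node root;

        @Override
        public void startElement(String name) {
            Node node = new Node(name);
            if (open.isEmpty()) {
                if (root != null) {
                    throw new IllegalStateException("second root element: " + name);
                }
                root = node;
            } else {
                open.peek().children.add(node);
            }
            open.push(node);
        }

        @Override
        public void characters(String text) {
            // Assumption: text outside the root and whitespace-only text is dropped.
            if (open.isEmpty() || text.trim().isEmpty()) {
                return;
            }
            open.peek().text.append(text);
        }

        @Override
        public void endElement(String name) {
            if (open.isEmpty() || !open.peek().name.equals(name)) {
                throw new IllegalStateException("unexpected end of element: " + name);
            }
            open.pop();
        }

        /** Returns the finished tree; fails if elements are still open. */
        Node root() {
            if (!open.isEmpty()) {
                throw new IllegalStateException("unclosed element: " + open.peek().name);
            }
            return root;
        }
    }

    public static void main(String[] args) {
        // A real parser would drive the handler; here the events are fed by hand.
        TreeBuilder builder = new TreeBuilder();
        builder.startElement("CML");
        builder.startElement("MOL");
        builder.characters("  \n");
        builder.characters("benzene");
        builder.endElement("MOL");
        builder.endElement("CML");

        Node root = builder.root();
        System.out.println(root.name + " has " + root.children.size()
                + " child element(s); the first is " + root.children.get(0).name);
    }
}
```

The point of the design is that the application only implements the handler; the parser can then be treated as a black box, which is what the PMREsis example above asks for.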
The tree seems complete, but I am not sure I have got all the doFOO routines working correctly. I have also had problems with PIs (if the ?> delimiter is used) - these may be mine.

Lark does not validate. However it is easy to interface and is fast.

General
-------
I do not use PIs myself though I shall start to do so. If they are kept in the document tree, is there a convention where they live? (The last opened element? What if they occur in PCDATA?).

I intend to make JUMBO available with both Lark and NXP but it's a bit creaky at present and the interface is a bit slow. I have been told that the larger the number of classes, the slower the program - any comments? Also I don't know whether I should be deliberately garbage-collecting at this stage. Any general thoughts would be welcome.

I intend to bolt a crude search tool into JUMBO along the TEI lines. I shall also see whether I can extract the bits of NXP that do the validating, because then we have a crude validating editor.

Any feedback from the current JUMBos would be appreciated. (I already know it's slow, and the graphics creak in several places :-)

P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

From tbray at textuality.com Sat Mar 22 20:24:47 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:37 2004
Subject: Lark
Message-ID: <3.0.32.19970322122220.009b8890@pop.intergate.bc.ca>

At 10:11 AM 3/22/97 GMT, Peter Murray-Rust wrote:
>Tim, and xml-dev'ers
>I am using the Jan 3 version of Lark. Is there a later version?
>If so, the rest of this posting may be ignored at present.

I will be posting another version of Lark this weekend. It handles CMSMQ's Torture doc, and has dozens of bugs fixed.
Let's have another look Monday.

-Tim

From tbray at textuality.com Sat Mar 22 22:16:05 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:37 2004
Subject: Uncertainties in implementing WD-xml.html
Message-ID: <3.0.32.19970322141420.009c3430@pop.intergate.bc.ca>

>I share Norbert's concern about uncertainties in the XML draft

For the record: us too. Our feeling is that, as Norbert suggests, once WWW6 hits, XML is de facto frozen because there will be more than just our little family doing implementations. As a result, Michael S-McQ and I, and the ERB, are plowing through all the issues like mad; Michael and I spent a half-day together Friday plowing through all the little syntax errors and style problems that people sent in; Murata Makoto is the best proof-reader of all, by the way.

The right thing to do is pretty clear in almost all cases (except bloody horrible parameter entities) and right or wrong, we *must* have a solid spec by March 31... the decisions are almost certainly going to disappoint some of you, but at least we'll have the virtue of the thing being solid.

- Tim

From tbray at textuality.com Mon Mar 24 03:36:28 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:37 2004
Subject: Lark, V0.88
Message-ID: <3.0.32.19970323193500.009c5de0@pop.intergate.bc.ca>

I have just promoted Lark V0.88 to:

    http://www.textuality.com/Lark/

I have received a long-names zip, so please ignore the comments asking for one... some weird bug in my win95 is keeping it from working.
What's new:
- dozens of bug fixes (now passes CMSMQ's Torture.xml, among others, passes lots of Jon's docs.sun.com stuff, except the ones that are not well-formed)
- full default attribute processing, as a result of which, there's another 15k of code
- does new &#xBABE; unicode character refs
- does new comments
- it's twice as big.

Still to do:
- parameter entities (yeccch)
- make it into a Java package
- make it into an applet
- spruce up the unicode
- do entities in attribute values

Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592

From nmikula at edu.uni-klu.ac.at Mon Mar 24 08:21:05 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
References: <4939@ursus.demon.co.uk>
Message-ID: <3336AFF1.1A23@edu.uni-klu.ac.at>

Peter Murray-Rust wrote:
> NXP has an interface Esis, with functions such as open_tag, close_tag,
> process_instruction, etc. [I think they would be more properly called
> start_element??].

You are absolutely right !

> JUMBO uses this to build up a Vector representing the
> ESIS event stream, something like:
> "_START_TAG" "CML" AttributeList "_START_TAG" "MOL" ... "_END_TAG" "MOL"...
> JUMBO then builds a tree out of this, adding attributes, etc.
>
> NXP has a class XML which is built by JACC. This contains inter alia
> an Esis_Stdout object (implements Esis). There are several objects in XML
> which are private and therefore not easily accessed -

Would it be possible to send me a list of those objects ?

> I think they should
> have accessors, but at present I have subclassed it to PMRXML, which has
> the requisite accessors.
>
> My test program then creates a PMRXML object, and extracts the event stream
> which is then passed to JUMBO's existing tree object:
> NXP.PMRXML xml = new PMRXML(NXP.Streams.load_File(file, true));
> pmr.chemime.ChemTree chemTree = new ChemTree(xml.getStreamVector());
> pmr.sgml.GeneralTOC toc = chemTree.createGeneralTOC(3);
>
> Comments: I have still to work out what whitespace NXP creates - there seems
> to be a lot of content which is simply white. Maybe we have to address
> COLLAPSE and KEEP at this stage?

As soon as I know how the standard defines the treatment of whitespace in all those scenarios, for instance w/ DTD w/o DTD, in element content etc. I will implement it that way. (I admit that the whitespace is really annoying, but I didn't want to waste my time with experiments.)

> Also it isn't easy to extract certain
> info - for example I had to hack XML.java to get the doctype - this isn't a good
> idea and we need an accessor.

People didn't seem to be too interested in my idea of an interface for passing along a complete grove. At least I didn't get too much feedback.

> I am also still not clear how NXP does (or should)
> behave with:
>
> and
> (the default on the latter is to try to validate, I think, even if validate
> is set to false. I'd prefer to be able to turn off validation, but I may have
> missed something).

I will check it. Thanks for pointing it out to me !

> In general I'd like to be able to treat NXP as a black box, and subclass
> my Esis object. That could mean passing it as an argument to XML, e.g.:
>
> public class PMREsis implements Esis {
>     public void open_tag(String name) {
>         ...
>     }
> }
>
> PMREsis esis = new PMREsis();
> NXP.XML xml = new NXP.XML(esis, NXP.Streams.load_File(file, true))
> pmr.sgml.SGMLTree tree = new pmr.sgml.SGMLTree(xml);

That's the basic idea that I had in mind. We really must continue with working on our unified interface for XML/Java based applications.

--
Best regards,
Norbert H. Mikula

=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================

From Peter at ursus.demon.co.uk Mon Mar 24 09:51:08 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
Message-ID: <4975@ursus.demon.co.uk>

In message <3336AFF1.1A23@edu.uni-klu.ac.at> "Norbert H. Mikula" writes:
> Peter Murray-Rust wrote:
> > NXP has an interface Esis, with functions such as open_tag, close_tag,
> > process_instruction, etc. [I think they would be more properly called
> > start_element??].
>
> You are absolutely right !

I learnt the importance of precise terminology when Erik Naggum was a regular contributor to c.t.s. :-) :-) He used to point out gently but firmly any lapse in terminology. The problem with SGML is that its terminology is sufficiently different from other disciplines that people make guesses and also don't realise the distinctions matter. (I have been very guilty in this respect). However there are a number of areas where the distinction is subtle - I still don't know if there is a difference between 'GI' and 'Element type', for example.

There is no doubt that adherence to the agreed terminology is a key aspect of the API.

> > JUMBO uses this to build up a Vector representing the
> > ESIS event stream, something like:
> > "_START_TAG" "CML" AttributeList "_START_TAG" "MOL" ... "_END_TAG" "MOL"...
> > JUMBO then builds a tree out of this, adding attributes, etc.
> >
> > NXP has a class XML which is built by JACC.
> > This contains inter alia
> > an Esis_Stdout object (implements Esis). There are several objects in XML
> > which are private and therefore not easily accessed -
>
> Would it be possible to send me a list of those objects ?

/** start of NXP/PMR list */
// from NXP with PMR comments
package NXP;
//...
public class XML implements XMLConstants {
  // PMR - I guess most of these would be valuable.
  // note that unless they are 'protected' they can't be accessed
  // by a subclass from another package
  // '//?' means that I don't know what they are for yet (I haven't spent
  // time looking :-)
  // '//+' means I need them
  // '//+?' means I think I might need them :-)

  //+?
  XMLCatalogMain catalog = null;
  //?
  boolean start = true;
  //?
  int state_counter = 0;
  //+
  static protected boolean validate = false;
  //+
  static protected boolean talkative = false;
  //?
  final static int NO_SWITCH = -1;
  final static protected int ALL = 0;
  final static protected int INTERNAL = 1;
  final static protected int NONE = 2;
  static protected int rmd = ALL;
  //+
  final static protected Esis_Stdout esis = new Esis_Stdout();
  //+?
  final static protected Hashtable element_hash = new Hashtable(30);
  //+?
  final static protected Hashtable open_element_hash = new Hashtable();
  //+?
  final static protected Hashtable notation_hash = new Hashtable(5);
  //+?
  final static protected Hashtable id_hash = new Hashtable(100);
  //+?
  final static protected Hashtable idref_hash = new Hashtable(100);
  //?
  static protected Element open_el = null;
  //?
  static protected Vector att_val = new Vector();
  //?
  final static protected Hashtable found_attributes = new Hashtable();
  //?
  final static protected Hashtable gen_entity_hash = new Hashtable(10);
  //?
  final static protected Hashtable par_entity_hash = new Hashtable(10);
  //?
  final static protected Stack lexer_stack = new Stack();
  //?
  final static protected Stack openel_stack = new Stack();
  //?
  static protected String stop_external = null;
  //?
  final static int GENERAL = 0;
  final static int PARAMETER = 1;
  final static Element NULL_ELEMENT = new Element();
  //+
  static String base_url;
  //+
  static String base_path;
  //+
  static boolean base = true;
  //+
  final static int URL_INPS = 0;
  //+
  final static int FILE_INPS = 1;
  //+
  static int input_stream;
  //?
  final static Object DUMMY = new Object();

  //+ (I had to add this to the XML code :-(
  protected String pmrDoctype;

  final void popTokenManager() {
    XMLTokenManager tok_man = (XMLTokenManager) lexer_stack.pop();
    ReInit(tok_man);
  }

  //+ (Note that this is NOT accessible to a subclass, and as it is final
  // cannot be overridden)
  final void setCatalog(XMLCatalogMain catalog) {
    this.catalog = catalog;
  }
  ....

// This was my own class PMRXML, which I added to NXP.
package NXP;

import java.io.InputStream;
import java.util.Vector;
import NXP.Catalog.XMLCatalogMain;

public class PMRXML extends XML {

  public PMRXML(InputStream is) {
    super(is);
  }

  public void setTalkative(boolean t) {
    talkative = t;
  }

  public void setValidate(boolean t) {
    validate = t;
  }

  public static int FILE_INPS() {
    return XML.FILE_INPS;
  }

  public static int URL_INPS() {
    return XML.URL_INPS;
  }

  public void setBaseUrl(String u) {
    base_url = u;
  }

  public String getBaseUrl() {
    return base_url;
  }

  public void setBasePath(String p) {
    base_path = p;
  }

  public String getBasePath() {
    return base_path;
  }

  public void setBase(boolean b) {
    base = b;
  }

  public void setInputStream(int is) {
    input_stream = is;
  }

  // the 'junk' was to avoid the same signature as setCatalog above
  // which is 'final'
  public void setCatalog(XMLCatalogMain c, String junk) {
    this.catalog = c;
  }

  public Vector getStreamVector() {
    return esis.vector;
  }

  public String getDoctype() {
    // this was just to get it to run.
    if (pmrDoctype == null) pmrDoctype = "CML";
    return pmrDoctype;
  }
}
/** end of NXP/PMR */

> [...]
> >
> > Comments: I have still to work out what whitespace NXP creates - there seems
> > to be a lot of content which is simply white.
> > Maybe we have to address COLLAPSE and KEEP at this stage?
>
> As soon as I know how the standard defines the treatment of whitespace
> in all those scenarios, for instance w/ DTD w/o DTD, in element content etc.
> I will implement it that way. (I admit that the whitespace is really
> annoying, but I didn't want to waste my time with experiments.)

Agreed. I have (pragmatically) deleted all elements from NXP which consist only of whitespace. (This is because my DTDs are biased to this, since the chance of getting a molecular scientist to know and love the SGML whitespace/RE/RS rules is outwith the 2nd law of thermodynamics.)

> > Also it isn't easy to extract certain
> > info - for example I had to hack XML.java to get the doctype - this isn't a good
> > idea and we need an accessor.
>
> People didn't seem to be too interested in my idea of an interface for
> passing along a complete grove. At least I didn't get too much
> feedback.

(a) some people (e.g. me) didn't know what a complete grove was :-)
(b) I think we were worried about overkill before we have got the plane off the ground.
(c) I am not sure I would recognise a doctype within a complete grove :-) most of the names seemed to have come out of a FORTRAN program (i.e. 6 consonants)

> > I am also still not clear how NXP does (or should)
> > behave with:
> >
> > and
> > (the default on the latter is to try to validate, I think, even if validate
> > is set to false. I'd prefer to be able to turn off validation, but I may have
> > missed something).
>
> I will check it. Thanks for pointing it out to me !

Great.

> > In general I'd like to be able to treat NXP as a black box, and subclass
> > my Esis object. That could mean passing it as an argument to XML, e.g.:
> >
> > public class PMREsis implements Esis {
> >     public void open_tag(String name) {
> >         ...
> >     }
> > }
> >
> > PMREsis esis = new PMREsis();
> > NXP.XML xml = new NXP.XML(esis, NXP.Streams.load_File(file, true))
> > pmr.sgml.SGMLTree tree = new pmr.sgml.SGMLTree(xml);
>
> That's the basic idea that I had in mind. We really must continue with
> working on our unified interface for XML/Java based applications.

Splendid. NXP has behaved fine on my document instances (but they aren't torturing it!) It also seems to be fast - at least *much* faster than my own stuff. Partly that is due to building a tree which I think hammers the memory, so anything that helps at parse time would be useful.

The only thing I can recall that the WG might consider is character entities: NXP announces that &gt; cannot be resolved. My own feeling is that parsers should be at liberty to insert these as a default option (perhaps a commandline switch '-e' to assume that '&gt;' is included).

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

From Ingo.Macherius at tu-clausthal.de Mon Mar 24 11:02:05 1997
From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:57:37 2004
Subject: Writing about XML
Message-ID: <199703241101.MAA21321@florix.rz.tu-clausthal.de>

I am preparing a newspaper article on XML for a well known German computer magazine. The editor mentioned a W3C press release on the topic, which I can't find anywhere. Can someone point me to it/send it ?

Have I missed any pointers that can't be found on www.w3.org, www.sil.org or www.textuality.com ?
Thanks in advance, ++im
--
Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
Mail : Ingo.Macherius@tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)

From sgmlsh at CAM.ORG Mon Mar 24 13:50:05 1997
From: sgmlsh at CAM.ORG (Sam Hunting)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
In-Reply-To: <3336AFF1.1A23@edu.uni-klu.ac.at>
Message-ID:

> People didn't seem to be too interested in my idea of an interface for
> passing along a complete grove. At least I didn't get too much
> feedback.

I'm interested -- isn't it true that a grove is the best way to prove (as opposed to asserting, or wishing) that an XML instance really is an SGML subset? Or to show where it is "impure"? (Certainly superior to parsing the instance with one or another parser and looking at the errors.)

This would be important if a supplier is under a contractual obligation to provide SGML to a buyer, and wishes to provide XML.

From ebaatz at barbaresco.East.Sun.COM Mon Mar 24 14:59:56 1997
From: ebaatz at barbaresco.East.Sun.COM (Eric Baatz - Sun Microsystems Labs BOS)
Date: Mon Jun 7 16:57:37 2004
Subject: Restriction on PI information
Message-ID:

My application of XML is to markup text that is to be spoken by speech synthesizers.
To my naive mind (I'm very new to SGML and XML), a PI seems to be the right construct for passing native information to a speech synthesizer, that is, instructions in its proprietary, already existing, command set.

As I don't have any control over the syntax of the commands I want to pass through, I want a PI to allow the widest latitude in the information it can handle. The syntax in the draft doesn't seem to allow that.

What is the rationale for the data that a PI allows? What mechanisms can be used to make that data as arbitrary as possible without changing the draft?

My take on the PI syntax is that the data needs to avoid looking like the end of a PI. Two different ways of ending a PI (somewhat like the use of double or single quotes for quoted data) would allow a way of getting unpalatable data through (my program would have to generate the appropriate one depending on what my data looked like). Allowing a CDATA section would also seem to allow quoting of otherwise unpalatable data. Clearly, any changes from the draft would complicate the parsing.

Eric Baatz
Sun Microsystems Laboratories
2 Elizabeth Drive, MS UCHL03-207    (508) 442-0257
Chelmsford, MA 01824                fax: (508) 250-5067
USA                                 Internet: eric.baatz@east.sun.com

From Peter at ursus.demon.co.uk Mon Mar 24 16:48:16 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:37 2004
Subject: Lark
Message-ID: <4986@ursus.demon.co.uk>

Tim,

Thanks very much for the latest Lark. I have run it on a medium-sized file (20Kb and a few hundred nodes) and it performs fine. Some very minor comments for distribution:

(a) It would be really useful to have it as a package - all you have to do is add package lark; at the head of each file.
This means that the compiled classes can be located in standard libraries, etc. At present I have

    /myclasslib/pmr/sgml/*.class
    /myclasslib/NXP/*.class
    /myclasslib/NXP/Catalog/*.class

etc. and it would be valuable to have

    /myclasslib/lark/*.class

Secondly it means that it's easier to distribute classes in a robust fashion. If there is a clear API then developers can subclass rather than hack the code - this is what I'd like to aim towards myself, so I'm happy to treat lark and NXP as black boxen.

So, at the least, this could be done for Lark and Namer. The problems in packages come when:
- there is some internal that people want to access. This results from an insufficiently developed API
- there is some complex dependency between classes. If you have
      A importing B
      B importing A
  then something is probably wrong. (It's also difficult to compile unless you do them simultaneously.)

I have about 10 packages in JUMBO, which took some sorting out. I believe that they have to be arranged as a DAG - I'm sure there are years of theory about this. Wherever I had trouble forcing them into a DAG it revealed itself as a design fault :-)

(b) It still doesn't like the valid construction (prod. [32]) - it requires ExternalID. [Unfortunately if I create a file like

and give it to NXP, NXP insists on *validating it* :-)]

-------------------

The unpacked files have ^M at record ends (this isn't a problem for me) and some are missing an EOL and EOF. Again not a problem. Also it might be helpful if the files were packaged under a directory such as V088/ (giving V088/Lark.class) so that when unpacked there was no confusion between versions.

P.
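
[Editorial sketch] The remark that packages should form a DAG can be checked mechanically with a depth-first search over the "imports" relation. The Java below is only an illustration: the class name PackageDagSketch and the dependency lists in main are assumptions made up for the example and are not taken from the real JUMBO, NXP or Lark sources. It uses List.of and Map.of, so it assumes Java 9 or later.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PackageDagSketch {

    private enum State { VISITING, DONE }

    /** Returns true if the package "imports" relation contains no cycle. */
    static boolean isDag(Map<String, List<String>> imports) {
        Map<String, State> state = new HashMap<>();
        for (String pkg : imports.keySet()) {
            if (hasCycleFrom(pkg, imports, state)) {
                return false;
            }
        }
        return true;
    }

    /** Depth-first search; meeting a package that is still VISITING means a cycle. */
    private static boolean hasCycleFrom(String pkg,
                                        Map<String, List<String>> imports,
                                        Map<String, State> state) {
        State seen = state.get(pkg);
        if (seen == State.DONE) {
            return false;
        }
        if (seen == State.VISITING) {
            return true;
        }
        state.put(pkg, State.VISITING);
        for (String dep : imports.getOrDefault(pkg, List.of())) {
            if (hasCycleFrom(dep, imports, state)) {
                return true;
            }
        }
        state.put(pkg, State.DONE);
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical layering: each package imports only packages "below" it.
        Map<String, List<String>> layered = Map.of(
                "pmr.chemime", List.of("pmr.sgml", "NXP"),
                "pmr.sgml", List.of("lark"),
                "NXP", List.of(),
                "lark", List.of());

        // The problem case from the message: A imports B and B imports A.
        Map<String, List<String>> tangled = Map.of(
                "A", List.of("B"),
                "B", List.of("A"));

        System.out.println("layered is a DAG: " + isDag(layered));
        System.out.println("tangled is a DAG: " + isDag(tangled));
    }
}
```

A check of this kind reports only that a cycle exists; deciding which of the two imports reflects the design fault is still a human judgement.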
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

From lee at sq.com Mon Mar 24 18:25:46 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
Message-ID: <9703241825.AA23913@sqrex.sq.com>

> > Peter Murray-Rust wrote:
> I still don't know if there is a difference between 'GI' and
> 'Element type', for example.

In XML they are the same, as far as I can tell.

The detailed reasoning follows, but you can ignore it if you like....

Lee

*

An element type can be a generic identifier, a name group, a ranked element or a ranked group. [117; p. 406 of the SGML Handbook]

We don't have RANK in XML, so
An element type can be a generic identifier or a name group.

For example,

in SGML defines the content for both boy and girl.

This is (I think) not allowed in XML, so in XML there is no practical difference between a GI and an element type.

See also the definition of a GI:

The idea seems to be that a generic identifier specification is used to give in an instance the type of an element, once the parser has determined that an element is beginning to happen.

The terminology seems so obfuscatory to me that I see no benefit to the distinction for SGML itself, let alone for XML, but maybe that is because I lack a legal background :-)

If you have difficulty with some of the SGML terminology, also bear in mind that (1) people who have been working with SGML for years also have difficulty with it, (2) some of the WG8 people also seem to have difficulty with it, and (3) I do not believe that there is 100% total agreement on what it means even among the original SGML committee, at least not at a technical nuts-and-bolts level.
The only consolation is that HyTime terminology is far, far worse :-) :-)

Lee

From dseibert at sqwest.bc.ca Mon Mar 24 18:54:00 1997
From: dseibert at sqwest.bc.ca (David Seibert)
Date: Mon Jun 7 16:57:37 2004
Subject: Restriction on PI information
Message-ID: <01BC3841.8F32EBC0@sqruffy.west.sq.com>

The simplest alternative is to encode your data (with any encoding that won't produce the character "?"), insert it in a PI, and decode it at the other end. This is probably also the most reliable way to solve this problem. If there were two ways to terminate a PI, what would your application do with data that contained both terminators?

Regards,
David

----------
From: Eric Baatz - Sun Microsystems Labs BOS
Sent: Monday, March 24, 1997 6:55 AM
To: xml-dev@ic.ac.uk
Cc: ebaatz@barbaresco.East.Sun.COM
Subject: Restriction on PI information

My application of XML is to markup text that is to be spoken by speech synthesizers. To my naive mind (I'm very new to SGML and XML), a PI seems to be the right construct for passing native information to a speech synthesizer, that is, instructions in its proprietary, already existing, command set.

As I don't have any control over the syntax of the commands I want to pass through, I want a PI to allow the widest latitude in the information it can handle. The syntax in the draft doesn't seem to allow that.

What is the rationale for the data that a PI allows? What mechanisms can be used to make that data as arbitrary as possible without changing the draft?

My take on the PI syntax is that the data needs to avoid looking like the end of a PI.
Two different ways of ending a PI (somewhat like the use of double or single quotes for quoted data) would allow a way of getting unpalatable data through (my program would have to generate the appropriate one depending on what my data looked like). Allowing a CDATA section would also seem to allow quoting of otherwise unpalatable data. Clearly, any changes from the draft would complicate the parsing.

Eric Baatz                          Sun Microsystems Laboratories
2 Elizabeth Drive, MS UCHL03-207    (508) 442-0257
Chelmsford, MA 01824                fax: (508) 250-5067
USA                                 Internet: eric.baatz@east.sun.com

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From cgirard at ags.com Mon Mar 24 19:16:21 1997
From: cgirard at ags.com (Girard, Craig)
Date: Mon Jun 7 16:57:37 2004
Subject: Beginner
Message-ID:

Does anyone know a good site to find information on XML for a beginner? Preferably something with tutorials.

Craig Girard
Electronic Product Technician
Automated Graphic Systems
800-678-8760 x512
www.ags.com

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From russc at watfac.org Mon Mar 24 19:33:19 1997
From: russc at watfac.org (Russ Chamberlain)
Date: Mon Jun 7 16:57:37 2004
Subject: GI vs.
Element Type (Was: RE: JUMBO) Message-ID: <01BC3860.3A99B8E0@watfac16.watfac.org> Hello XMLers, > lee@sq.com wrote: >> > Peter Murray-Rust wrote: >> I still don't know if there is a difference between 'GI' and >> 'Element type', for example. > >In XML they are the same, as far as I can tell. > >The detailed reasoning follows, but you can ignore it if you like.... > >Lee > >* > >An element type can be a generic identifier, a name group, >a ranked element or a ranked group. [117; p. 406 of the SGML Handbook] > >We don't have RANK in XML, so >An element type can be a generic identifier or a name group. > >For example, > >in SGML defines the content for both boy and girl. > >This is (I think) not allowed in XML, so in XML there is no practical >difference between a GI and an element type. Not quite true. You can achieve the identical effect with the following: Here's the verbatim definitions from my ISO 8879 spec: 4.114 element type: A class of elements having similar characteristics; for example, paragraph, chapter, abstract, footnote, or bibliography. 4.145 generic identifier: A name that identifies the element type of an element. 4.146 GI: generic identifier. So, GI <==> generic identifier <==> element type. Or am I missing something (not so) obvious here? So, are boy and girl of the same element type? The definitions imply (I think) that an element type is identified by a unique GI, and vice versa, so it looks to me that there should be no distinction between the two. Thus, boy and girl are of different element types (and have different GIs). Please correct me if I misunderstand. >See also the definition of a GI: (See above) Perhaps there was some previous distinction between the two that is now lost in time? Is this an example of legacy terminology? >The idea seems to be that a generic identifier specification is used >to give in an instance the type of an element, once the parser has >determined that an element is beginning to happen. 
The terminology
>seems so obfuscatory to me that I see no benefit to the distinction for
>SGML itself, let alone for XML, but maybe that is because I lack a legal
>background :-)

I (me and myself) do hereby instantiate my total concurrence with your most perspicacious and eloquent statement regarding obfuscation in SGML. (Agreed ;-)

Since when do lawyers write programs? Never! ;-) They have lackeys (us) to do that for them, and if the lackeys can't read the spec, what kind of spec is it? XML, from my perspective, is a minor revolt by the lackeys (and their good friends) to get the spec pared down to something "reasonable". I just hope that the discussions about what is "reasonable" don't lead XML into interminable wrangling. Tim Bray's earlier point about the importance of a (perhaps) imperfect, yet SOLID, specification is well taken.

>[. . .Good point about SGML terminology deleted. . .]

- Russ

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Mon Mar 24 20:49:35 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Beginner
Message-ID: <5006@ursus.demon.co.uk>

In message "Girard, Craig" writes:
> Does anyone know a good site to find information on XML for a beginner?
> Preferably something with tutorials.
>

This is an important question and it's something that a number of xml-dev'ers have addressed in some way. Remember that we all learn in different ways, so what I write may not fit your needs.

Firstly it's important for you to assess your needs in the light of what you already know, for example are you:
- a programmer? (reading FAQ/xml-dev is as good as any, I suspect)
- an informatician, with some acquaintance of SGML? Then you need to know what's different (and simpler). The FAQ covers most of it.
- a newcomer to the field of structured information? I can offer some simple tutorials and examples under: http://www.venus.co.uk/omf/cml Although there is a molecular bias, there are several that make general sense. The XML is a simple subset (i.e. no parameter entities, CDATA, marked sections, PIs, etc.). There is also a tutorial on structured documents in general.

You may also find some of the SGML material useful, so long as you simply take the principles. There is very little that can't be done in XML, so things like 'A gentle introduction to SGML' (see Robin Cover's page: http://www.sil.org/sgml for this and other introductory material) might be useful. The gentle Intro isn't XML, but not far off. If you remember that tags must be balanced and attribute values must be quoted, that's half of what you need to know for simple XML.

I would list the following ways of learning :-) For most of these, look at the FAQ for links.
- reading the formal specs/BNF. (Yes, this seems unlikely to most of us, but it's the way that a few people prefer.)
- reading a book. (There aren't any yet :-(
- looking at other people's examples. There are a few referred to from the FAQ.
- running parsers (on examples). I find this very useful.
- trying to develop your own application. You will need the parsers.

The uses of SGML (and therefore XML) are as varied as the uses of C. So I suspect we shall get books like:
'Learn XML in 7 days and run a killer website'
'Financial applications in XML'
'XML for scientists and engineers'
'Building XML applications'

Finally, do feel free to post to this group. We have all been through this process and *I* created a fair amount of bandwidth on comp.text.sgml when I was learning :-). The community is extremely helpful. It also brings *us* benefits, because we realise what things people are likely to find difficult and how to present our programs and examples.

P.
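As a small illustration of "running parsers on examples", here is a minimal sketch that tries out the two rules mentioned above (balanced tags, quoted attribute values). It rests on assumptions not in this message: it uses the XML parser bundled with a present-day JDK, which postdates this thread and implements the final XML 1.0 rules rather than the draft, and the class name `WellFormedCheck` is made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string parses as well-formed XML.
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input as not well-formed.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O trouble, not a property of the document.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<mol id=\"m1\"><atom/></mol>")); // balanced, quoted
        System.out.println(isWellFormed("<mol><atom></mol>"));            // unbalanced tags
        System.out.println(isWellFormed("<mol id=m1/>"));                 // unquoted attribute
    }
}
```

Feeding your own small documents through something like this is a quick way to see which habits carried over from HTML or SGML are no longer accepted.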
-- Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Mon Mar 24 20:49:40 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: GI vs. Element Type (Was: RE: JUMBO)
Message-ID: <5007@ursus.demon.co.uk>

These are extremely valuable contributions. I also had another mail which confirmed that they were subtly different (the GI is the _name_ of an element type, cf. Lewis Carroll).

It would be extremely useful for us to collect the required terminology for XML. If someone does it, I'll put it in XML using ISO 12620 terminology (I have already written the DTD and rendering in JUMBO/CML, see http://www.venus.co.uk/vhg/ for examples, and I simply need the content.) Many of the definitions are already in electronic form from - I asked earlier :-), but the important thing is to know which ones are required for XML. It could be a much smaller subset.

Good terminology helps the creation of programs and documents, and makes it much easier for newcomers. For example, there is a constant confusion between tags, GIs and elements. A pictorial diagram would be very useful here. I think it's very useful if the components of an API (e.g. Lark uses Element, Entity, etc.) are generally agreed to follow the terminology - I am sure that Tim has been careful here.

In message <01BC3860.3A99B8E0@watfac16.watfac.org> Russ Chamberlain writes:
> Hello XMLers,
>
> > lee@sq.com wrote:
> >> > Peter Murray-Rust wrote:
> >> I still don't know if there is a difference between 'GI' and
> >> 'Element type', for example.
> >
> >In XML they are the same, as far as I can tell. [...]
>
> [...]
> Here's the verbatim definitions from my ISO 8879 spec: > > 4.114 element type: A class of elements having similar > characteristics; for example, paragraph, chapter, abstract, > footnote, > or bibliography. H'm. So there can be a hierarchy of element types in an SGML document. > P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Mon Mar 24 21:16:00 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:38 2004 Subject: TEI Pointers Message-ID: <5008@ursus.demon.co.uk> In a TEI pointer is a string of the form: FOO (1 DIV2) (3 DIV4) (5 DIV6) identical to: FOO (1 DIV2) FOO (3 DIV4) FOO (5 DIV6) ? In the pointer: FOO (1 BAR BAZ #IMPLIED) do we simply interpret the absence of a BAZ attribute in a BAR element as a match? Without a DTD there is no information as to whether *could* exist as an attribute. P. 
-- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From tbray at textuality.com Mon Mar 24 22:31:17 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:57:38 2004 Subject: Another way to raid Jon's oven Message-ID: <3.0.32.19970324142932.009c47c0@pop.intergate.bc.ca> Here's another way to get at Jon's Sun data; create a little XML file like so: ---------------------- ]> &SunURL; ----------------------- Then run it through Lark, after doing a lark.processExternalEntities(true); Assuming you've got an Internet connection, Lark will go get it, cheerfully ignoring extensions and mime types and so on; figuring out how to make Lark copy it to output is left as an exercise for the user. - Tim xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Tue Mar 25 09:08:47 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:38 2004 Subject: Simple API Message-ID: <5040@ursus.demon.co.uk> [announced on comp.text.sgml] Henry Thompson has posted an impressive picture of what a grove looks like: http://www.cogsci.ed.ac.uk/~ht/grove.html It describes the grove for a simple document (2 element types, 2 elements) and it's sufficiently complex that only *part* of it is shown. [I make it clear that I'm impressed by this, but that personally it would take too much effort to implement for the benefit I would get. Many other readers of xml-dev will probably find it's exactly what they want]. 
-----

It highlights for me that the spectrum of possible approaches to the API is too large to pick an approach that suits everyone. The grove obviously has enormous power if you take the time to learn it but it is not trivial. Henry's diagram is much more reader-friendly than 10179, but confirms that this isn't just a problem of terminology - it's an extra level of complexity.

My own suggestion is that we should produce a ReallySimple API independently of the grove approach. I'm sure this won't cause a schism - we need something to test out the language, build simple trees for trying out TEI pointers, etc. IMO most of the things that are bugging us at the moment are not conceptual but
- how do we implement this bit of the spec?
- how do we read in both Files and URLs (a Java problem)
- how do we cater for applets and applications
- what structure do we hand over at the end? Can it be subclassed?
- how do we get at the DTD? (from a validating parser).
- how do we treat parameter entities :-)

P.

-- Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From h.rzepa at ic.ac.uk Tue Mar 25 10:29:07 1997
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 16:57:38 2004
Subject: XML list now searchable
Message-ID:

The XML archives;

http://www.lists.ic.ac.uk/hypermail/xml-dev/

have now been indexed using WAIS and are searchable.

Dr Henry Rzepa, Dept. Chemistry, Imperial College, LONDON SW7 2AY; rzepa@ic.ac.uk; Tel (44) 171 594 5774; Fax: (44) 171 594 5804.
URL: http://www.ch.ic.ac.uk/rzepa/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Tue Mar 25 15:57:25 1997 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:57:38 2004 Subject: Simple API In-Reply-To: Peter@ursus.demon.co.uk's message of Tue, 25 Mar 1997 09:52:48 GMT References: <5040@ursus.demon.co.uk> Message-ID: <529.199703251557@grogan.cogsci.ed.ac.uk> Peter Murray-Rust wrote complementing my picture of a grove fragment (thanks!) but suggesting that it demonstrated that the grove concept was too complex to serve as the basis for a minimal XML API. I suspect the complexity is more apparent (i.e. in the graphics) than real, stemming from my pedagogically directed efforts to exemplify nearly everything in a very constrained space. A pretty simple set of structures and access functions would encapsulate almost all of the core property set modules. We are currently moving our existing LT NSL tools (see http://www.ltg.ed.ac.uk/software/nsl/) to support XML, using the existing API, which was developed 'pre-grove', but covers most of the necessary information. Watch this space . . . ht xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From Peter at ursus.demon.co.uk Tue Mar 25 17:15:35 1997 From: Peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:57:38 2004 Subject: Simple API Message-ID: <5054@ursus.demon.co.uk> In message <529.199703251557@grogan.cogsci.ed.ac.uk> "Henry S. Thompson" writes: > Peter Murray-Rust wrote complementing my picture of a grove fragment > (thanks!) 
but suggesting that it demonstrated that the grove concept
> was too complex to serve as the basis for a minimal XML API.

It actually reminds me of mangroves :-)

I am a very geometrical thinker and so I appreciated the picture. I would applaud any other efforts to represent things in diagrammatic form - HenryT did a very useful diagram of 'pointers' for the WG. The diagram lets me feel my way towards the solution (whereas some people are capable of abstract thought). Diagrams like this are useful for the heavy demand we shall get for educational material.

Eliot Kimber has just written quite a lot on groves on c.t.s, which might be helpful.

P.

-- Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Tue Mar 25 17:36:29 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: XML list now searchable
Message-ID: <5058@ursus.demon.co.uk>

In message "Rzepa, Henry" writes:
> The XML archives;
>
> http://www.lists.ic.ac.uk/hypermail/xml-dev/
>
> have now been indexed using WAIS and are searchable.
>

Many thanks, Henry. xml-dev is of great value to the XML community. With new members continuing to subscribe, this will further preserve it as a permanent resource.

P.
-- Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From jjc at jclark.com Tue Mar 25 19:30:04 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:38 2004
Subject: Simple API
Message-ID: <2.2.32.19970325115530.00bd0a0c@jclark.com>

At 09:52 25/03/97 GMT, Peter Murray-Rust wrote:
>My own suggestion is that we should produce a ReallySimple API independently
>of the grove approach.

It's perfectly possible to have a "ReallySimple API" that is based on groves, for example:

public interface Builder {
    SgmlDocument build(String url);
}

public interface Node {
    public abstract Node getParent();
    public abstract NodeList getChildren();
}

public interface NodeList {
    public abstract Node getItem(int i);
    public abstract int getCount();
}

public interface NamedNodeList {
    public abstract Node getItem(String name);
    public abstract NodeList toNodeList();
}

public interface SgmlDocument extends Node {
    public abstract NodeList getProlog();
    public abstract NodeList getEpilog();
    public abstract Element getDocumentElement();
    public abstract NamedNodeList getElements();
    public abstract NamedNodeList getEntities();
}

public interface Element extends Node {
    public abstract String getId();
    public abstract String getGi();
    public abstract NodeList getContent();
    public abstract NamedNodeList getAttributes();
    public abstract boolean getMustOmitEndTag();
}

public interface DataChar extends Node {
    public abstract char getChar();
}

public interface Pi extends Node {
    public abstract String getSystemData();
}

public interface ExternalData extends Node {
    public abstract Entity getEntity();
}

public interface AttributeAssignment extends Node {
    public abstract NodeList getValue();
    public abstract boolean getImplied();
    public abstract String getName();
}

public interface AttributeValueToken extends Node {
    public abstract String getToken();
    public abstract Element getReferent();
    public abstract Entity getEntity();
    public abstract Notation getNotation();
}

public interface Entity extends Node {
    public abstract String getName();
    public abstract ExternalId getExternalId();
    public abstract String getText();
    public abstract Notation getNotation();
}

public interface ExternalId extends Node {
    public abstract String getSystemId();
    public abstract String getPublicId();
}

public interface Notation extends Node {
    public abstract String getName();
    public abstract ExternalId getExternalId();
}

Is that really so complicated?

James

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Tue Mar 25 20:19:22 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Simple API
Message-ID: <5067@ursus.demon.co.uk>

In message <2.2.32.19970325115530.00bd0a0c@jclark.com> James Clark writes:

James,
Thanks very much for taking the time to list this out. As you imply, most of the concepts map directly onto 'common' SGML terminology. This is a valuable starting point for people who are developing simple systems.

[... API deleted...]
>
> Is that really so complicated?

Not when it's presented like this :-). The important thing for all of us is to make sure that the terminology between different approaches is as compatible as possible.

P.
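To get a feel for how little is needed to walk such a tree, here is a minimal sketch that exercises only the Node and NodeList interfaces from James's listing. Everything else in it is an assumption made for illustration: `SimpleNode` is a made-up in-memory node (no parser is involved), `GroveWalk` and its two methods are hypothetical names, and the interfaces are repeated without "public" only so that the example fits in one source file.

```java
import java.util.ArrayList;
import java.util.List;

// Two interfaces from the listing above, repeated so the sketch compiles on its own.
interface Node {
    Node getParent();
    NodeList getChildren();
}

interface NodeList {
    Node getItem(int i);
    int getCount();
}

// Hypothetical in-memory node; not part of the API proposed above.
final class SimpleNode implements Node {
    private final List<Node> children = new ArrayList<>();
    private Node parent;

    // Attaches a child and returns this node so calls can be chained.
    SimpleNode add(SimpleNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    public Node getParent() { return parent; }

    public NodeList getChildren() {
        return new NodeList() {
            public Node getItem(int i) { return children.get(i); }
            public int getCount() { return children.size(); }
        };
    }
}

final class GroveWalk {
    // Number of nodes in the subtree rooted at node, using only the interface methods.
    static int size(Node node) {
        int total = 1;
        NodeList kids = node.getChildren();
        for (int i = 0; i < kids.getCount(); i++) {
            total += size(kids.getItem(i));
        }
        return total;
    }

    // Number of getParent() steps from node up to the root.
    static int depth(Node node) {
        int steps = 0;
        for (Node p = node.getParent(); p != null; p = p.getParent()) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        SimpleNode leaf = new SimpleNode();
        SimpleNode root = new SimpleNode()
                .add(new SimpleNode().add(leaf))
                .add(new SimpleNode());
        System.out.println(size(root));   // root, two children, one grandchild
        System.out.println(depth(leaf));  // leaf -> child -> root
    }
}
```

Because the traversal touches nothing but getParent(), getChildren(), getItem() and getCount(), the same two methods would work unchanged over any implementation of the interfaces, however the tree was built.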
-- Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Tue Mar 25 23:03:17 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: TEI Pointers
Message-ID: <5068@ursus.demon.co.uk>

I have implemented a first pass at TEI pointers in JUMBO and would be grateful for any checked examples of the results of applying these. I am not quite sure where the discussions are at on the WG, and have so far managed:

ROOT
[HERE]
ID based on attribute *name*, not type
CHILD
DESCENDANT
ANCESTOR
PREVIOUS
NEXT
PRECEDING
FOLLOWING

All these return Elements. I have left placeholders for SPACE and FOREIGN since it is not clear what they return. I can't remember what the groundswell of opinion was for PATTERN and in any case its syntax is not in the draft. Does it use a regex? If so, what?

(BTW, is there a typo in the draft? A1.1.1.6 (CHILD), after the box for 'element', line 3 refers to 'fourth and fifth' and I would think this was 'fifth and sixth')

P.
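For anyone wanting to compare results, here is a minimal sketch of two of the keywords listed above. It is not JUMBO's code and it rests on a simplified reading that may not match the draft in every detail: CHILD (n GI) is taken to pick the nth child element named GI, and ANCESTOR (n GI) the nth enclosing element named GI, counting outward. The class `PointerSketch` and its methods are made-up names, the tree comes from the DOM classes in a present-day JDK (which postdate this thread), and the sample document is invented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of two pointer steps over a DOM tree.
final class PointerSketch {

    // CHILD (n GI): the nth child element of 'from' named gi, or null if there is none.
    static Element child(Element from, int n, String gi) {
        int seen = 0;
        for (Node c = from.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && c.getNodeName().equals(gi) && ++seen == n) {
                return (Element) c;
            }
        }
        return null;
    }

    // ANCESTOR (n GI): the nth enclosing element named gi, counting outward, or null.
    static Element ancestor(Element from, int n, String gi) {
        int seen = 0;
        for (Node p = from.getParentNode(); p instanceof Element; p = p.getParentNode()) {
            if (p.getNodeName().equals(gi) && ++seen == n) {
                return (Element) p;
            }
        }
        return null;
    }

    // Parses a string and returns its document element (the ROOT of the pointer).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(
                "<DOC><DIV1 n='a'/><DIV1 n='b'><P n='p1'/><P n='p2'/></DIV1></DOC>");
        // ROOT CHILD (2 DIV1) (2 P), read as two successive CHILD steps
        Element p2 = child(child(root, 2, "DIV1"), 2, "P");
        System.out.println(p2.getAttribute("n"));
        // ... ANCESTOR (1 DIV1) from that P
        System.out.println(ancestor(p2, 1, "DIV1").getAttribute("n"));
    }
}
```

Both steps return an Element or null, which matches the remark above that all the implemented keywords return Elements; a step that fails simply yields nothing to continue from.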
-- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From bosak at atlantic-83.Eng.Sun.COM Sat Mar 29 01:23:59 1997 From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 16:57:38 2004 Subject: Dev Day demos Message-ID: <199703290122.RAA01967@boethius.eng.sun.com> Here (in no particular order) is the list of demos I have lined up for the implementor's session in the XML track on Developer's Day at the World Wide Web conference (April 11, 1997) in Santa Clara. Please let me know immediately if anything has occurred to prevent your appearance. I will be sending further details by direct mail this weekend. ArborText XML editor Grif XML editor Inso XML converter, Web server, and local browser Open Molecule Fndtn. XML processor/renderer Sun Microsystems XML Web site ICL XML server Fujitsu Laboratories XML/DSSSL browser Tim Bray XML parser Norbert Mikula XML parser, DSSSL engine RivCom XML Netscape plug-in Univ. of Edinburgh XML tools, DSSSL syntax checker Kevin Grimes XML processor ---------------------------------------------------------------------- Jon Bosak, Online Information Technology Architect, Sun Microsystems ---------------------------------------------------------------------- 2550 Garcia Ave., MPK17-101, Mountain View, California 94043 Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML If a man look sharply and attentively, he shall see Fortune; for though she be blind, yet she is not invisible. 
-- Francis Bacon
----------------------------------------------------------------------

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From ser at javalab.uoregon.edu Sun Mar 30 07:43:59 1997
From: ser at javalab.uoregon.edu (Sean Russell)
Date: Mon Jun 7 16:57:38 2004
Subject: Entity replacement
Message-ID: <199703300546.VAA18376@javalab.uoregon.edu>

I was told this was a hotly debated topic, and I was wondering what the current status was. This is regarding section 4.3 of the XML working draft, 14-Nov-96.

As regards #2, #3, and #6, which claim that internal entities should be processed and replaced by their values by the parser before the data is returned to the application, will this requirement be changed?

--- SER

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Sun Mar 30 13:12:19 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Entity replacement
Message-ID: <5246@ursus.demon.co.uk>

In message <199703300546.VAA18376@javalab.uoregon.edu> Sean Russell writes:
> I was told this was a hotly debated topic, and I was wondering what the
> current status was.

Probably by me :-), so I'll try to answer.

> This is regarding section 4.3 of the XML working draft, 14-Nov-96.
>
> As regards #2, #3, and #6, which claim that internal entities should be
> processed and replaced by their values by the parser before the data is
> returned to the application, will this requirement be changed?

The problem as I see it is not that anything requires change, but rather clarification. The main problem seems to be with parameter entities.
The sort of problem that *I* don't know the answer to is whether a parameter entity in a comment is expanded or whether a comment in a parameter entity is expanded and in which order. Another is that PEs can be nested something like:

">

but I doubt if I have got this right (I've deleted the WG discussion). I also know from experience that *authoring* PEs can be quite tricky (I used this at one stage to mimic directory names in resolving entities and you have to get the quotations just right). It's therefore even more important to get the parsing right :-)

The Editorial Review Board has promised enlightenment in the nearish future. If you are really keen on this, have a play with a full SGML parser such as nsgmls and see how PEs are treated there.

P.

>
> --- SER
>
> xml-dev: A list for W3C XML Developers
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
> To unsubscribe, send to majordomo@ic.ac.uk the following message;
> unsubscribe xml-dev
> List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
>
>

-- Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)

From Peter at ursus.demon.co.uk Sun Mar 30 18:39:31 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: XML-LINK
Message-ID: <5250@ursus.demon.co.uk>

I am trying to think about how to (a) write and (b) process documents that use the XML-LINK syntax. (For those who aren't up-to-date with the WG's discussions, LINK is being discussed at this moment. Some parts of it seem to have coalesced (i.e. not much visible discussion) and I'd like some clarification on some points in the draft - I didn't manage to understand all the discussion.)
I am not seeking to re-open things which have been decided, but rather to know how it should be used. I realise that some of the discussion postdates the doc I am working from: (WD-xml-link-970305) at: http://www.textuality.com/sgml-erb/WD-xml-link.html. I'd be very grateful for comments on the following - I'll refer to sections. These are my understandings - please correct them :-)

2.1 The preferred mechanism is to have multiple attributes for elements which carry the XML-LINK attribute. This is currently illegal in SGML, although this is likely to change and XML is rather hoping this will be soon. So a DTD might have an A element (similar to HTML): and later To get round the illegality it would be allowed (though messy) to combine these into a single ATTLIST. ***If SGML is not revised, the *parser* would have to process multiple attlists***. The documents before parsing would not be valid SGML.

For a well-formed document the link attributes may have to be inserted in the document. No changes are necessary for the parser.

NOTE: If XML-LINK is added from the DTD, all XML-LINK values are identical ("#FIXED"). If they are added within the document, they *could* have different values and the parser would not complain, but this seems to break the spirit of the draft. If ATTLIST A XML-LINK is provided in the DTD, then any attributes in the document must be #FIXED (?), and so are redundant. If they do not agree it's an error (even in a WF document?).

Table 3.2 I have difficulty in understanding this, especially the very similar terms LINK, XML-LINK, XLINK and XML-XLINK. My understanding is this: The table does not (although it appears to) define an XML-XLINK element. My understanding is that 'XML-XLINK' is a generic variable replaceable by 'FOO' or whatever for as many elements as the DTD author requires. So in the above example, 'XML-LINK' would be replaced by 'A'. (I assume the same for XML-LOCATOR, XML-LINK, XLG and XLD).
(The five tables in 3.2, 3.3, 6.1 correspond to the five allowed values of XLINK, which must be #FIXED for each). There is a different number of attributes for each of the five types (given in the tables). If an attribute occurs in more than one of the five it always has the same form apart from XML-LINK. Elements with the attribute XML-LINK="XLINK" have a content model which can only include #PCDATA or 'XML-LOCATOR'. Since 'XML-LOCATOR' is determined by its XML-LINK attribute value and not by its GI a normal SGML parser cannot detect this. [A similar argument holds for elements with XML-LINK="XLG"]. ***The *?parser?* will have to determine whether elements with attribute XML-LINK="XLINK" contain only elements with attribute XML-LINK="LOCATOR" (or #PCDATA). This presumably has to hold for well-formed documents without an internal DTD, but with explicit attribute values. Or does the parser simply look for well formedness and leave this slightly hairy problem to the application/link_processor?*** If no ENTITY is defined for FOO, and appears in the document, what happens? Is the parser or application required to detect this as a reserved attribute ***and fill in all the others in the draft for that XML-LINK type?*** So, assuming this is on the right lines, there are three uses of XML-LINK: (a)LINK by itself (b)XLINK/LOCATOR working together (c)XLG/XLD working together. I presume the syntax looks something like: (a) (Assume element A as above):

This is the Home of the Elephant house and the and similarly for %LOCATOR-attribs in the document:

In the we can find and and some monkeys tomorrow
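The containment question raised above (whether elements with XML-LINK="XLINK" contain only elements with XML-LINK="LOCATOR", or #PCDATA) could be left to the application rather than the parser. What follows is only a rough sketch of such an application-level check, using Python's standard xml.etree.ElementTree. The element names ANIMAL and B, the HREF value and the function name misplaced_children are invented for illustration and do not come from the draft. The second document assumes the parser supplies #FIXED attribute values declared in the internal DTD subset.

```python
import xml.etree.ElementTree as ET

LINK_ATTR = "XML-LINK"  # attribute name as used in the draft under discussion


def misplaced_children(xml_text):
    """Return the tags of child elements found inside XLINK elements
    that do not themselves carry XML-LINK="LOCATOR"."""
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter():
        if elem.get(LINK_ATTR) == "XLINK":
            for child in elem:
                if child.get(LINK_ATTR) != "LOCATOR":
                    bad.append(child.tag)
    return bad


# Well-formed document with explicit attribute values and no DTD.
# ANIMAL, B and nellie.gif are hypothetical names.
EXPLICIT = (
    '<P><ZOO XML-LINK="XLINK">In the zoo we can find '
    '<ANIMAL XML-LINK="LOCATOR" HREF="nellie.gif"/> and '
    '<B>some monkeys</B></ZOO></P>'
)

# Same idea, but the link attributes come from #FIXED declarations
# in the internal DTD subset instead of the document instance.
FROM_DTD = """<!DOCTYPE P [
  <!ATTLIST ZOO XML-LINK CDATA #FIXED "XLINK">
  <!ATTLIST ANIMAL XML-LINK CDATA #FIXED "LOCATOR">
]>
<P><ZOO>In the zoo we can find <ANIMAL HREF="nellie.gif"/></ZOO></P>"""

print(misplaced_children(EXPLICIT))
print(misplaced_children(FROM_DTD))
```

In this sketch the B element in the first document is reported as misplaced, while the second document passes because the declared defaults are visible to the application as ordinary attributes. Whether such a check belongs in the parser or in a link processor is exactly the open question.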

If I am anywhere near right, this means that: The text (#PCDATA) will be displayed along with an image of Nellie and a button (application-defined) to JUMBO and the monkeys. When JUMBO-button is pressed, then an additional window (the Jumbo browser?) is launched. When the monkey button is pressed the current window disappears to be replaced by gibbering. Presumably the application decides whether the Jumbo browser is killed at this stage. [I am not clear what the SHOW/ACTUATE, etc. do for the XML-XLINK container. Presumably the contents could be hidden until a button was pressed? In the example, the whole contents of ZOO would be INCLUDEd in the Paragraph?] (c) is a list of documents and presumably straightforward? It would be useful to have comments and other examples for XML-LINK as it may impact on XML-LANG. For example, a DTD should not be designed with attributes such as ROLE, TITLE, SHOW if it is likely to be used for XML-LINK at a later stage - perhaps this should appear in the draft? P. -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk) From dseibert at sqwest.bc.ca Mon Mar 31 19:52:48 1997 From: dseibert at sqwest.bc.ca (David Seibert) Date: Mon Jun 7 16:57:38 2004 Subject: Entity replacement Message-ID: <01BC3DB9.21A3AC70@sqruffy.west.sq.com> 1) PEs aren't supposed to be recognized inside comments (nothing is except the terminal '*-->'), so they aren't supposed to be expanded. This is also true for entities in cdata. 2) Expansion of comments inside parameter entities shouldn't matter, since comments can go anywhere that PEs can. The exact treatment could probably depend on the application handling the document.
David ---------- From: Peter Murray-Rust Sent: Sunday, March 30, 1997 3:55 AM To: xml-dev@ic.ac.uk Subject: Re: Entity replacement In message <199703300546.VAA18376@javalab.uoregon.edu> Sean Russell writes: > I was told this was a hotly debated topic, and I was wondering what the current status was. Probably by me :-), so I'll try to answer. > This is regarding section 4.3 of the XML working draft, 14-Nov-96. > > As regards #2, #3, and #6, which claim that internal entities should be processed and replaced by their values by the parser before the data is returned to the application, will this requirement be changed? The problem as I see it is not that anything requires change, but rather clarification. The main problem seems to be with parameter entities. The sort of problem that *I* don't know the answer to is whether a parameter entity in a comment is expanded or whether a comment in a parameter entity is expanded and in which order. Another is that PEs can be nested something like: "> but I doubt if I have got this right (I've deleted the WG discussion). I also know from experience that *authoring* PEs can be quite tricky (I used this at one stage to mimic directory names in resolving entities and you have to get the quotations just right). It's therefore even more important to get the parsing right :-) The Editorial Review Board has promised enlightenment in the nearish future. If you are really keen on this have a play with a full SGML parser such as nsgmls and see how PEs are treated there. P.
> > --- SER > > -- Peter Murray-Rust, domestic net connection Virtual School of Molecular Sciences http://www.vsms.nottingham.ac.uk/ xml-dev: A list for W3C XML Developers Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To unsubscribe, send to majordomo@ic.ac.uk the following message; unsubscribe xml-dev List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
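The recognition rules in David's point 1 can be tried out on a present-day parser. The following is a minimal sketch using Python's standard xml.parsers.expat. It only exercises a *general* entity (not a parameter entity) in a comment, in a CDATA section and in ordinary content, and it shows what this one parser does, not what the working draft requires. The entity name "animal", the element names and the helper collect are invented for illustration.

```python
import xml.parsers.expat

# Hypothetical test document: the same reference appears in a comment,
# in ordinary element content, and inside a CDATA section.
DOC = """<!DOCTYPE zoo [
  <!ENTITY animal "elephant">
]>
<zoo><!-- &animal; --><a>&animal;</a><b><![CDATA[&animal;]]></b></zoo>"""


def collect(doc):
    """Parse doc and return (text per element name, list of comment strings)."""
    text = {}
    comments = []
    open_elements = []

    def start(name, attrs):
        open_elements.append(name)

    def end(name):
        open_elements.pop()

    def chars(data):
        # Character data may arrive in several pieces, so accumulate it.
        current = open_elements[-1]
        text[current] = text.get(current, "") + data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.CommentHandler = comments.append
    parser.Parse(doc, True)
    return text, comments


text, comments = collect(DOC)
print(text)
print(comments)
```

With this parser the reference in element a is replaced by its value before the application sees it, while the copies in the comment and in the CDATA section are passed through as literal text.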