From Peter at ursus.demon.co.uk Sat Mar 1 00:14:42 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: Suggested List Protocol
Message-ID: <4086@ursus.demon.co.uk>
XML-DEV has been going for less than a week and it already has ~100
subscribers and about the same number of postings. Some of the
subscribers are very well known in/to the SGML community but I expect
there are some who are new to this whole venture, and here are a few
thoughts that may be helpful:
The list is unmoderated and has no fixed agenda, so that you shouldn't
be afraid of bringing up your own ideas or questions, so long as they are
in some way related to how XML will be implemented. The list has no formal
standing and no way of 'reaching decisions' (though it's possible that
mechanisms might emerge). Any voluntary offers for summarising threads
will, I'm sure, be most valuable (e.g. 'We seem to agree X, but we differ
on Y - so there seems to be a role for software that is limited to X? Is
this realistic?').
The discussions run alongside the WG discussions and there is a considerable
overlap in membership. XML-DEV is _not_ an informal arena for discussing
matters still on the WG agenda. If an issue is aired here ("do I _really_
have to do X and Y to achieve Z?") that might have a bearing on the draft(s)
it won't go unnoticed :-). Discussions are archived so that you can
download and read them every 2 weeks.
SGML is mainly thought of as a document processing and (paper) rendering
tool, but XML has the potential for many completely new applications
(my own is molecular science). Therefore if you think that XML might
help in flying planes, making money, sending digital odors or holography
over the Internet (suggested by MIME) feel free to raise the topic.
Be considerate about the volume of a posting - some of us pay for incoming
mail :-). Quote those parts of previous replies that relate to your message.
If you have large chunks of code, post them on http: or ftp: resources.
Please also post everything in human-readable ASCII as a lot of
people may not be able to manage compressed or other transformed files.
Hopefully we shall move to other character sets (e.g. Unicode) in the future.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
From Peter at ursus.demon.co.uk Sat Mar 1 00:14:52 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: TEI pointers
Message-ID: <4087@ursus.demon.co.uk>
I need a search tool for structured documents and would be
grateful for pointers to existing tools which are free and re-usable. My
target language is Java. I would intend to use the TEI syntax (does it
come in different flavours?).
I would also intend to use a graphically-based query if possible as well
as a commandline. Has this been tried and are there any metaphors which
have proved to be useful? How do most humans currently construct TEI
queries? Do they learn the language and use a command line or do they
get customised queries?
This is sufficiently important for me that I shall need to do it myself
if there is no alternative, but it seems like something that can be developed
as a problem-independent module, so long as the API from the parser or
other tools (e.g. GUIs) is clear. The search needs to have the flexibility
to include FOREIGN, i.e. the ability to include non-XML-based
methods. (In my own case it would be molecular substructure searches, which
are essentially labelled subgraph matching algorithms). It should also
include the SPACE facility, because this is going to be extremely
important in technical documents.
[The WG has suggested that the TEI syntax may be an important part of XML
PhaseII, but I am not sure of the timescale for resolution. My
request would be currently useful for documents prepared for the PhaseI
draft and doesn't prejudge the WG deliberations.]
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Sat Mar 1 01:08:02 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <4098@ursus.demon.co.uk>
The discussion on the API is extremely valuable and exciting and I'm learning
a lot. There is no doubt that there are enough experts to do a first class
job of building an API that will last. However, for some people who may
have joined this list and who really need or want XML it may not be clear how
some of this relates to more practical problems. (It really does!).
A few weeks ago I got assurance from the WG that XML was not only for
rocket_scientists, so if you aren't one here is a place to talk about
the simple aspects. Remember that XML is 'an extremely simple' dialect of
SGML _and can be used as such_. I started working with an XML-like
dialect about 12 months ago, wrote my own parser and postprocessor with
steam technology so it's not _essential_ to have groves, IDL, etc. though
it will certainly make it much easier to develop complex applications. You
may also want to build a prototype to learn what it's about and then
bolt in the more powerful parsing and processing tools later.
The first thing to realise is that XML allows you to create documents that
are well-formed, but need not be validated. That may be fine for
many people - especially during a development stage. If you don't use
EMPTY elements (e.g.
in HTML) so that all your start- and end-tags are
balanced and nested correctly, and if your attributes are quoted, then
that is all you need for a WF document. Example:
This is a string
So, are there simple tools for creating well-formed documents? Can HTML
editors be extended? (Since I create a lot of my XML documents by hand,
I'd be interested to have shortcuts).
------------------------------------------------------------------------
Most documents will then need some sort of processing. There are two
main strategies:
- event stream mode.
- parse tree
The event stream mode is best illustrated by HTML and the font or phrase
tags. <I> switches on italics and </I> switches it off. <B> is bold_on
and </B> is bold_off. If your XML document was arranged as above it would
be quite easy to write code which read each line, and took appropriate
action (Foo_on, Foo_off).
I've been writing something this morning to do exactly that for HTML. I use
Java, but there's nothing fundamental about what language you use (a year
ago I used tcl/tk with CoST). So, for example, I take a _stream_ of HTML,
write it to the screen, and every time I encounter a flag (tag) I take
appropriate action. If the document is well-formed, the tags should nest
so that the interpreting/parsing process must throw an error if an end-tag
is encountered unexpectedly.
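
A minimal sketch of the event-stream idea above, written in Python purely for brevity (the message itself uses Java). The names `events` and `check_nesting` are illustrative assumptions, not part of any parser mentioned in this thread, and the regular expression ignores attributes, comments and entities.

```python
import re

# Matches a start-tag such as <B> or an end-tag such as </B>.
# Attributes are skipped over; this is a sketch, not a real tokenizer.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")

def events(text):
    """Yield ('start', name), ('end', name) and ('text', data) in document order."""
    pos = 0
    for match in TAG.finditer(text):
        if match.start() > pos:
            yield ("text", text[pos:match.start()])
        kind = "end" if match.group(1) else "start"
        yield (kind, match.group(2).upper())
        pos = match.end()
    if pos < len(text):
        yield ("text", text[pos:])

def check_nesting(text):
    """Consume the stream, raising ValueError on an unexpected end-tag."""
    open_tags = []   # stack of elements currently "switched on"
    seen = []
    for kind, value in events(text):
        if kind == "start":
            open_tags.append(value)              # Foo_on
        elif kind == "end":
            if not open_tags or open_tags[-1] != value:
                raise ValueError("unexpected end-tag </%s>" % value)
            open_tags.pop()                      # Foo_off
        seen.append((kind, value))
    if open_tags:
        raise ValueError("unclosed element <%s>" % open_tags[-1])
    return seen
```

An application would replace the `seen.append` line with its own action for each event; the stack is what lets it reject input such as `<B><I>x</B></I>`.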
The tree model is best illustrated by the containers in HTML:
<HTML>
  <HEAD><TITLE>This is a title</TITLE></HEAD>
  <BODY><H1>That's all folks</H1></BODY>
</HTML>
If you look at what elements contain what others, you'll see that HTML
can be thought of as a root, with two branches to its children
(HEAD and BODY). HEAD has one child (TITLE) and BODY has one child (H1).
Both TITLE and H1 contain strings (#PCDATA) which can be regarded as children.
Looking at structured documents as trees is extremely powerful for searching
and other manipulations. IMO HTML requires both approaches and in processing
it you have to switch between them.
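
The tree view can be sketched with Python's standard `xml.etree.ElementTree` module, assuming the HTML is written as a well-formed document. The `outline` helper is a made-up name for this illustration, and it deliberately ignores text that follows a child element (mixed content), which is exactly where the event-stream view becomes more natural.

```python
import xml.etree.ElementTree as ET

# A well-formed rendering of the HTML/HEAD/TITLE/BODY/H1 example.
DOC = ("<HTML><HEAD><TITLE>This is a title</TITLE></HEAD>"
       "<BODY><H1>That's all folks</H1></BODY></HTML>")

def outline(element, depth=0):
    """Return one indented line per element, plus one per string child."""
    lines = ["  " * depth + element.tag]
    if element.text and element.text.strip():
        lines.append("  " * (depth + 1) + "#PCDATA: " + element.text.strip())
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(DOC)
print("\n".join(outline(root)))
```

Because every element knows its children, questions such as "which element contains the title?" become simple traversals rather than bookkeeping over a stream.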
--------------------------------------------------------------------------
In building a generic parser (such as Lark and NXP) the authors have to cover
the whole range of possibilities both in the input document and the ways
that it might be processed. There is, however, no need for any particular
application to use the full power of XML and this might allow you to develop
a simpler parser and/or editor if you want, especially if you
need to write it for a specific platform, etc. Also, if you just 'want to
get started' there are enough tools to get a feel for what XML is about.
P.
XML is committed to making things simple!
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Ingo.Macherius at tu-clausthal.de Sat Mar 1 02:16:59 1997
From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <4098@ursus.demon.co.uk> from "Peter Murray-Rust" at Mar 1, 97 00:28:17 am
Message-ID: <199703010216.DAA00533@florix.rz.tu-clausthal.de>
> Most documents will then need some sort of processing. There are two
> main strategies:
> - event stream mode.
> - parse tree
I have made up a perl5 module which models a very simple forest-like structure,
that holds Perl5 objects. The objects are created by reading nsgmls' ESIS
and putting anything between certain named tags into a hash, which
basically is the object content. The objects can be inserted as a root or into
another object, which yields a forest-like structure.
The tree-relations between objects are stored outside in a libdbm database,
one per tree. It holds three tables,
- id -> hashed data
- id -> id of father object, or NULL
- id -> ids of all sons
Obviously any object must have a method giving a unique id within the forest.
I think this may be called a poor-mans-grove :) I made up a simple API:
INSERT INTO DB ( when opened MODE 'write' )
    $db->insert_as_root ( $root );
    $db->insert ( $child, 'root.id' );
    $db->update ( $the_resource );
QUERY THE DB
  BASIC FUNCTIONS
    $resource = $db->fetch ( 'root.id' );
    $father = $db->father ( 'child.id' );
    @sons = $db->sons ( 'root.id' );
    @roots = $db->roots;
  DERIVED FUNCTIONS (recursing all nodes below given @ids)
    @sons = $db->all_container_sons ( @ids );
    @sons = $db->all_leaf_sons ( @ids );
    @sons = $db->all_sons ( @ids );
    @fathers = $db->all_fathers ( @ids );
DESTROY DB CONTENT ( when opened MODE 'write' )
    $db->reset;
I found this sufficient to solve small problems for which ESIS is not enough
and a grove is overkill. I must admit, although I read most of ISO 10179, I
really didn't get the details. But what I found valuable is the choice
between navigating (father/son) and id-based lookups (fetch).
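
For readers who do not use Perl, the three-table layout can be sketched in Python. This is an assumption-laden illustration, not the module described above: plain dictionaries stand in for the libdbm database, ids are passed explicitly instead of being obtained from a method on the object, and only some of the listed functions are shown.

```python
class Forest:
    """Three id-keyed tables: content, father, and sons."""

    def __init__(self):
        self.data = {}      # id -> object content (a dict standing in for the hash)
        self.parent = {}    # id -> id of father object, or None for a root
        self.children = {}  # id -> list of ids of all sons

    def _add(self, node_id, content, father_id):
        if node_id in self.data:
            raise ValueError("duplicate id: %r" % (node_id,))
        self.data[node_id] = content
        self.parent[node_id] = father_id
        self.children[node_id] = []

    def insert_as_root(self, node_id, content):
        self._add(node_id, content, None)

    def insert(self, node_id, content, father_id):
        if father_id not in self.data:
            raise KeyError(father_id)
        self._add(node_id, content, father_id)
        self.children[father_id].append(node_id)

    # Id-based lookup and one-step navigation.
    def fetch(self, node_id):
        return self.data[node_id]

    def father(self, node_id):
        return self.parent[node_id]

    def sons(self, node_id):
        return list(self.children[node_id])

    def roots(self):
        return [i for i, father in self.parent.items() if father is None]

    # Derived functions, recursing over everything below (or above) the ids.
    def all_sons(self, *ids):
        found = []
        for node_id in ids:
            for son in self.children[node_id]:
                found.append(son)
                found.extend(self.all_sons(son))
        return found

    def all_leaf_sons(self, *ids):
        return [i for i in self.all_sons(*ids) if not self.children[i]]

    def all_fathers(self, *ids):
        found = []
        for node_id in ids:
            father = self.parent[node_id]
            while father is not None:
                found.append(father)
                father = self.parent[father]
        return found


db = Forest()
db.insert_as_root("doc", {"title": "Report"})
db.insert("ch1", {"title": "Intro"}, "doc")
db.insert("s1", {"title": "Scope"}, "ch1")
db.insert("ch2", {"title": "Methods"}, "doc")
```

With this data, `db.fetch("s1")` is the id-based lookup and `db.father("s1")` / `db.sons("doc")` are the navigation steps; the two access styles share the same tables.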
++im
--
Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
Mail : Ingo.Macherius@tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)
From cbullard at hiwaay.net Sat Mar 1 05:05:09 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
References: <4098@ursus.demon.co.uk>
Message-ID: <3317B8FF.554E@hiwaay.net>
Peter Murray-Rust wrote:
>
> So, are there simple tools for creating well-formed documents? Can HTML
> editors be extended? (Since I create a lot of my XML documents by hand,
> I'd be interested to have shortcuts).
Where it is well-formed, XML is very amenable to macros which
ANY word processor system has these days. Just having end
tags makes it easy to write editing tools in, for example,
Word using the dialog editor and hidden text. Kludgy, perhaps,
but not out of reach and the formatting is free.
> Most documents will then need some sort of processing.
Sure. Does it have to be event streams? While more powerful,
even cheap macros can do a lot. The idea here is, while XML is
good for the Internet, simplified SGML is good for just about
anything where content markup is preferred over encapsulated
objects or compiled structures. Just removing minimization,
as you point out, allows for some clever work to be done
with very cheap tools. Cheap tools are where the gains begin.
> I've been writing something this morning to do exactly that for HTML. I use
> Java, but there's nothing fundamental about what language you use (a year
> ago I used tcl/tk with CoST). So, for example, I take a _stream_ of HTML,
> write it to the screen, and every time I encounter a flag (tag) I take
> appropriate action. If the document is well-formed, the tags should nest
> so that the interpreting/parsing process must throw an error if an end-tag
> is encountered unexpectedly.
>
> XML is committed to making things simple!
XML has made SGML simple. It can even be simpler than that.
len
From richard at light.demon.co.uk Sat Mar 1 09:59:59 1997
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:57:26 2004
Subject: XML API specification
In-Reply-To: <331737EA.3A4E@hiwaay.net>
Message-ID:
As a postscript to my comments on the API design, I think it is
important that the "event vs. tree" discussion shouldn't muddy the
waters when looking at the API design. If the XML processor is seen as
a 'server', and the application as a 'client', with the API in between,
it is clear to me that:
- the server's job is to return information about XML documents
requested by the client (and this is its _only_ job!);
- to do this job, the server _must_ parse the document (fully or
partially) in the time-honoured sequential manner;
- the client isn't so constrained, and can ask for as little or as much
as it likes.
For example, in an online browsing application, it is a likely
requirement that the client, in resolving a TEI extended pointer or
HyTime-like XML link, will request a specific element out of an XML
document. Having retrieved that element, the browser may have no
further use for the XML document from which it came. So the server
needs to parse through the document until it hits the required element,
then it can stop. Parsing through the rest of the document would just
be a waste of time. (Conversely, it makes sense for the server to hold
the results of the parse until it knows the client has no further use
for that document. It should also be able to pick up the parse where it
left off if necessary.)
I don't think that it should really be the client's job to tell the
server _how_ to do its parsing, except at the formal level (i.e. 'well-
formed' or 'valid').
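
The "parse only as far as the client needs, keep the results, resume later" behaviour can be sketched in Python with the standard library's `xml.etree.ElementTree.XMLPullParser`. The class name `LazyDocument`, the lookup by an `id` attribute, and the tiny chunk size are assumptions made for this illustration; they are not a proposed API.

```python
import io
import xml.etree.ElementTree as ET

class LazyDocument:
    """A 'server' that parses sequentially, but only on demand."""

    def __init__(self, stream, chunk_size=64):
        self._stream = stream
        self._chunk_size = chunk_size
        self._parser = ET.XMLPullParser(events=("end",))
        self._by_id = {}      # results of the parse so far, held for later requests
        self._done = False
        self.chars_read = 0   # how much input has been consumed

    def _advance(self):
        """Feed one more chunk; at end of input, close the parser."""
        chunk = self._stream.read(self._chunk_size)
        if chunk:
            self.chars_read += len(chunk)
            self._parser.feed(chunk)
        else:
            self._parser.close()   # flush anything still buffered
            self._done = True
        for _event, element in self._parser.read_events():
            if "id" in element.attrib:
                self._by_id[element.get("id")] = element

    def element_by_id(self, wanted):
        """Return the requested element, reading no further than necessary."""
        while wanted not in self._by_id and not self._done:
            self._advance()
        return self._by_id.get(wanted)


doc = "<doc>" + "".join(
    '<p id="p%d">para %d</p>' % (i, i) for i in range(200)) + "</doc>"
server = LazyDocument(io.StringIO(doc))
early = server.element_by_id("p3")   # stops soon after <p id="p3"> is complete
```

A later call such as `server.element_by_id("p150")` picks the parse up where it left off, and earlier elements are answered from the held results without re-reading any input.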
Richard Light
SGML and Museum Information Consultancy
richard@light.demon.co.uk
3 Midfields Walk
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067
From fcha at Berger-Levrault.fr Sat Mar 1 11:10:21 1997
From: fcha at Berger-Levrault.fr (F. Chahuneau - General Manager)
Date: Mon Jun 7 16:57:26 2004
Subject: Fw: Trees versus event streams
Message-ID: <199703011109.MAA15492@cygne.ais.berger-levrault.fr>
[Peter Murray-Rust (Peter@ursus.demon.co.uk), Thu, 27 Feb 1997]
> My current problem may highlight this. A CML document is highly
> tree-structured and contains no mixed content, so that eventStreams don't
> contribute much. BUT it also includes chunks of HTML where a tree
> structure is quite inappropriate. If I take a Lark-based approach (or my
> own parser) the HTML gets rendered into a tree. I am now hacking this
> back into an event stream to render the hypertext. Not only does it
> take more effort, but I'm sure that holding HTML as a tree has a
> memory hit. Ideally when I'm parsing CML, and come to the
> tag (sic) which contains , I'd like to tell the parser
> 'stop parsing as a tree and just hold a hypertext string until =
Peter,
This kind of consideration is precisely what led us to define a *dual*
programming paradigm when designing the Balise SGML processing language
(http://www.balise.com).
Being able to switch back-and-forth between these two useful and
complementary abstractions for an SGML document (a "tree of typed nodes
with attributes" vs an "ESIS or ESIS+ event stream") is, from our
experience, often required when you have to express complex processing
tasks on SGML documents, but still want to keep your code as concise as
possible.
No paradigm is inherently better than the other: it all depends on what you
want to express. If you want your code to remain legible and maintainable
(i.e. related in a straightforward way to the processing idea it expresses),
then you really need both in some cases. If you are interested, this idea
is further developed in the following paper: "Event Driven or Tree
Manipulation Approaches to SGML Transformation" presented at SGML'96 and
available at "http://www.balise.com/current/articles/lecluse.htm"
> We *could* do this with a PI, but would have to all agree.
Doing this with a PI does not seem to be the best idea, since it does not
leave the application programmer a choice of which mode she wants to use
for what, while the best choice may entirely depend on what she wants to
do. Being able to switch between tree and event-stream mode on any GI event
is what is required. For maximum generality, you also need to be able to
generate an event stream during (sub-)tree traversal, maybe not of the
*original* tree, but of one which you have modified or created through your
application.
In the world of traditional SGML applications, sheer document size is
frequently an issue, so that tree mode must often be used with parsimony.
In the case of XML or HTML fragments, this problem is probably negligible.
The rationale for maintaining an "event stream" paradigm in an XML API is,
therefore, not to save memory, but simply that it might be the most
appropriate in some cases.
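
Generating an event stream from a (possibly modified) subtree can be sketched as follows. This is a Python illustration using the standard `xml.etree.ElementTree` module, not Balise code; the function name `tree_to_events`, the event tuples and the sample element names are assumptions made here.

```python
import xml.etree.ElementTree as ET

def tree_to_events(element):
    """Walk a subtree and yield the event stream it corresponds to."""
    yield ("start", element.tag, dict(element.attrib))
    if element.text:
        yield ("data", element.text)
    for child in element:
        yield from tree_to_events(child)
        if child.tail:                 # text that follows the child's end-tag
            yield ("data", child.tail)
    yield ("end", element.tag)


root = ET.fromstring(
    '<mol id="m1"><name>water</name>'
    '<desc>A <b>small</b> molecule</desc></mol>')

# Work in tree mode first: modify a node in place ...
root.find("desc/b").text = "tiny"

# ... then switch to event-stream mode for just the mixed-content subtree.
desc_events = list(tree_to_events(root.find("desc")))
```

The application decides, element by element, which view to use: the `mol` structure stays a tree, while the hypertext-like `desc` fragment is handed on as events.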
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
_/ François CHAHUNEAU phone: [+33] 1 40 64 43 00 _/
_/ Directeur Général/General Manager _/
_/ AIS S.A. FAX: [+33] 1 40 64 43 10 _/
_/ 15-17 rue Rémy Dumoncel email: fcha@ais.berger-levrault.fr _/
_/ 75014, Paris, FRANCE WWW: http://www.berger-levrault.fr _/
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
From Peter at ursus.demon.co.uk Sat Mar 1 11:16:17 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <4111@ursus.demon.co.uk>
[from PeterMR]
>
> Thanks Ingo,
> This is very useful, because it shows that a great deal can be done quite
> simply.
>
> In message <199703010216.DAA00533@florix.rz.tu-clausthal.de> Ingo Macherius writes:
> [...]
> > I have made up a perl5 module which models a very simple forest-like structure,
> > that holds Perl5 objects. The objects are created by reading nsgmls' ESIS
>
> I believe that ESIS has potentially a useful role in producing XML documents
> from SGML documents - this was certainly my own strategy until recently.
> ESIS is the normalised output from a parser (especially sgmls or NSGMLS from
> James Clark - these are freely available.) It's trivial to transform
> ESIS into XML, but not the other way round, since XML is richer.
>
> ESIS doesn't retain everything from the original document(s) and I've been
> asking the experts what gets lost. My rough summary is that XML->ESIS
> loses:
> - comments (this matters if you want to edit the document or have
> it read by humans. However comments should not be used
> by machines - simply passed through)
> - entities. If your document includes entities such as &chapter1;
> these may be expanded and replaced by their contents. In
> this way some of the structure may be less clear
> - conditional markup. If you use INCLUDE and/or IGNORE then the
> IGNORE'd sections won't come through and the INCLUDE'd
> ones won't be marked as such
> [I think that processing instructions come through OK? And that you can
> determine whether an attribute value was defaulted or not?]
>
> If you use this simple level of markup (and _I_ do for molecular science)
> then XML WF documents are equivalent to ESIS output from sgmls or nsgmls.
> [Query: Are there plans for nsgmls/sgmls to output XML as an alternative
> to ESIS? I expect it's straightforward].
>
>
> > and putting anything between certain named tags into a hash, which
> > basically is the object content. The objects can be inserted as a root or into
> > another object, which yields a forest-like structure.
> > The tree-relations between objects are stored outside in a libdbm database,
> > one per tree. It holds three tables,
> > - id -> hashed data
> > - id -> id of father object, or NULL
> > - id -> ids of all sons
> > Obviously any object must have a method giving a unique id within the forest.
> > I think this may be called a poor-mans-grove :) I made up a simple API:
> ^^^^^^^^^^^^^^^
> It's still very powerful, and you have recognised the importance of
> structured documents. The good news is that this will all be addressed
> (literally and metaphorically) in the discussion of addressing within
> XML documents. The TEI project has developed a pointer scheme which
> covers most aspects of structure and extends the metaphor to descendants,
> ancestors, siblings and navigation by attributes and their values. I
> am expecting one or more 'black boxes' to be developed which support this,
> so that you don't have to write perl scripts any more. I'm waiting to hear
> from another thread :-)
>
> [... code deleted ...]
> >
> > I found this sufficient to solve small problems for which ESIS is not enough
> ^^^^^^^^^^
> I think you were operating _on_ the ESIS stream. You mean that simple
> 'grep' or other tools weren't powerful enough?
>
> > and a grove is overkill. I must admit, albeit I read most of ISO 10179, I
> ^^^^^^^^^^^^^^^^^
> This is one of the points at issue. Is it going to be possible to produce
> software quickly, and make it easy enough to read and use? I'm waiting to find out :-)
>
> > really didn't get the details. But what I found valuable is the choice
> ^^^^^^^^^^^
> I think it's very important not to be frightened by 10179. What you have
> done is very similar to what I and many others have done - devising
> home-grown tools for searching structured documents. 10179 has an
> implementation in Scheme (am I right?) but not in more procedural or
> object-oriented languages.
>
> > between navigating (father/son) and id-based lookups (fetch).
>
> [...]
> P.
>
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From fcha at Berger-Levrault.fr Sat Mar 1 18:20:36 1997
From: fcha at Berger-Levrault.fr (F. Chahuneau - General Manager)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703011819.TAA21136@cygne.ais.berger-levrault.fr>
[from PMR]
>
> ESIS doesn't retain everything from the original document(s) and I've
> been asking the experts what gets lost.
In case someone wants to get even more precise information, ESIS (Element
Structure Information Set) is fully defined in annex G of document
ISO/IEC/JTC1/SC18/WG8/N1035: Recommendations for a Possible Revision of ISO
8879 (SGML). You can find an exact replication of this passage in Charles
Goldfarb's "SGML Handbook" (Clarendon Press, 1990), pp 588 to 591.
> My rough summary is that XML->ESIS loses:
> - comments (this matters if you want to edit the document or have
> it read by humans. However comments should not be used
> by machines - simply passed through)
True
> - entities. If your document includes entities such as &chapter1;
> these may be expanded and replaced by their contents. In
> this way some of the structure may be less clear
It's actually more complex than that.
SGML *text* entity references, whether entities are "internal" or
"external", are indeed fully expanded and you are not even notified of this
in the ESIS event stream. Therefore, ESIS does not convey the "entity
structure" of an SGML document. This is, by the way, irrelevant to most
applications ... except for those, such as some SGML editors, whose purpose
is seen as being able to manipulate SGML documents without arbitrarily
altering their entity structure (in addition to their element structure).
External data entity references, internal SDATA and PI entity references
are signaled in the ESIS, while CDATA internal entity references are
expanded without being reported. This may appear a bizarre design
choice, but there is something even more disturbing: in the case of
internal SDATA entity references, only the entity "replacement value" is
passed, not the entity "name". This is one of the reasons why ESIS
information, alone, does not allow one to implement an "identity
transformation" for SGML documents, even when you don't care about the
physical decomposition of the document into several files (SGML entities).
Note that SDATA entities disappear in XML, so that THIS PROBLEM DISAPPEARS AS
WELL!
> - conditional markup. If you use INCLUDE and/or IGNORE then the
> IGNORE'd sections won't come through and the INCLUDE'd
> ones won't be marked as such
True
> [I think that processing instructions come through OK?
True
> And that you can determine whether an attribute value was defaulted
> or not?]
Unfortunately not. This information is unavailable in ESIS, and you would
need to access some "DTD information set" to be able to recover it. Besides
attribute names and de facto values, the only side information you have in
ESIS is when the value for an #IMPLIED attribute has not been specified.
There is one more piece of information missing in ESIS, which causes a
problem when implementing an "identity transformation" for plain SGML
documents: you don't know WHICH ELEMENTS HAVE BEEN DECLARED EMPTY in the
DTD. You may know when an element has null content, but you don't know
whether this is because it happens to be so (optional content) or because it
can't have any (declared EMPTY). Therefore, you do not know whether you
should output an end tag for it or not. Again, you would need some "DTD
information" to disambiguate. Maybe not everyone has realized it yet, but
this *is* the one and only reason why XML introduces this explicit syntax
for empty elements. This, again, makes this problem disappear with XML.
All in all, you can see that some design decisions in XML were precisely
motivated by the desire to make an ESIS event stream sufficient to
implement an identity transformation, even with no access to DTD
information. This is, of course, totally consistent with the idea that DTDs
should not be systematically needed for processing XML fragments.
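
The end-tag ambiguity can be made concrete with a small Python sketch that serialises a subset of nsgmls-style ESIS lines (`(GI` for a start-tag, `)GI` for an end-tag, `-` for data, `A` for an attribute of the next start-tag). The function name and the `declared_empty` parameter are assumptions for this illustration; data escapes, processing instructions and entities are not handled.

```python
from xml.sax.saxutils import escape, quoteattr

def esis_to_markup(lines, declared_empty=None):
    """Serialise ESIS-like lines back to markup.

    With declared_empty=None the output is XML and needs no DTD knowledge.
    Passing a set of element names imitates SGML output, where an element
    declared EMPTY must not get an end-tag; that set is precisely the
    "DTD information" the ESIS stream itself does not carry.
    """
    out = []
    pending_attrs = []
    for line in lines:
        code, rest = line[:1], line[1:]
        if code == "A":                       # attribute for the next start-tag
            name, kind, *value = rest.split(" ", 2)
            if kind != "IMPLIED":             # unspecified #IMPLIED: nothing to write
                pending_attrs.append((name, value[0] if value else ""))
        elif code == "(":
            attrs = "".join(
                " %s=%s" % (n, quoteattr(v)) for n, v in pending_attrs)
            out.append("<%s%s>" % (rest, attrs))
            pending_attrs = []
        elif code == ")":
            if declared_empty is None or rest not in declared_empty:
                out.append("</%s>" % rest)
        elif code == "-":
            out.append(escape(rest))
    return "".join(out)


esis = ["(P", "-line one", "ASRC CDATA a.gif", "(IMG", ")IMG",
        "-line two", ")P"]
as_xml = esis_to_markup(esis)
as_sgml = esis_to_markup(esis, declared_empty={"IMG"})
```

The XML call always writes the end-tag, which is well-formed whether or not the element could ever have content; the SGML call only gets the right answer because the caller supplied the set of EMPTY elements from outside the stream.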
Whether you work with an event stream or an abstract tree(*) is orthogonal
to this discussion: we are discussing the *available* information,
not the way it is represented. This does not mean that I see abstract
trees as useless, quite the contrary (see my previous mail).
I hope I helped clarify what ESIS was.
(*): I use the term "abstract tree" instead of "parse tree" to designate
the "tree of typed nodes with attributes" (you could also say "SGML object
tree", but this term seems to be somewhat overloaded these days...). From an
SGML parser's point of view, an SGML "parse tree" would have distinct nodes
for start tags and end tags, which are not what you are looking for when you
want a useful representation allowing you to cut-and-paste SGML elements
(seen as atomic, typed text objects with attached properties).
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
_/ François CHAHUNEAU phone: [+33] 1 40 64 43 00 _/
_/ Directeur Général/General Manager _/
_/ AIS S.A. FAX: [+33] 1 40 64 43 10 _/
_/ 15-17 rue Rémy Dumoncel email: fcha@ais.berger-levrault.fr _/
_/ 75014, Paris, FRANCE WWW: http://www.berger-levrault.fr _/
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
From peter at techno.com Sat Mar 1 18:52:55 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:57:26 2004
Subject: XML API specification
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703011844.NAA10478@exocomp.techno.com>
> References:
> Mime-Version: 1.0
> Content-Type: text/plain; charset="us-ascii"
> Date: Fri, 28 Feb 1997 13:50:22 -0500
> From: dgd@cs.bu.edu (David Durand)
> Sender: owner-xml-dev@ic.ac.uk
> Precedence: bulk
> Reply-To: dgd@cs.bu.edu (David Durand)
>
> At 11:53 AM -0600 2/28/97, Len Bullard wrote:
> >David Durand wrote:
> >>
> >> I see XML-groves and XML-API as parallel and needing to be in synch. I
> >> don't see either as having to depend on the other, though, and frankly,
> >> given the relative penetration of groves and Java into the "global
> >> developer consciousness", I don't see groves as that high a priority.
> >
> >If relative penetration is important, spec it in COBOL or C.
> >
> >This kind of argument went on in VRML and was wisely rejected.
> >The commitment to a CORBA IDL is a commitment to a syntax for the spec
> >and not a lot else.
>
> If Gavin's information is correct (and I assume it to be so) this is false.
> IDL means that we get language-specific bindings for several languages
> including Java and C++, simply by applying an automated tool. So there are
> concrete technical advantages to using IDL, though we must apply those
> tools for the programmers, so that I don't have to find an IDL tool to use
> XML with my Java codebase.
Grove schemas (property sets) can also be automatically
translated/compiled to provide interface declarations in any language.
We do this at TechnoTeacher to create documentation-compatible
interfaces to groves stored in different ODBMSs, as well as to be able
to provide access to those groves from different languages and
environments. IDL, Java, and C++ can all be generated easily from the
same property set.
It is not necessary that developers using these APIs (in IDL, Java,
C++, etc.) know about groves or property sets. However, if there is
one canonical form of the API (the property set), a developer that
learned his way around the API in C++ will not be confused if he is
subsequently required to use the API in Java, Scheme, SQL, etc.
-peter
--
Peter Newcomb TechnoTeacher, Inc.
233 Spruce Avenue P.O. Box 23795
Rochester, NY 14611-4041 USA Rochester, New York 14692-3795 USA
+1 716 464 8696 (home) +1 716 464 8696 (direct)
+1 716 755 8698 (cell) +1 716 271 0796 (main)
+1 716 529 4304 (fax) +1 716 271 0129 (fax)
peter@petes-house.rochester.ny.us peter@techno.com
http://www.petes-house.rochester.ny.us http://www.techno.com
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
From Peter at ursus.demon.co.uk Sat Mar 1 18:54:28 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: MIME
Message-ID: <4120@ursus.demon.co.uk>
I believe that the use of MIME types for interacting with legacy data
has great potential for XML. I'd welcome comments on the following
ideas.
Many legacy documents have been registered as MIME types, and are also capable
of being represented as structured documents. This means that an XML
application is capable of reading MIME types on-the-fly and converting to
XML internally. (The only key requirement is that the document description is
well-defined and stable so that it is possible to write a DTD (or meta-DTD)
for it.)
I have done this for my JUMBO parser. It is able to read in ~12 MIME types
(belonging to chemistry) in native form. It then converts them into a
Tree object internally and as it parses the document serially,
adds Nodes and Attributes where appropriate. This is isomorphic to the
equivalent XML document and can be displayed in the GUI,
edited, etc. and written out as XML. It is obviously capable of SD searches
as well. The average user therefore sees JUMBO as a universal browser and
possibly as a transformation tool (though _writing_ legacy formats from an
arbitrary tree is usually difficult and information is lost).
The architecture is (fairly) simple. Each MIME type requires a Java
subclass of SGMLTree. As the (FORTRAN) document is read, it is poked into the
nodes as appropriate. One enormous advantage of this is that the order
of the data in the document doesn't cause any problems in writing the
code (whereas for a conventional parser it can be a nightmare - 'have we
already read this section?'). I am still amazed at how valuable this
simple tree-building is. Of course, SD search techniques can then be
used to add contextual information for processing or the tree can be
reordered, pruned, etc.
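A small sketch of the tree-building approach, in Python rather than Java and not taken from JUMBO. The record format (`KEYWORD value`), the element names and the `SLOTS` mapping are assumptions made up for the example. It shows why input order stops mattering: each record is poked into its own slot in the tree, and the output order comes from the tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical legacy records: "KEYWORD value", arriving in no fixed order.
LEGACY = """\
ATOM C 0.0 0.0
TITLE methane
ATOM H 1.1 0.0
"""

# Assumed mapping from legacy keyword to (container element, child element).
SLOTS = {"TITLE": ("header", "title"), "ATOM": ("atoms", "atom")}


def legacy_to_tree(text):
    """Poke each record into its slot in the tree, whatever order it arrives in."""
    root = ET.Element("molecule")
    # Create the containers up front so output order is fixed by the tree,
    # not by the order of records in the input.
    containers = {name: ET.SubElement(root, name) for name in ("header", "atoms")}
    for line in text.splitlines():
        keyword, _, value = line.partition(" ")
        container, child = SLOTS[keyword]
        node = ET.SubElement(containers[container], child)
        node.text = value
    return root


tree = legacy_to_tree(LEGACY)
xml_text = ET.tostring(tree, encoding="unicode")
print(xml_text)
```

The title record arrives second in the input but still ends up first in the serialised document, with no "have we already read this section?" bookkeeping in the reader.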
I think it would be enormously valuable to have MIME->XML converters for
helping us at the editing stage.
This may be easier than we think. Reading the Java Beans spec (a few months
old, so it may have changed), there are statements like:
'... the [current proposal] .. is that the MIME namespace for data types
shall be used by _DataFlavors_' [an interface for transferable data].
'we want [Java beans] to be able to pretend to be an Excel document inside
a Word document'.
This implies that interfaces (?IDLs) will be produced for common MIME types.
It should therefore be possible to obtain Word, Excel, GIF, RTF, etc. beans.
The XML implementation would then be:
legacy--[bean]-->JavaInterface--[Java application]-->SDinMemory--[DTD]-->XML
I haven't kept in close touch with Beans, (although I have played with the
beta-release and it's very powerful for what I want to do). If we could
offer Java browsers for common MIME types, with automatic viewing, editing,
merging and transformation into XML, it could be a very attractive way of
bringing people into this arena.
P.
[The only downside is that the magic of XML is completely hidden from
the user :-)]
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From tbray at textuality.com Sun Mar 2 02:38:36 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:26 2004
Subject: TEI pointers
Message-ID: <3.0.32.19970301162710.007284c0@pop.intergate.bc.ca>
At 09:58 AM 28/02/97 GMT, Peter Murray-Rust wrote:
>I need a search tool for structured documents and would be
>grateful for pointers to existing tools which are free and re-usable. My
>target language is Java. I would intend to use the TEI syntax (does it
>come in different flavours?).
The only free search tool generally available is WAIS which, while
not bad, is kind of difficult to administer and does not mate well
with SGML.
But then, there are very few *commercial* search tools that mate well
with SGML either. So to get what you want, you'll probably have
to write it.
Since I am a tired old full-text-search guy, Lark takes fanatical care
to keep track of the byte offsets of everything; so there will be at
least one parser that would be useful in such an effort. The fact
that Lark doesn't look at DTD's nor check conformance is not an
issue at indexing time.
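To illustrate why byte offsets matter to an indexer, here is a deliberately naive Python sketch. It is not Lark's API; the regular expression and the function name are assumptions, and the pattern ignores comments, CDATA sections and '>' inside attribute values.

```python
import re

# Matches start tags, end tags and empty-element tags (naive on purpose).
TAG = re.compile(rb"<(/?)([A-Za-z][\w.-]*)[^>]*?(/?)>")


def element_offsets(data):
    """Return (name, start_byte, end_byte) for each element in a byte string."""
    spans, stack = [], []
    for match in TAG.finditer(data):
        closing, name, empty = match.group(1), match.group(2).decode(), match.group(3)
        if closing:
            # An end tag closes the most recently opened element.
            open_name, start = stack.pop()
            spans.append((open_name, start, match.end()))
        elif empty:
            spans.append((name, match.start(), match.end()))
        else:
            stack.append((name, match.start()))
    return spans


doc = b"<doc><p>one</p><br/><p>two</p></doc>"
spans = element_offsets(doc)
for name, start, end in spans:
    print(name, start, end, doc[start:end])
```

With the offsets stored in an index, a search hit can be turned back into the exact bytes of the element without reparsing the document, and no DTD is needed to do so.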
>I would also intend to use a graphically-based query if possible as well
>as a commandline. Has this been tried and are there any metaphors which
>have proved to be useful? How do most humans currently construct TEI
>queries? Do they learn the language and use a command line or do they
>get customised queries?
I've never seen a graphical search query GUI that was useful.
- Tim
From tbray at textuality.com Sun Mar 2 02:38:43 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <3.0.32.19970301162825.006900ac@pop.intergate.bc.ca>
At 07:19 PM 01/03/97 +0100, F. Chahuneau - General Manager wrote:
>All in all, you can see that some design decisions in XML were precisely
>motivated by the desire to make an ESIS event stream sufficient to
>implement an identity transformation, even with no access to DTD
>information. This is, of course, totally consistent with the idea that DTDs
>should not be systematically needed for processing XML fragments.
Whereas I agree with the rest of Francois' contribution, this paragraph
is not quite right. If you change "ESIS event stream" to "Instance
character stream", then it would be more correct. But in fact the
SGML->SGML declaration was not one of our goals; for example, the
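processor is not required to tell the app about [at least] comments
and [...] says *nothing* about the ESIS, merely, in a very abstract way,
what the processor has to give the application.
If either of these problems (the impossibility of SGML->SGML or the
absence of an ESIS equivalent) is a big huge flaw in XML, there's still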
time to fix it. The SGML->SGML problem is probably a job for the
XML WG. The ESIS issue is perhaps a job for this list. I personally
think an API is better than an ESIS [even if the ESIS were properly
defined] anyhow.
Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592
From Peter at ursus.demon.co.uk Sun Mar 2 10:28:52 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:26 2004
Subject: Simple approaches to XML implementation
Message-ID: <4143@ursus.demon.co.uk>
In message <3.0.32.19970301162825.006900ac@pop.intergate.bc.ca> Tim Bray writes:
[...]
>
> Whereas I agree with the rest of Francois' contribution, this paragraph
> is not quite right. If you change "ESIS event stream" to "Instance
> character stream", then it would be more correct. But in fact the
> SGML->SGML declaration was not one of our goals; for example, the
I hope I haven't muddied the waters here - SGML->SGML was not my intention
either. The (possibly fuzzy) idea was that I (and probably others) are
familiar with ESIS because they use sgmls and 'could this help us in our
search for the ideas that go into the API'. IOW 'could we throw out things
that didn't appear in ESIS?'. Don't worry if it doesn't go anywhere.
> processor is not required to tell the app about [at least] comments
> and merely, in a very abstract way, what the processor has to give the
^^^^^^^^^^^^^
> application.
Fully agreed. I am probably showing my usual impatience in trying to
resolve the 'abstract' to concrete. From a practical point of view if
those people who have written parsers put their heads together and came
up with a suggested API, I would look at it extremely seriously and
positively.
>
> If either of these problems (the impossibility of SGML->SGML or the
> absence of an ESIS equivalent) is a big huge flaw in XML, there's still
> time to fix it. The SGML->SGML problem is probably a job for the
> XML WG. The ESIS issue is perhaps a job for this list. I personally
> think an API is better than an ESIS [even if the ESIS were properly
> defined] anyhow.
Absolutely. From my point of view what Lark provides as an API does what I
want at the moment. Maybe there are things that it doesn't do that it could
or should, but *I don't know about them* :-). Being a concrete person I
understand those 'things' that go into and come out of current programs rather
than more abstract and perhaps more powerful synoptic views.
I talked last week with an important person/organisation in chemical
informatics who is very excited about XML. Their main worry was that it
would become too complicated. I share this concern, though I think it's
also important to make sure that we don't unnecessarily limit the power of
the language. However there is no reason why we shouldn't initially limit
the power of the API if it makes sense. For example [as Tim says] I can do
without the comments and CDATA.
We've had a week to explore the boundaries of the API. The spectrum covers
the use of groves and 10179 (which a lot of us don't understand) to a
smaller set of more 'concrete' things which we are more at home with. If
we take the more abstract approach it's going to take time and faith to
come up with an API. It's probably the 'right' way, and I hope that some
members of this list are trying to systematise their ideas and a way forward.
I also understand and support the IDL approach if it really will produce
Java, C++, etc APIs automatically. *The result of this automatic
conversion must, however, be understandable by humans*.
Being a hacker, I like to do things quickly and suggest that we try to
gather together a 'concrete' API that could be used very shortly. I would
be happy to take NXP and Lark as the starting points and say that they
represent the current functionality that I require. Can they converge to
a common name space? Example: some people on this list use 'Element' where
others use 'Node' - it's agreement at this level that I need. Similarly I need
to know what classes a DTD might supply (is it ElementType or GI, or are
they all bundled under Element?). And can we agree on what the totality
of the information *defined in PhaseI* is?
We don't lose anything by getting this off the ground quickly. It exercises
the language, helps us locate resources and clarifies our thoughts. A
first-generation set of tools will impress the world and maybe might be
extended into more powerful systems. It also helps to build up a core
of documents that act as examples.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From nmikula at edu.uni-klu.ac.at Sun Mar 2 10:29:45 1997
From: nmikula at edu.uni-klu.ac.at (Norbert Mikula)
Date: Mon Jun 7 16:57:27 2004
Subject: NXP is still alive (Sorry for being late)
Message-ID:
To all XML freaks,
I was off the XML-WG list for a few days, our daemon was down,
and I missed that this list was set up.
Just by chance I found the message now on c.t.s. I wonder
why I didn't see it before :-(
However, NXP is not dead. I will now try to sync with
current status of discussion, especially the API
thread is interesting to me.
An advance announcement, the next official release
of NXP includes catalog support (including DELEGATE and
CATALOG). You can expect it in a few days.....
For the last few days I have been very busy redesigning
my DSSSL engine YADE.
-----
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From Ingo.Macherius at tu-clausthal.de Sun Mar 2 10:56:19 1997
From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:57:27 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <4111@ursus.demon.co.uk> from "Peter Murray-Rust" at Mar 1, 97 11:13:37 am
Message-ID: <199703021056.LAA02919@florix.rz.tu-clausthal.de>
Peter,
> > > I have made up a perl5 module which models a very simple forest-like structure,
> > > that holds Perl5 objects. The objects are created by reading nsgmls' ESIS
[...]
> > > I think this may be called a poor-mans-grove :) I made up a simple API:
> > ^^^^^^^^^^^^^^^
> > It's still very powerful, and you have recognised the importance of
> > structured documents. The good news is that this will all be addressed
> > (literally and metaphorically) in the discussion of addressing within
> > XML documents. The TEI project has developed a pointer scheme which
> > covers most aspects of structure and extends the metaphor to descendants,
> > ancestors, siblings and navigation by attributes and their values. I
> > am expecting one or more 'black boxes' to be developed which support this,
> > so that you don't have to write perl scripts any more. I'm waiting to hear
> > from another thread :-)
I wrote the Perl interface because I needed an access to SGML information which
is fast enough for CGI. So I maintain the information base as SGML doc and
"render" it to my homegrown OODB described in the last mail. The information
is updated only once a week or so, so this is a sufficient method.
Jade is too slow, considering the fork the http has to do to start
DSSSL processing. I'd love to use SDQL !
> > > I found this sufficient to solve small problems for which ESIS is not enough
> > ^^^^^^^^^^
> > I think you were operating _on_ the ESIS stream. You mean that simple
> > 'grep' or other tools weren't powerful enough?
I need a persistent representation of structured data. The information would
fit into a RDBMS, so the job could easily be done with mSQL. But I wanted to
find out if SGML would work, too. It does :)
Yes, I operate on an ESIS stream while *rendering* the doc to my OODB, but
afterwards I operate only on the DB for speed's sake. I'd prefer to do this
on a persistent grove, but implementation would be far more complicated than
my little perl hack :)
> > > really didn`t get the details. But what I found valuable is the choice
> > ^^^^^^^^^^^
> > I think it's very important not to be frightened by 10179. What you have
> > done is very similar to what I and many others have done - devising
> > home-grown tools for searching structured documents. 10179 has an
> > implementation in Scheme (am I right?) but not in more procedural or
> > object-oriented languages.
Hm. I always thought DSSSL is a dialect of Scheme, so this is not a question
of implementation. IMHO it *must* be implemented as scheme. It's allowed to
define other languages that map on corresponding DSSSL/scheme statements,
which have to be submitted to a DSSSL engine. But an engine that calls itself
a DSSSL engine must have a (restricted) scheme engine inside. Correct me
if I am wrong !
I'd be happy to hear about anyone writing a book on DSSSL. I read all examples
I could get from jjc and Jon Bosak, but a structured introduction would help
to convince other people, that do not have the time to read sources.
> > > between navigating (father/son) and id-based lookups (fetch).
This is very important for me, because sometimes I *know* which element I
need, because I get the ID from elsewhere. Any API to XML should offer both,
a navigating query and a mere GOTO.
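A sketch of an interface offering both, in Python instead of Perl. The class and method names (`Node`, `father`, `sons`, `fetch`) only echo the vocabulary used in this thread; they are not the API of the Perl module described earlier.

```python
class Node:
    """A tree node supporting both navigation and direct lookup by ID."""

    def __init__(self, name, node_id=None, index=None):
        self.name = name
        self.node_id = node_id
        self.father = None
        self.sons = []
        # All nodes of one tree share a single ID index (a plain dict).
        self.index = index if index is not None else {}
        if node_id is not None:
            self.index[node_id] = self

    def add_son(self, name, node_id=None):
        son = Node(name, node_id, self.index)
        son.father = self
        self.sons.append(son)
        return son

    def fetch(self, node_id):
        """The 'mere GOTO': jump straight to a node by its ID."""
        return self.index[node_id]


root = Node("doc", "d1")
chapter = root.add_son("chapter", "c1")
para = chapter.add_son("para", "p7")

# Navigation: walk from a node to its father and on to the root.
assert para.father.father is root
# Direct lookup: no walking required when the ID is already known.
assert root.fetch("p7") is para
```

Sharing one dictionary across the tree keeps `fetch` a constant-time lookup from any node, while father/son links remain available for relative navigation.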
++im
--
Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
Mail : Ingo.Macherius@tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)
From Peter at ursus.demon.co.uk Sun Mar 2 11:46:49 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:27 2004
Subject: Simple approaches to XML implementation
Message-ID: <4149@ursus.demon.co.uk>
In message <199703021056.LAA02919@florix.rz.tu-clausthal.de> Ingo Macherius writes:
[...]
>
> I wrote the Perl interface because I needed an access to SGML information which
> is fast enough for CGI. So I maintain the information base as SGML doc and
> "render" it to my homegrown OODB described in the last mail. The information
One of the selling points of XML should be that it maps directly onto OODBs.
I don't know much about this myself, but I would assume that since the OODB
supports persistence then this is a way of supporting persistence in XML
applications. (Hope this isn't drivel).
> is updated only once a week or so, so this is a sufficient method.
> Jade is too slow, considering the fork the http has to do to start
> DSSSL processing. I'd love to use SDQL !
See below.
>
> > > > I found this sufficient to solve small problems for which ESIS is not enough
> > > ^^^^^^^^^^
> > > I think you were operating _on_ the ESIS stream. You mean that simple
> > > 'grep' or other tools weren't powerful enough?
>
> I need a persistent representation of structured data. The information would
> fit into a RDBMS, so the job could easily be done with mSQL. But I wanted to
> find out if SGML would work, too. It does :)
> Yes, I operate on an ESIS stream while *rendering* the doc to my OODB, but
> afterwards I operate only on the DB for speed's sake. I'd prefer to do this
> on a persistent grove, but implementation would be far more complicated than
> my little perl hack :)
Again - I suspect that getting the XML/OO interface correct means that we
get gains from both sides.
>
[...]
>
> > > > between navigating (father/son) and id-based lookups (fetch).
> This is very important for me, because sometimes I *know* which element I
> need, because I get the ID from elsewhere. Any API to XML should offer both,
> a navigating query and a mere GOTO.
My simple understanding is that TEI pointers offer all of this. They have
an (SGML) ID which is your GOTO and a sufficiently powerful navigation
system for (my) needs. The description is in the PhaseII documentation
but has not yet been discussed by the WG or ERB. I am hoping that they come
up with something very similar to TEI, since I can understand it!
Here's a simple example from the draft:
ID (a23) DESCENDANT (2 TERM LANG DE)
matches the second TERM element with a LANG (attribute) of DE occurring
within the element with an ID (goto) of a23. TEI should be able to
deal with everything that you have so far specified.
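The pointer above can be approximated in a few lines of Python with the standard library's ElementTree. This is only a sketch of that one expression, not a TEI pointer implementation. The sample document and the helper names `by_id` and `descendant` are invented, and the counting is assumed to run in document order.

```python
import xml.etree.ElementTree as ET

DOC = """
<glossary>
  <entry ID="a23">
    <TERM LANG="EN">water</TERM>
    <TERM LANG="DE">Wasser</TERM>
    <note><TERM LANG="DE">Eis</TERM></note>
  </entry>
  <entry ID="b01"><TERM LANG="DE">Salz</TERM></entry>
</glossary>
"""


def by_id(root, value):
    """ID (value): the element whose ID attribute equals value."""
    for element in root.iter():
        if element.get("ID") == value:
            return element
    return None


def descendant(start, n, tag, attr, value):
    """DESCENDANT (n tag attr value): the n-th matching descendant of start,
    counted in document order (1-based)."""
    matches = [
        e for e in start.iter(tag)
        if e is not start and e.get(attr) == value
    ]
    return matches[n - 1] if len(matches) >= n else None


root = ET.fromstring(DOC.strip())
hit = descendant(by_id(root, "a23"), 2, "TERM", "LANG", "DE")
print(hit.text)  # prints "Eis"
```

The ID step is the GOTO and the DESCENDANT step is the navigation, so the two kinds of access compose in a single expression.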
I have asked very recently whether there is an implementation of this
and maybe part of this topic will move to the TEI thread. I suspect that
a key question will be performance and therefore it may be important to decide
whether a document is indexed when parsed (suggestions?) or whether
intermediate search results are cached. Without more experience I won't
speculate. My own primitive system caches results (e.g. once a question
is asked about IDs, it will remember the ID values in a hashtable for future
queries).
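A sketch of that caching strategy in Python (my own system is in Java; the class name `IdCache` and the `scans` counter are made up for the example). The table is built on the first ID question and reused afterwards.

```python
import xml.etree.ElementTree as ET


class IdCache:
    """Builds the ID table on the first ID query, then reuses it."""

    def __init__(self, root):
        self.root = root
        self.table = None   # not built until someone asks about IDs
        self.scans = 0      # how many full tree walks were needed

    def lookup(self, value):
        if self.table is None:
            self.scans += 1
            self.table = {
                e.get("ID"): e for e in self.root.iter() if e.get("ID") is not None
            }
        return self.table.get(value)


root = ET.fromstring('<doc><sec ID="s1"><p ID="p1">hi</p></sec><p ID="p2"/></doc>')
cache = IdCache(root)
first = cache.lookup("p1")
second = cache.lookup("p2")
print(first.text, second.tag, cache.scans)
```

Indexing at parse time would instead pay the cost once for every document, whether or not anyone ever asks about IDs; which is better presumably depends on how documents are used.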
To the experts:
I have read the core SDQL and it seems to have a different syntax from
TEI. So are the two going to be used together? What does SDQL offer
that TEI doesn't (to a simple web hacker?).
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Sun Mar 2 11:46:56 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:27 2004
Subject: NXP
Message-ID: <4150@ursus.demon.co.uk>
[I have changed the subject to 'NXP' so that if this parser is specifically
discussed in future there is a placeholder. More general replies should
go to the thread on XML API or elsewhere.]
Welcome Norbert,
You have an honoured position on this list having written a
very impressive piece of code to get us started.
In message Norbert Mikula writes:
[...]
>
> However, NXP is not dead. I will now try to sync with
^^^^^^^^^^^^^^^
:-) Runs fine on my machine!
> current status of discussion, especially the API
> thread is interesting to me.
I have recently posted suggesting that we should try to consolidate
on a simple API to get us started. My own development depends to a
significant extent on what API I can use after parsing. I want it to
be very clearly separated, because I see a parser as being a 'bolt-in'
tool rather than a component which drives the rest of the application.
(Maybe this isn't possible, but it's worth trying for).
If you have missed any of the postings here they are all hypermailed
(see the list .sig)
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Sun Mar 2 13:25:22 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:27 2004
Subject: Listrivia
Message-ID: <4155@ursus.demon.co.uk>
I shall be away next week as guest of the German Chemical Society
in Wurzburg, where I shall be giving a lecture on "Structured
Documents and Hyperlinking in Chemistry" The URL for the conference
is at:
http://schiele.organik.uni-erlangen.de/cic/IuK97/
and my abstract for the Wednesday session is visible under that. I shall
be talking about CML and XML and I hope to give a demonstration.
I shall not be able to mail to this list for a week [sighs of relief].
Henry Rzepa will continue to manage the technical aspects of the list.
[BTW Henry does all the hard work with managing e-mail addresses which
doesn't automatically come to my notice and I am very grateful to him.
He may mail some recommendations to comp.text.sgml about e-mail addresses
since these can cause confusion.]
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From tms at ansa.co.uk Sun Mar 2 15:41:29 1997
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 16:57:27 2004
Subject: Simple approaches to XML implementation
In-Reply-To: Peter@ursus.demon.co.uk's message of Sun, 02 Mar 1997 09:41:21 GMT
References: <4143@ursus.demon.co.uk>
Message-ID:
Peter> Peter Murray-Rust
Tim> Tim Bray
I'm not signing this, since one of our mailhosts is badly corrupting
text.
>>>>> In article <3.0.32.19970301162825.006900ac@pop.intergate.bc.ca>,
>>>>> Tim wrote:
Tim> Whereas I agree with the rest of Francois' contribution, this
Tim> paragraph is not quite right. If you change "ESIS event stream"
Tim> to "Instance character stream", then it would be more correct.
Tim> But in fact the SGML-> SGML declaration was not one of our goals;
>>>>> In article <4143@ursus.demon.co.uk>, Peter wrote:
Peter> I hope I haven't muddied the waters here - SGML->SGML was not
Peter> my intention either. The (possibly fuzzy) idea was that I (and
Peter> probably others) are familiar with ESIS because they use sgmls
Peter> and 'could this help us in our search for the ideas that go
Peter> into the API'.
I think it's good to have some of these conceptual anchors around - it
helps us know when we're talking about the same things.
Tim> ... for example, the processor is not required to tell the app
Tim> about [at least] comments and [...] says *nothing* about the
Tim> ESIS, merely, in a very abstract way, what the processor has to
Tim> give the application.
Tim> If either of these problems (the impossibility of SGML->SGML or
Tim> the absence of an ESIS equivalent) is a big huge flaw in XML,
Tim> there's still time to fix it. The SGML->SGML problem is probably
Tim> a job for the XML WG. The ESIS issue is perhaps a job for this
Tim> list. I personally think an API is better than an ESIS [even if
Tim> the ESIS were properly defined] anyhow.
Peter> ... there is no reason why we shouldn't initially limit the
Peter> power of the API if it makes sense. For example [as Tim says]
Peter> I can do without the comments and CDATA.
IMO, the application should be able to decide (preferably at compile
time) whether it is interested in comments etc. We want to enable the
creation of small, efficient applications as well as highly capable
ones; I suggest an approach of providing lots at the parser, but
providing filtering down to ESIS by default.
My mental model has two kinds of application: those that take a well-
formed document and present it to the user, and those that take a
valid document and allow the user to edit and save it. The ability to
perform the identity transform is obviously a requirement for the
latter, whereas an ESIS stream may be sufficient for the former. What
exactly constitutes an identity transform is not entirely clear cut,
though. Is it okay to expand internal CDATA entities? Do we need to
preserve record-end information? (We might want to do this if we will
be running "diff" on the output - for version control systems,
perhaps).
I'd like to see a parser come with a base class for building an
application's event-stream handler, that simply throws away most
events - the application writer overrides the methods he is interested
in. Some of the events, however would have other actions. Two
examples:
1. the default handler for #PCDATA would expand internal CDATA entity
references and splice in marked sections, and pass the result to
the handler for ESIS "data".
2. the default handler for #EMPTY elements would call the handler for
start-tag, then the one for end-tag.
I'm looking at the "Esis" interface in NXP, and I think it could be
modified to act as such a base class. Comments from Norbert would be
appreciated.
The use of the base-class methods as a filter from the XML event
stream to an ESIS stream means that an application could be written[*]
that acts on ESIS events, but could selectively choose events to
handle from a superset of ESIS - could we agree on a suitable
superset?
[*] or an existing application quickly ported - this makes a
convincing argument to me :-)
[In my approach, we could even change the superset without affecting
those applications that run off the subset - and simply extending the
superset shouldn't affect any existing application, because the base
class would simply throw away the new events]
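A minimal sketch of such a base class, written in Python for brevity. It is not NXP's "Esis" interface; the method names and the set of superset events are assumptions chosen to mirror the two examples above.

```python
class BaseHandler:
    """Default handler: ignores the extra events and reduces the rest
    to ESIS-like start-tag / end-tag / data calls."""

    # ESIS-level events: subclasses normally override these.
    def start_tag(self, name, attrs):
        pass

    def end_tag(self, name):
        pass

    def data(self, text):
        pass

    # Superset events, with defaults that either discard or filter down.
    def comment(self, text):
        pass  # thrown away unless the application overrides this

    def cdata_section(self, text):
        self.data(text)  # spliced in as ordinary data

    def empty_element(self, name, attrs):
        self.start_tag(name, attrs)
        self.end_tag(name)


class Recorder(BaseHandler):
    """An 'ESIS-only' application: it overrides just three methods."""

    def __init__(self):
        self.events = []

    def start_tag(self, name, attrs):
        self.events.append(("start", name))

    def end_tag(self, name):
        self.events.append(("end", name))

    def data(self, text):
        self.events.append(("data", text))


handler = Recorder()
# Events as a hypothetical parser might deliver them.
handler.start_tag("p", {})
handler.comment("ignored")
handler.cdata_section("a < b")
handler.empty_element("br", {})
handler.end_tag("p")
print(handler.events)
```

Adding a new event to the superset later means adding one more no-op method to `BaseHandler`; `Recorder` keeps working unchanged, which is the compatibility property argued for above.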
Peter> We don't lose anything by getting this off the ground quickly.
Peter> It exercises the language, helps us locate resources and
Peter> clarifies our thoughts. A first-generation set of tools will
Peter> impress the world and maybe might be extended into more
Peter> powerful systems. It also helps to build up a core of
Peter> documents that act as examples.
I'm not commenting on this; I just quoted it because I agree :-)
From fcha at Berger-Levrault.fr Sun Mar 2 15:46:52 1997
From: fcha at Berger-Levrault.fr (F. Chahuneau - AIS)
Date: Mon Jun 7 16:57:27 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703021545.QAA12767@cygne.ais.berger-levrault.fr>
[F. Chahuneau, 01 Mar 1997]
> >All in all, you can see that some design decisions in XML were precisely
> >motivated by the desire to make an ESIS event stream sufficient to
> >implement an identity transformation, even with no access to DTD
> >information. This is, of course, totally consistent with the idea that
> >DTDs should not be systematically needed for processing XML fragments.
>
[Tim Bray Sat, 01 Mar 1997]
> Whereas I agree with the rest of Francois' contribution, this paragraph
> is not quite right. If you change "ESIS event stream" to "Instance
> character stream", then it would be more correct. But in fact the
> SGML->SGML declaration was not one of our goals; for example, the
> processor is not required to tell the app about [at least] comments
> and merely, in a very abstract way, what the processor has to give the
> application.
>
(Hello, Tim)
I suspect we actually agree more than it seems, but I should not have used
the term "identity transformation" without defining it first. My implicit
definition was very minimal: being able to generate on the output side an
instance which parses according to the same DTD as the instance on the
input side. As you know, not being able to detect EMPTY elements defeats
this purpose, whereas not knowing whether an attribute was defaulted or
not, though it might be considered as an information loss, is not a problem
according to this definition.
Like many other SGML practitioners, I've never considered the fact that CDATA
marked sections (or comments) would not be notified to be as important in
practice as the previous problem. (From an abstract point of view, marked
sections can be seen as a packing scheme allowing several "abstract trees"
to be delivered interleaved in a single file... and therefore are not
representable on a single abstract tree.) (Maybe this means I have been
thinking in an XML-oriented framework even before it was formalized...
which is probably why I supported the idea so readily!)
Anyway, I will not pursue (at least not here) this discussion which
probably deviates from the main purpose of this list. I provided some
precise information about ESIS because the subject was raised, but it is
clear that its only utility, in this discussion, is to serve as an example
of a normalised "event-stream based" interface between a parser and an
application, which could inspire more carefully designed interfaces in the
same style. A tool such as Balise, in its communication with SP, requires
more than ESIS...
The only important message, in all that I said in my previous e-mails, is
my conviction that a useful API should provide both event-stream *and*
tree-manipulation paradigms. It is true, to some extent, that you can build
one from the other, and that this could be done inside the application. But
implementing this duality *below* the API level offers big advantages, both
for maximum expressive power/flexibility ... and to save everybody from
reinventing the wheel.
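To make the "build one from the other" direction concrete, here is a minimal sketch in present-day Python of a tree being assembled from an event stream. It is only an illustration: `xml.sax` and `xml.etree.ElementTree` are standard-library stand-ins that postdate this thread, the names `TreeBuilderHandler` and `parse_to_tree` are invented for the example, and nothing here reflects how Balise or SP actually do it.

```python
import xml.sax
import xml.etree.ElementTree as ET


class TreeBuilderHandler(xml.sax.ContentHandler):
    """Consumes event-stream callbacks and assembles an in-memory tree."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # currently open elements, innermost last

    def startElement(self, name, attrs):
        node = ET.Element(name, dict(attrs.items()))
        if self._stack:
            self._stack[-1].append(node)
        else:
            self.root = node
        self._stack.append(node)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        parent = self._stack[-1]
        if len(parent):
            # Text that follows a child element is stored as that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + content
        else:
            # Text that precedes any child element is the parent's own text.
            parent.text = (parent.text or "") + content


def parse_to_tree(xml_text):
    """Run an event-stream parse and return the root of the resulting tree."""
    handler = TreeBuilderHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root
```

An application that only wants events can supply its own handler and never pay for the tree; one that wants the tree calls `parse_to_tree` and never sees the events. Putting both behind the same API is the point being argued above.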
[Peter Murray-Rust Sun, 02 Mar 1997 11:32:40]
> My own development depends to a significant extent on what API I can
> use after parsing. I want it to be very clearly separated, because I
> see a parser as being a 'bolt-in' tool rather than a component which
> drives the rest of the application. (Maybe this isn't possible, but it's
> worth trying for).
This is indeed possible and, in my opinion, it's even required. This is
how Balise is implemented, with respect to both SP (the SGML parser
module) and its new XML "well-formed document scanner" module. The
parsing module should be able to operate in "slave mode", and this should
be reflected at the API level (i.e. you need a primitive to trigger the
parsing of an SGML document or an XML fragment). This also means you need
the parser to be reentrant. That was not the case with sgmls, but it was
fixed with SP, and should not be too hard a requirement for the forthcoming
generation of XML parsers/scanners.
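A rough illustration of what "slave mode" plus reentrancy buys the application, again as a present-day Python sketch and not a description of Balise or SP. The single `parse` function plays the role of the "trigger a parse" primitive; the `include` element, its `ref` attribute, and the `fragments` table are all invented for the example.

```python
import xml.sax


class NameCollector(xml.sax.ContentHandler):
    """Records element names; on <include>, starts a second parse re-entrantly."""

    def __init__(self, fragments, depth=0):
        super().__init__()
        self.fragments = fragments  # hypothetical table: fragment name -> XML text
        self.depth = depth          # how many parses are currently nested
        self.names = []

    def startElement(self, name, attrs):
        self.names.append("  " * self.depth + name)
        if name == "include":
            # The application, not the parser, decides to launch another
            # parse while the outer one is still in progress.
            inner = NameCollector(self.fragments, self.depth + 1)
            parse(self.fragments[attrs["ref"]], inner)
            self.names.extend(inner.names)


def parse(xml_text, handler):
    """The one primitive the application calls to trigger a parse."""
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
```

Each call to `parse` creates its own parser instance, so the nested call does not disturb the state of the outer one. A parser that keeps its state in globals could not be used this way.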
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
_/ François CHAHUNEAU phone: [+33] 1 40 64 43 00 _/
_/ Directeur Général/General Manager _/
_/ AIS S.A. FAX: [+33] 1 40 64 43 10 _/
_/ 15-17 rue Rémy Dumoncel email: fcha@ais.berger-levrault.fr _/
_/ 75014, Paris, FRANCE WWW: http://www.berger-levrault.fr _/
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
From Peter at ursus.demon.co.uk Sun Mar 2 16:13:19 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:27 2004
Subject: Simple approaches to XML implementation
Message-ID: <4161@ursus.demon.co.uk>
Thank you Francois,
In message <199703021545.QAA12767@cygne.ais.berger-levrault.fr> "F. Chahuneau - AIS" writes:
> [F. Chahuneau, 01 Mar 1997]
[...]
> [Peter Murray-Rust Sun, 02 Mar 1997 11:32:40]
>
> > My own development depends to a significant extent on what API I can
> > use after parsing. I want it to be very clearly separated, because I
> > see a parser as being a 'bolt-in' tool rather than a component which
> > drives the rest of the application. (Maybe this isn't possible, but it's
> > worth trying for).
>
> This is indeed possible and, in my opinion, it's even required. This is
> how Balise is implemented, with respect to both SP (the SGML parser
> module) and its new XML "well-formed document scanner" module. The
> parsing module should be able to operate in "slave mode", and this should
> be reflected at the API level (i.e. you need a primitive to trigger the
> parsing of an SGML document or an XML fragment). This also means you need
> the parser to be reentrant. That was not the case with sgmls, but it was
> fixed with SP, and should not be too hard a requirement for the forthcoming
> generation of XML parsers/scanners.
This is extremely valuable. We appreciate that many groups will be developing
commercial applications that cannot be described in detail, but it's very
useful to know of the general strategies that are being or have been
developed successfully.
Parsing is clearly a process where it should be clear to the user what the
tool does and we shall need to be able to agree on terminology. I imagine
that some of the information above would form part of a technical manual
or developer's kit and that it should mean the same things to everyone.
There are terms that are not part of the XML spec, but are reasonably
associated with or included in an API because of the different ways of
processing XML documents.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Sun Mar 2 16:47:16 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:27 2004
Subject: ERB work on 3.* (Linking Elements) issues
Message-ID: <4163@ursus.demon.co.uk>
Tim,
I'd like you and the ERB to know how much I appreciate the work
the ERB is doing, and also that I think it's a very effective process.
Personally I'm happy to work with whatever comes out - I trust the ERB to
come out with the most workable solution that the associated brainpower
and experience can muster. [I think it's a credit that on xml-dev (which
is discussing how to implement PhaseI) no-one so far has suggested that
the spec got things wrong, or is ambiguously worded, or otherwise
unimplementable.]
In message <3.0.32.19970301183622.00b3fb54@pop.intergate.bc.ca> Tim Bray writes:
> The ERB has now put two meetings work in on this set of issues and is
> nowhere near done. Not surprising, given the importance of the issues.
> One of the factors holding us back a bit has been the fact that the
> discussion in the WG on the 3.* issues has been lacking in both volume
> and depth. Reasons for this might be (a) that the WG is tired (the
> ERB is), (b) that the WG is busy on other things, and (c) that the WG
> has substantially less experience in these issues than in those that
> came up in the XML language discussion.
I cannot answer for anyone else, but I am (c). [I think it's also
going to be a problem in PhaseIII.] I shall (I hope) have something to
say about addressing in PhaseII (I assume that's still to come).
From my own perspective as a web hacker, I can probably hack solutions to most
of the proposals so far, so what matters is whether:
(a) people outside the WG, outside SGML, will understand the result.
(b) any decision is more constraining than any other.
At present I am implementing the simplest approaches (HREF-like and IMG-like)
in JUMBO and can probably manage your next lot (with a struggle, and not
very efficiently, but that's not the point). As long as the rules are
clear, whether we have link information in attributes, GIs, contents or the
whole lot is probably manageable. It's more a question of whether confusion
will result.
[...]
As I mentioned on xml-dev I was talking to an important organisation in our
community who were very keen on XML, but 'hoped [the ERB/EG] didn't make it
too complicated'. In a sense, therefore, there are already two levels of
indirection - people like me have to understand it and then carry the message
to a wider community. If _they_ in turn have to educate staff, the system
needs to be fairly self-explanatory. Where possible, therefore, I will
cast a meta-vote in favour of the 'most obviously understandable solution
(without prior SGML/HyTime knowledge)'.
To this end, any short example documents illustrating your conclusions so
far would be extremely valuable. Essentially: 'This is what we are
suggesting: can you (a) understand what it is meant to do? (b) think it
can be implemented? (c) do everything that you want to do? (even if some
solutions creak a bit).' We could then try to feed back on these (more
concrete) documents.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From bosak at atlantic-83.Eng.Sun.COM Mon Mar 3 01:16:54 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:27 2004
Subject: Call for WWW6 demos
Message-ID: <199703030115.RAA27815@boethius.eng.sun.com>
As most of you know, the World Wide Web conference in Santa Clara
(April 7-11) is a major event for XML and DSSSL. The XML-link draft
spec will be announced, and the Web community will for the first time
be made aware that alternative delivery strategies for structured
documents are becoming a possibility. In addition to a report on the
SGML Activity in the W3C track during the conference itself, there is
going to be a full-day workshop on structured document delivery on the
Monday beginning the conference week and a full-day session on XML and
DSSSL during Developer's Day on the Friday of that week.
Another basic component of this introduction to structured document
alternatives will be a forum in which experimenters and companies
taking an early lead in XML or DSSSL can demonstrate their products
and projects. Early in conference planning I arranged with the WWW6
coordinators to hold open Thursday evening for a session that would
showcase current efforts for the relatively small but important subset
of conference attendees actively involved in Web development. Now
it's time to see just who intends to be there and what they will need
in the way of facilities so that an appropriate room assignment can be
made.
If you or your organization have an XML- or DSSSL-related product or
technology to present at the conference -- an XML parser, an XML
editor, a DSSSL browser, or what-have-you -- please send me a message
with the following information:
 - Name of organization (if any)
 - Contact information for responsible person
 - Description of product or technology for my information
 - Description of product or technology for public announcement (if
   different from above -- please be clear about what can and can't
   be stated publicly)
 - Facilities needed to demonstrate the technology (e.g., LCD
   projector, Internet connection)
 - Whether you or a person from your organization will be
   demonstrating the technology or whether you want someone else to
   demonstrate it in your absence (I can't make any guarantees in the
   latter case, but I'll see what can be arranged)
Try to have this information to me by Sunday evening, March 9, so that
I can make room arrangements and public announcements. If you contact
me after that time, I will make every effort to accommodate you, but
it may not be possible to fit you into the schedule. Unless you tell
me otherwise, I will assume that it is OK to include a description of
your product or project in public announcements and on the conference
Web site. If you are among the half-dozen people who indicated to me
earlier that you intended to have something to demonstrate, please
send me a message anyway to confirm your intention and provide the
information I need in a uniform format.
For maximum coverage I am posting this message to both the sgml-wg and
xml-dev lists. Please use the xml-dev list for followups, if any. I
will post a summary of what I've received to the xml-dev list at the
beginning of next week.
Jon
----------------------------------------------------------------------
Jon Bosak, Online Information Technology Architect, Sun Microsystems
----------------------------------------------------------------------
2550 Garcia Ave., MPK17-101, | Best is he that inuents,
Mountain View, California 94043 | the next he that followes
Davenport Group::SGML Open::ANSI X3V1 | forth and eekes out a good
::ISO/IEC JTC1/SC18/WG8::W3C SGML ERB | inuention.
----------------------------------------------------------------------
From gtn at ebt.com Mon Mar 3 14:05:31 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:27 2004
Subject: XML API specification
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703031402.JAA28079@nathaniel.ebt>
>>This kind of argument went on in VRML and was wisely rejected.
>>The commitment to a CORBA IDL is a commitment to a syntax for the spec
>>and not a lot else.
>
>If Gavin's information is correct (and I assume it to be so) this is false.
>IDL means that we get language-specific bindings for several languages
>including Java and C++, simply by applying an automated tool. So there are
>concrete technical advantages to using IDL, though we must apply those
>tools for the programmers, so that I don't have to find an IDL tool to use
>XML with my Java codebase.
JAVA, C, C++, ADA (and if you use ILU, a whole lot more)
>> The commitment to JAVA for implementation
>>is only a commitment to a slow language.
>
>Again, verifiably false. There is no reason that native-code Java compilers
>cannot exist. Languages aren't slow -- implementations are. Something you
>learn sometime in your first 2 years of college...
There is already an i86 native code compiler, and I hear that the
FSF is working on incorporating JAVA into GCC.
From richard at light.demon.co.uk Mon Mar 3 14:12:11 1997
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:57:27 2004
Subject: XML API spec
In-Reply-To: <4109@ursus.demon.co.uk>
Message-ID:
In message <4109@ursus.demon.co.uk>, Peter Murray-Rust
writes
>In message <5lx6vCAzGtFzEwII@light.demon.co.uk> Richard Light writes:
>[...]
>> Operation
>> ---------
>> Presumably the XML processor is a 'slave' to the application, and
>> only does what it's told to.
>
>I think that's right. OTOH it may be that it's possible to build a parser
>that only does one thing and that the application decides what use to make
>of the output. sgmls is rather like this - you either get the ESIS stream
>or nothing (except error messages :-).
I think that the "ESIS stream or nothing" case can (and should) still be
seen in API terms. Essentially such a parser can have a very simple API
with three commands:
"open this XML document/fragment"
"deliver me the whole tree structure in ESIS format"
"close this XML document/fragment"
Looking at it this way, I'm confident that the existing implementations
can be developed to have an 'API', and we'll be on our way.
The advantage of this approach is that it is easy to extend the command
set to match the capabilities of the parser. For example, if the parser
becomes capable of deciding whether or not to include marked sections or
comments in its "ESIS" stream, then the "deliver me the whole tree
structure in ESIS format" command can be refined to have arguments that
determine which features the user wants delivered. (And in fact, this
is exactly what a 'grove plan' is (as I understand it). "Give me
elements, attributes, external entities, data content." It's a pretty
obvious concept: shame about the air of mystery around it!)
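As a rough sketch of the three-command idea, with the "deliver" command taking arguments that say which features are wanted, something like the following would do. It is written in present-day Python on top of the standard `xml.parsers.expat` module purely for illustration; the class name `EsisSession`, the feature names, and the "!" event code for comments are all assumptions made for this example, not part of ESIS or of any existing parser.

```python
import xml.parsers.expat


class EsisSession:
    """Hypothetical three-command wrapper around a non-validating parser."""

    def __init__(self):
        self._text = None

    def open(self, xml_text):
        """Command 1: open this XML document/fragment."""
        self._text = xml_text

    def deliver(self, features=("attributes", "data")):
        """Command 2: return ESIS-like lines, limited to the named features."""
        if self._text is None:
            raise RuntimeError("no document is open")
        out = []
        parser = xml.parsers.expat.ParserCreate()
        parser.buffer_text = True  # deliver each run of text as one event

        def start(name, attrs):
            # Attribute lines precede the start-tag line, as in sgmls output.
            if "attributes" in features:
                for key, value in attrs.items():
                    out.append(f"A{key} CDATA {value}")
            out.append("(" + name)

        parser.StartElementHandler = start
        parser.EndElementHandler = lambda name: out.append(")" + name)
        if "data" in features:
            parser.CharacterDataHandler = lambda text: out.append("-" + text)
        if "comments" in features:
            # "!" is an invented event code used only in this sketch.
            parser.CommentHandler = lambda text: out.append("!" + text)
        parser.Parse(self._text, True)
        return out

    def close(self):
        """Command 3: close this XML document/fragment."""
        self._text = None
```

Extending the command set then means adding a feature name and one more handler, which is about as much mystery as a grove plan needs to have.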
The other big issue to resolve at this stage is in what format the
parser ("XML processor") should deliver information to the application.
And that leads us (me, anyway!) to consider the "division of labour"
issue. ESIS gives us a rather strange precedent, which perhaps we
shouldn't take too much as gospel, even if we are all very used to it.
In the most general terms, the parser ("XML processor") has to deliver
information about the XML document to the application. In ESIS a
sequence of textually-represented tokens indicate parsing events from
which an application can deduce the tree structure that is the XML
document: element start, element end, data content, new line in source
file, e.g.:
(SOURCEDESC
AID IMPLIED
AN IMPLIED
ALANG IMPLIED
AREND IMPLIED
ATEIFORM CDATA p
(P
-Generated from ASCII file by an OmniMark script
)P
)SOURCEDESC
L8
This approach means that the application has to stay on its toes if it
wants to get the structure right. And, fundamentally, it means that the
_application_ has the job of building the tree, whether it wants/needs
to or not.
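To show what "staying on its toes" amounts to, here is a minimal sketch of the tree-building work an application has to do on a stream like the one above. It is present-day Python written for illustration only; `build_tree` and the dictionary layout are invented names, and only the four event codes seen in the sample ("A", "(", ")", "-") are handled.

```python
def build_tree(esis_lines):
    """Rebuild a nested structure from sgmls-style ESIS output lines."""
    root = None
    stack = []          # open elements, innermost last
    pending_attrs = {}  # attribute lines apply to the NEXT start-tag line
    for line in esis_lines:
        code, rest = line[:1], line[1:]
        if code == "A":
            name, kind, *value = rest.split(" ", 2)
            # IMPLIED attributes carry no value; record them as None.
            pending_attrs[name] = value[0] if value else None
        elif code == "(":
            node = {"name": rest, "attrs": pending_attrs, "children": []}
            pending_attrs = {}
            if stack:
                stack[-1]["children"].append(node)
            else:
                root = node
            stack.append(node)
        elif code == ")":
            if not stack or stack[-1]["name"] != rest:
                raise ValueError(f"unexpected end of element {rest!r}")
            stack.pop()
        elif code == "-":
            stack[-1]["children"].append(rest)
        # Any other code (e.g. the "L" line-number records) is ignored here.
    return root
```

Note the bookkeeping: the attribute lines arrive before the element they belong to, so the application has to hold them until the next "(" turns up.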
In the SGML world, this is perhaps a reasonable division of labour,
since the parser has already done a lot of work for the application by
inferring omitted end-tags, shortrefs, etc, etc. However, the whole
point of XML is to _remove_ all of this complexity.
I would therefore argue that in the XML world it is reasonable to ask
the parser ("XML processor") to do a bit more: to "build the tree" and
then let the application cherry-pick the bits it requires.
Having resolved that (which we haven't - comments please!), we still have
the delivery issue. I think a valuable aspect of the ESIS approach is
that the output is textual in nature. In principle, we could have a
sequence of (binary) "objects" with "properties", splurging out of the
parser, but to do so would in my view limit the usefulness of the output
to a specific application environment. Bad thing!
So, what does our "textual" output look like? As I said above, ESIS is
a rather strange precedent. It uses a set of conventions all of its
own:
- a newline for certain events (but not for all);
- first character of the line is an ESIS-specific code for the event
type ("(" = start-tag; "-" = data content, etc.);
- character entities represented (uselessly) by their mapping;
- etc.
A much simpler approach, which I _think_ is what would happen in a
DSSSL-style transformation, is for the parser simply to output tidied-up
XML. In which case, you might ask, what the heck is the parser doing
for us? To which I would reply "about the same as what ESIS is doing
for you!"
The value of the parser will be apparent once it is able to filter out
and deliver:
- exactly those properties of the XML that the application is
interested in;
- any required subtree from the full document
>> View of the XML document
>> ------------------------
>> What does the application 'see' of the XML document it has asked the
>> XML processor to open? The spec implies that it should have pretty
>> direct access, e.g.:
>>
>> "An XML processor must inform the application of the length of
>> comments if they are not passed through, to enable the application
>> to keep track of the correct location of objects in the XML
>> document."
>>
>> This fills me with gloom. Shouldn't there be a level of abstraction
> ^^^^^^^^^^
>It would fill me with gloom _if I had to write the parser_ :-). If someone
>else has done this, and didn't mind doing it, and if the result made it
>trivial to discard comments (or other information) then it's not a problem.
Sorry, Peter, I didn't make my point clearly. The "gloom" related to
the phrase: "... to enable the application to keep track of the correct
location of objects in the XML document". In my view of things, the
application should _never_ have, or need, direct access to the actual
XML document. It should get _everything_ it needs through the API.
In the context of an editing application, where one might think the
application needed to "poke" new or changed data directly into the XML
document, I would argue that the parser should be performing a read-only
operation on the source XML. If an editor is letting the user make
changes, it is on an _in-memory copy_ of the source document (which
clearly, as several of us have noted, needs to be a complete copy).
When the user of the XML editing application decides to save their
edited result, they will be overwriting the source XML document on disc
with their in-memory copy, just as you do with any word processor.
There is no need for the parser ("XML processor") to be involved in this
stage of the process at all.
Richard Light.
From gtn at ebt.com Mon Mar 3 14:38:44 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <4143@ursus.demon.co.uk> (Peter@ursus.demon.co.uk)
Message-ID: <199703031436.JAA28091@nathaniel.ebt>
>I also understand and support the IDL approach if it really will produce
>Java, C++, etc APIs automatically. *The result of this automatic
>conversion must, however, be understandable by humans*.
The C++/JAVA bindings tend to be quite easy to read (basically
just proxying objects).
From cbullard at hiwaay.net Mon Mar 3 14:43:00 1997
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 16:57:28 2004
Subject: XML API specification
References: <199703031402.JAA28079@nathaniel.ebt>
Message-ID: <331AE0C4.1A1E@hiwaay.net>
Gavin Nicol wrote:
>
> >>This kind of argument went on in VRML and was wisely rejected.
> >>The commitment to a CORBA IDL is a commitment to a syntax for the spec
> >>and not a lot else.
> >
> >If Gavin's information is correct (and I assume it to be so) this is false.
> >IDL means that we get language-specific bindings for several languages
> >including Java and C++, simply by applying an automated tool. So there are
> >concrete technical advantages to using IDL, though we must apply those
> >tools for the programmers, so that I don't have to find an IDL tool to use
> >XML with my Java codebase.
>
> JAVA, C, C++, ADA (and if you use ILU, a whole lot more)
Again, "What it means to the spec". Available tools are the next level.
Groves to IDL to Whatever is still the food chain. Committing directly
to Java is what is wrong in the previously posted suggestion. As David
says, "we are in raging agreement". Unless we leave the API adaptable
to other languages, we lose too many well-known and practiced optimization
advantages. So, C, C++, yes even ADA, are still possibilities.
> >> The commitment to JAVA for implementation
> >>is only a commitment to a slow language.
> >
> >Again, verifiably false. There is no reason that native-code Java compilers
> >cannot exist. Languages aren't slow -- implementations are. Something you
> >learn sometime in your first 2 years of college...
>
> There is already an i86 native code compiler, and I hear that the
> FSF is working on incorporating JAVA into GCC.
Glad to hear it. Have you ever read the FAR and its regulations for
using commercial software? These don't matter to academic development
efforts, but to the commercial software business they are of some
importance. So, forgive me if I keep pushing toward the commercial
requirements. Java is fine. FSF is food for the hungry.
There are alternatives that must be considered.
IDL looks to be the best candidate for the implementors. I think a grove
definition provides good spec language and makes it easier to align XML
with the Technical Corrigendums from WG8. Let each party read the
verbiage that works best for them.
len
From richard at light.demon.co.uk Mon Mar 3 16:07:36 1997
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To:
Message-ID:
In message , Toby Speight
writes
>IMO, the application should be able to decide (preferably at compile
>time) whether it is interested in comments etc. We want to enable the
>creation of small, efficient applications as well as highly capable
>ones; I suggest an approach of providing lots at the parser, but
>providing filtering down to ESIS by default.
Bear in mind that, no matter how much or how little the application is
interested in, the parser has to chew its way sequentially through the
whole darned XML document. In a way, the only efficiency issue is how
much it "remembers" en route. And whether it can stop because it knows
it has found the element, entity reference, or whatever, that the
application told it to sniff out.
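The "stop once it has found it" case can be sketched as follows, using present-day Python's `xml.sax` as a stand-in parser. The names `FirstMatch`, `find_first` and the private `_Found` exception are invented for this illustration; raising an exception from a callback is simply one way of telling this particular parser to stop.

```python
import xml.sax


class _Found(Exception):
    """Raised from inside a callback to abandon the rest of the parse."""


class FirstMatch(xml.sax.ContentHandler):
    """Remembers nothing except the first element with the wanted name."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.seen = 0      # how many start-tags were looked at
        self.attrs = None  # attributes of the match, if any

    def startElement(self, name, attrs):
        self.seen += 1
        if name == self.wanted:
            self.attrs = dict(attrs.items())
            raise _Found


def find_first(xml_text, wanted):
    """Return (attributes of first match or None, start-tags examined)."""
    handler = FirstMatch(wanted)
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), handler)
    except _Found:
        pass  # the parser stopped early, which is what we asked for
    return handler.attrs, handler.seen
```

Here the parser "remembers" nothing at all and no callbacks are made after the match, although everything before the match still has to be chewed through sequentially.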
Richard Light
SGML and Museum Information Consultancy
richard@light.demon.co.uk
3 Midfields Walk
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067
From gtn at ebt.com Mon Mar 3 17:35:26 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: (message from Richard Light on Mon, 3 Mar 1997 11:09:30 +0000)
Message-ID: <199703031732.MAA28163@nathaniel.ebt>
>Bear in mind that, no matter how much or how little the application is
>interested in, the parser has to chew its way sequentially through the
>whole darned XML document. In a way, the only efficiency issue is how
>much it "remembers" en route. And whether it can stop because it knows
>it has found the element, entity reference, or whatever, that the
>application told it to sniff out.
At least, unlike SGML, an XML parser can delay entity resolution.
From digitome at iol.ie Mon Mar 3 19:32:20 1997
From: digitome at iol.ie (Digitome Ltd.)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703031932.TAA20716@mail.iol.ie>
[Tim Bray]
>Whereas I agree with the rest of Francois' contribution, this paragraph
>is not quite right. If you change "ESIS event stream" to "Instance
>character stream", then it would be more correct. But in fact the
>SGML->SGML declaration was not one of our goals; for example, the
>processor is not required to tell the app about [at least] comments
>and merely, in a very abstract way, what the processor has to give the
>application.
>
>If either of these problems (the impossibility of SGML->SGML or the
>absence of an ESIS equivalent) is a big huge flaw in XML, there's still
>time to fix it. The SGML->SGML problem is probably a job for the
>XML WG. The ESIS issue is perhaps a job for this list. I personally
>think an API is better than an ESIS [even if the ESIS were properly
>defined] anyhow.
I am a big fan of SGML->SGML and would like to see the ESIS powerful
enough to allow it at some level (i.e. even with certain restrictions is
better than not at all). I think it is important that the API - whatever
form it finally takes - is not merely an API for *rendering XML*. What about
all the XML processing apps that will be slurping XML prior to any
XML publishing?
Another point in favour of an ESIS rather than a function/method API is that
the ESIS approach is automatically bi-directional, i.e. it can be used to
create XML as well as process it.
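A minimal sketch of the "create" direction, turning an ESIS-like line stream back into XML text. It is present-day Python using the standard `xml.sax.saxutils.XMLGenerator` for escaping and serialisation; the function name `esis_to_xml` is invented, and only the "A", "(", ")" and "-" codes are handled, with IMPLIED attributes dropped.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def esis_to_xml(esis_lines):
    """Serialise a stream of ESIS-like lines as XML text."""
    buffer = io.StringIO()
    gen = XMLGenerator(buffer, encoding="utf-8")
    pending = {}  # attribute lines apply to the next start-tag line
    for line in esis_lines:
        code, rest = line[:1], line[1:]
        if code == "A":
            name, kind, *value = rest.split(" ", 2)
            if kind != "IMPLIED" and value:
                pending[name] = value[0]
        elif code == "(":
            gen.startElement(rest, AttributesImpl(pending))
            pending = {}
        elif code == ")":
            gen.endElement(rest)
        elif code == "-":
            gen.characters(rest)  # markup characters are escaped for us
    return buffer.getvalue()
```

Any program that can print lines in this format can therefore produce XML without knowing anything about escaping rules, which is the attraction of the textual stream.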
Sean
Sean Mc Grath
digitome@iol.ie
Digitome Electronic Publishing
Developers of IDM - Next Generation SGML Transformation Technology
http://www.screen.ie/digitome
From richard at light.demon.co.uk Tue Mar 4 10:33:08 1997
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703031932.TAA20716@mail.iol.ie>
Message-ID:
In message <199703031932.TAA20716@mail.iol.ie>, "Digitome Ltd."
writes
>I am a big fan of SGML->SGML and would like to see the ESIS powerful
>enough to allow it at some level (i.e. even with certain restrictions is
>better than not at all). I think it is important that the API - whatever
>form it finally takes - is not merely an API for *rendering XML*. What about
>all the XML processing apps that will be slurping XML prior to any
>XML publishing?
>
>Another point in favour of an ESIS rather than a function/method API is that
>the ESIS approach is automatically bi-directional, i.e. it can be used to
>create XML as well as process it.
I think we need both. Surely the API is the set of commands, switches,
etc. which the application can use to control the behaviour of the XML
processor and issue requests to it, while the "ESIS" is the well-
understood format in which the XML processor serves up the requested
results to the application?
Is it fair to say that the XML API is functionally equivalent to the
command line arguments in NSGMLS, while the "XML ESIS" is (more
obviously) equivalent to the ESIS output by NSGMLS? That's how I tend
to see it.
The advantage of an API over an NSGMLS-style command line is that you
can have any number of bites at the cherry, retrieving relevant bits of
the XML document each time. For example, a browsing app might start by
requesting only the element structure for the whole document (to fill an
'outline' window), then go back and ask for content for the first few
elements until it had enough to fill a 'data window'.
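As a sketch of the browsing scenario, assuming the processor keeps the parsed document around between requests, the two kinds of "bite" might look like this. It is present-day Python on top of `xml.etree.ElementTree`; the class name `DocumentView` and its two methods are invented for the illustration.

```python
import xml.etree.ElementTree as ET


class DocumentView:
    """Hypothetical session object answering repeated requests on one document."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)  # parsed once, queried many times

    def outline(self):
        """First bite: element structure only, as (depth, name) pairs."""
        result = []

        def walk(node, depth):
            result.append((depth, node.tag))
            for child in node:
                walk(child, depth + 1)

        walk(self._root, 0)
        return result

    def content(self, path):
        """Later bites: text of the first element matching an ElementTree path."""
        node = self._root.find(path)
        return None if node is None else "".join(node.itertext())
```

The outline request fills the 'outline' window without touching any data content; the app then calls `content` only for the elements it is about to display.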
Richard Light
SGML and Museum Information Consultancy
richard@light.demon.co.uk
3 Midfields Walk
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067
From housel at ms7.hinet.net Tue Mar 4 16:04:52 1997
From: housel at ms7.hinet.net (Peter S. Housel)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703041556.XAA06348@ms7.hinet.net>
Here's what I'm looking for in an XML/SGML API.
First I want an entry point that allows the application to ask
the parser implementation which property set modules it supports
(i.e., what's the richest grove plan available to users), and whether
or not validation is available.
Next, there should be an XMLEventStream object. To create an
XMLEventStream, you specify:
1. The URL of the document to open.
2. The grove plan you want, in the form of a list of property set modules.
3. Whether or not you want validation done.
This gives a stream-based interface to the XML document. By default, the
grove plan would be {baseabs, prlgabs0, instabs}, which gives you a stream
of XMLEvent objects that corresponds almost exactly to ESIS.
If you ask for more modules than that, XMLEventStream will give you a
richer set of objects, a stream including such things as the contents of
the DTD, comments, or whatever you like.
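Roughly, and only as a sketch in present-day Python, the capability query and the stream object might look like this. Several things are assumptions made to keep the example self-contained: the stream takes document text rather than a URL, events are plain (kind, payload) tuples rather than XMLEvent objects, "x-comments" is an invented module name standing in for whichever property set module would carry comments, and the underlying `xml.parsers.expat` parser cannot validate.

```python
import xml.parsers.expat

DEFAULT_PLAN = frozenset({"baseabs", "prlgabs0", "instabs"})
# "x-comments" is an invented module name used only in this sketch.
SUPPORTED_MODULES = DEFAULT_PLAN | {"x-comments"}


def capabilities():
    """Entry point: report what this implementation can offer."""
    return {"modules": SUPPORTED_MODULES, "validation": False}


class XMLEventStream:
    """Iterable of (kind, payload) events for one document."""

    def __init__(self, source_text, grove_plan=DEFAULT_PLAN, validate=False):
        unknown = set(grove_plan) - SUPPORTED_MODULES
        if unknown:
            raise ValueError(f"unsupported modules: {sorted(unknown)}")
        if validate:
            raise NotImplementedError("this sketch has no validating parser")
        self._text = source_text
        self._plan = frozenset(grove_plan)

    def __iter__(self):
        # For brevity the events are collected eagerly, then iterated.
        events = []
        parser = xml.parsers.expat.ParserCreate()
        parser.buffer_text = True
        parser.StartElementHandler = lambda n, a: events.append(("start", (n, a)))
        parser.EndElementHandler = lambda n: events.append(("end", n))
        parser.CharacterDataHandler = lambda t: events.append(("data", t))
        if "x-comments" in self._plan:
            parser.CommentHandler = lambda t: events.append(("comment", t))
        parser.Parse(self._text, True)
        return iter(events)
```

With the default plan the comment is silently dropped; asking for the extra module makes it show up in the stream, with no other change to the calling code.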
As a layer above that, there should be a grove-based interface that takes
the stream and turns it into a grove. Once built, the grove can be examined
using an interface similar to SDQL. As someone has already noted, there
should be concrete subclasses of the Node class, but the property-getting
interface should work whether the property is stored in a list of
properties or in a special instance slot.
I'm very fond of the idea of mapping documents of various MIME types onto
XML documents. The translator could work as an XMLEventStream, making the
grove-building machinery common to all document types.
Still higher application-specific layers could be built easily.
Am I way off base here? I know this is the kind of interface that would
make me happy...
-Peter S. Housel- housel@ms7.hinet.net
From dgd at cs.bu.edu Tue Mar 4 16:50:52 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To:
References: <199703031932.TAA20716@mail.iol.ie>
Message-ID:
At 9:55 AM +0000 3/4/97, Richard Light wrote:
>I think we need both. Surely the API is the set of commands, switches,
>etc. which the application can use to control the behaviour of the XML
>processor and issue requests to it, while the "ESIS" is the well-
>understood format in which the XML processor serves up the requested
>results to the application?
ESIS and the text format that sgmls (and now SP) serve up are different
things. The ESIS is an informal, non-normative definition of information
that an SGML application can see. The text format is one way to transmit
that information.
I am with the rest on requiring the potential for XML->XML transformation.
That is one reason I pressed to have no insignificant whitespace: it's only
the parts of the document that you can't see that can bite you.
Personally, I think we need an API with more power than ESIS, and
secondarily should strongly consider a tree-style representation that can
be optionally produced.
>Is it fair to say that the XML API is functionally equivalent to the
>command line arguments in NSGMLS, while the "XML ESIS" is (more
>obviously) equivalent to the ESIS output by NSGMLS? That's how I tend
>to see it.
I think that the API includes 1 call for each kind of information that can
pass between the parser and the application, and _also_ an interface for
setting options.
>The advantage of an API over an NSGMLS-style command line is that you
>can have any number of bites at the cherry, retrieving relevant bits of
>the XML document each time.
This is the advantage of a parse tree-style representation. With simple
callbacks it is likely to be too slow: re-parsing documents moves too much
data to be attractive unless you're way memory limited.
It's actually a good argument for a way to request that a stored tree be
traversed to produce callbacks just as if a parse were being created.
> For example, a browsing app might start by
>requesting the only element structure for the whole document (to fill an
>'outline' window), then go back and ask for content for the first few
>elements until it had enough to fill a 'data window'.
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From gtn at ebt.com Tue Mar 4 17:35:53 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703041733.MAA04237@nathaniel.ebt>
>Personally, I think we need an API with more power than ESIS, and
>secondarily should strongly consider a tree-style representation that can
>be optionally produced.
Agreed.
>I think that the API includes 1 call for each kind of information that can
>pass between the parser and the application, and _also_ an interface for
>setting options.
I would tend toward an event-driven interface, and an
option-setting interface as the core parser API. For example:
  class XMLEventHandler {
      public boolean OnComment(String comment);
      public boolean OnElementStart(...)
      ....
  }
  class XMLParser {
      ...
      parser(XMLEventHandler handler);
      ...
  }
I have some code now that does this, and it works very well.
>It's actually a good argument for a way to request that a stored tree be
>traversed to produce callbacks just as if a parse were being created.
One kind of handler I have is one that builds a tree: it also
happens to implement the XMLEventGenerator interface so that
I can use it to feed an event handler.
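A rough Java sketch of such a handler, not the code referred to above. The interface names XMLEventHandler and XMLEventGenerator follow the message, but the method signatures, the boolean return convention, and the helper classes SketchTreeNode, TreeBuildingHandler and MarkupRecorder are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

interface XMLEventHandler {
    boolean onElementStart(String name);
    boolean onElementEnd(String name);
    boolean onText(String text);
}

interface XMLEventGenerator {
    void generate(XMLEventHandler handler);
}

final class SketchTreeNode {
    final String name;  // element name, or null for a text node
    final String text;  // character data, or null for an element
    final List<SketchTreeNode> children = new ArrayList<>();

    SketchTreeNode(String name, String text) {
        this.name = name;
        this.text = text;
    }
}

/** Builds a tree from events and can replay the tree as events. */
final class TreeBuildingHandler implements XMLEventHandler, XMLEventGenerator {
    private final SketchTreeNode root = new SketchTreeNode("#root", null);
    private final Deque<SketchTreeNode> open = new ArrayDeque<>();

    TreeBuildingHandler() {
        open.push(root);
    }

    public boolean onElementStart(String name) {
        SketchTreeNode node = new SketchTreeNode(name, null);
        open.peek().children.add(node);
        open.push(node);
        return true;
    }

    public boolean onElementEnd(String name) {
        open.pop();
        return true;
    }

    public boolean onText(String text) {
        open.peek().children.add(new SketchTreeNode(null, text));
        return true;
    }

    public void generate(XMLEventHandler handler) {
        for (SketchTreeNode child : root.children) {
            replay(child, handler);
        }
    }

    private static void replay(SketchTreeNode node, XMLEventHandler handler) {
        if (node.name == null) {
            handler.onText(node.text);
            return;
        }
        handler.onElementStart(node.name);
        for (SketchTreeNode child : node.children) {
            replay(child, handler);
        }
        handler.onElementEnd(node.name);
    }
}

/** Test helper: records the events it receives as simple markup. */
final class MarkupRecorder implements XMLEventHandler {
    final StringBuilder log = new StringBuilder();

    public boolean onElementStart(String name) { log.append('<').append(name).append('>'); return true; }
    public boolean onElementEnd(String name) { log.append("</").append(name).append('>'); return true; }
    public boolean onText(String text) { log.append(text); return true; }
}
```

Because the tree builder is itself a generator, any handler written against the parser can be driven from a stored tree without re-parsing.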
From peter at techno.com Tue Mar 4 17:48:28 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703041556.XAA06348@ms7.hinet.net> (housel@ms7.hinet.net)
Message-ID: <199703041745.MAA11686@exocomp.techno.com>
> From: "Peter S. Housel"
> Date: Wed, 5 Mar 1997 00:05:30 +0800
>
> Here's what I'm looking for in an XML/SGML API.
>
> First I want an entry point that allows the application to query
> the parser implementation what property set modules it supports
> (i.e., what's the richest grove plan available to users), and whether
> or not validation is available.
This should be implemented both as a set of API methods (for tightly
coupled applications) and as a machine-readable (SGML) system
declaration document (for remote and/or code-incompatible
applications), and (for SGML/HyTime systems) I think should also
include information about what storage managers, architecture engines,
and other notation processors (for multimedia addressing) are
available.
> Next, there should be an XMLEventStream object. To create an
> XMLEventStream, you specify:
>
> 1. The URL of the document to open.
> 2. The grove plan you want, in the form of a list of property set modules.
> 3. Whether or not you want validation done.
>
> This gives a stream-based interface to the XML document. By default, the
> grove plan would be {baseabs, prlgabs0, instabs}, which gives you a stream
> of XMLEvent objects that corresponds almost exactly to ESIS.
>
> If you ask for more modules than that, XMLEventStream will give you
> a richer set of objects, a stream including such things as the
> contents of the DTD, comments, or whatever you like.
>
> As a layer above that, there should be a grove-based interface that
> takes the stream and turns it into a grove. Once built, the grove
> can be examined using an interface similar to SDQL. As someone has
> already noted, there should be concrete subclasses of the Node
> class, but the property-getting interface should work whether the
> property is stored in a list of properties or in a special instance
> slot.
Agreed. However I do not think that an API specification should
dictate whether the grove is built from the event stream or the event
stream from the grove; I would regard that as an implementation issue
since some applications may choose to store documents as character
streams and others as groves (or collections of objects similar to
groves). The important thing is that both APIs (event stream and
grove) be provided.
> I'm very fond of the idea of mapping documents of various MIME types
> onto XML documents. The translator could work as an XMLEventStream,
> making the grove-building machinery common to all document types.
>
> Still higher application-specific layers could be built easily.
I'm not sure exactly what you mean by "mapping documents of various
MIME types onto XML documents" though I would be interested to know.
I do believe, however, that an API designed along these lines would
make a wide variety of applications both possible and relatively easy
to implement.
> Am I way off base here? I know this is the kind of interface that would
> make me happy...
What you suggest is _exactly_ in line with what I want and have been
developing.
-peter
--
Peter Newcomb TechnoTeacher, Inc.
233 Spruce Avenue P.O. Box 23795
Rochester, NY 14611-4041 USA Rochester, New York 14692-3795 USA
+1 716 464 8696 (home) +1 716 464 8696 (direct)
+1 716 755 8698 (cell) +1 716 271 0796 (main)
+1 716 529 4304 (fax) +1 716 271 0129 (fax)
peter@petes-house.rochester.ny.us peter@techno.com
http://www.petes-house.rochester.ny.us http://www.techno.com
From ddb at criinc.com Tue Mar 4 18:32:10 1997
From: ddb at criinc.com (Derek Denny-Brown)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
Message-ID: <3.0.32.19970304102700.0091a190@mailhost.criinc.com>
At 12:05 AM 3/5/97 +0800, Peter S. Housel wrote:
>First I want an entry point that allows the application to query
>the parser implementation what property set modules it supports
>(i.e., what's the richest grove plan available to users), and whether
>or not validation is available.
>
>Next, there should be an XMLEventStream object. To create an
>XMLEventStream, you specify:
>
>1. The URL of the document to open.
>2. The grove plan you want, in the form of a list of property set modules.
>3. Whether or not you want validation done.
I would much rather abstract the document source, or talk about a
'Provider' or something like that. It is useful to be able to just hand
the parser a URL, but that is not the only way I want to pass it documents,
and I may want it to use my URL handling code (for a shared cache) etc. It
makes it more complicated but it increases the flexibility tremendously. I
agree that a grove plan is a good way to talk about event options, but it
is not the best way to pass them around. I think there should be a (set of)
object(s) which describe the 'grove plan' and instruct the parser what it
should hand off to the event handler. The parser is then free to inform
the event handler/application that it does not support certain operations.
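To make the 'Provider' idea concrete, here is a small Java sketch. The names DocumentProvider, UrlProvider, StringProvider and ProviderDemo are invented for illustration; the only point taken from the message is that the parser asks an object for its input rather than being handed a URL string.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Anything that can hand the parser a byte stream. */
interface DocumentProvider {
    InputStream open() throws IOException;
}

/** The simple case: fetch the document from a URL. */
final class UrlProvider implements DocumentProvider {
    private final URL url;

    UrlProvider(URL url) {
        this.url = url;
    }

    public InputStream open() throws IOException {
        // An application with a shared cache would substitute its own lookup here.
        return url.openStream();
    }
}

/** A document already held in memory. */
final class StringProvider implements DocumentProvider {
    private final String text;

    StringProvider(String text) {
        this.text = text;
    }

    public InputStream open() {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }
}

final class ProviderDemo {
    /** Stand-in for a parser: it only sees the provider, never the source. */
    static int countBytes(DocumentProvider provider) {
        try (InputStream in = provider.open()) {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The consumer in ProviderDemo works unchanged whether the bytes come from the network, a cache, or a string built in memory.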
At 12:33 PM 3/4/97 -0500, Gavin Nicol wrote:
>>I think that the API includes 1 call for each kind of information that can
>>pass between the parser and the application, and _also_ an interface for
>>setting options.
>
>I would tend toward an event-driven interface, and an
>option-setting interface as the core parser API. For example:
>
> class XMLEventHandler {
> public boolean OnComment(String comment);
> public boolean OnElementStart(...)
> ....
> }
>
> class XMLParser {
> ...
> parser(XMLEventHandler handler);
> ...
> }
This would be the best (performance wise) way to do this. It works for SP
and I found it very easy to deal with in that environment.
-derek
--------------------------------------------------------------
ddb@criinc.com || software-engineer || www/sgml/java/perl/etc.
From dgd at cs.bu.edu Tue Mar 4 18:50:51 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:28 2004
Subject: API thoughts...
Message-ID:
I was thinking that my earlier comments have been a bit too abstract, and
Richard's post got me thinking about what kinds of calls we might like to
have ... so I'm going to post some incomplete Java declarations that
express the kind of protocol that I'm suggesting. Details are not the issue
here, but the overall structure is what I'm proposing. Norbert's parser is
rather similar to this in some respects, as far as I've seen (unpacked +
browsed source, but not executed yet).
/** An XMLParser can be constructed with explicit or default options, and
always takes an XMLBuilder as an argument. The XMLBuilder is an interface
that implements a callback for each significant event that the parser
detects.
I think we should also provide a dummy starter class that implements
XMLBuilder, and implements null operations on each event -- otherwise we're
making implementors perform a typing exercise for events they don't care
about.
*/
public interface XMLParser {
/** You can't put constructors in an interface, but the idea should be
clear. I'm not sure how Java maps to IDL anyway... */
Parser(XMLBuilder builder); // make a Parser to callback builder with
// all options set to defaults
Parser(XMLBuilder builder, XMLOptions options);
// here we also set the options
public void parse(String url_start); // start parsing a document
public void parse(InputStream input); // start parsing a document
// Methods like this may require a base URL argument. They also
// might not make sense in the public interfaces...
public void set_options(XMLOptions new_options); // change parsing options
// One way to handle entity resolution is to make it part of the
// XMLBuilder API, but it may be better to instead have a method like the
// following ... and of course a new "protocol" object to encapsulate the
// operations.
public void set_entity_resolver(XMLResolver resolver);
// Set external resolution strategy
}
/** If you pass an XMLTreeBuilder to an XMLParser it will create an
XMLDocumentTree object, and return it to you, letting you keep the results
of a parse. */
public class XMLTreeBuilder implements XMLBuilder {
public XMLDocumentTree product(); // return the built tree after a parse.
/* ... XMLBuilder operations omitted ... */
}
/** An XMLDocumentTree should be the start of a nest of document
representation classes. I don't have many special ideas here, and you all
probably have a better idea about how it should work than I do.
My one idea, is that it should be able to drive a Builder just the same
way that a parser does.
I'm not sure whether we should be providing classes like this, or if
everything should be an interface....
*/
public class XMLDocumentTree {
/** This method takes options, and runs a builder over the document
tree calling the builder for the virtual events found during traversal. Can
be useful, if you want to build several different views of a document,
without building them all in a single pass. */
public void traverse(XMLBuilder builder); // traverse the tree with
// standard options
public void traverse(XMLBuilder builder, XMLOptions options);
// traverse with specified options.
public XMLDocumentElement access_TEI_location(String TEIpointer);
// We probably won't make methods like this part of the
// public interface
/* actual data access methods to be determined...
I see two main approaches to creating the data access methods:
1. to create a bunch of particular objects Element, Attribute, etc. and
allow looking at them directly. This does make for a rather fat interface,
and a lot of objects. In some contexts this is good (low object coupling),
in others, bad (currently applets pay a high price for using many classes,
and this will take at least a year to improve).
2. Create a general node object that can represent an element or
attribute or entity, etc, and use a general protocol to explicitly test and
act on node types, and to traverse. This is essentially the grove model, as
I understand it. The disadvantage is that it's not very concrete, and so
it's harder to understand. You also lose the ability to use type-based
dispatching if your programming style favors it -- you have to test the
generic nodes yourself.
Either of these models is good, but we need to examine the tradeoffs
much more carefully and explicitly.
*/
}
/** Simple class that holds flags and other options for an XML parse or
tree traversal. Default values are made by initialization and can be
overridden by subclassing and overriding, or by simply assigning values. */
public class XMLOptions {
// There should be flags for each individual type of event. Since that will
// be a lot of flags, we should consider having some flags that lump together
// frequently occurring options, e.g.:
public boolean visit_elements = true; // Visit elements
public boolean element_start = true; // element open events
public boolean element_end = true; // element close events
public boolean expand_external_entities = true; // Should external entities
// be automatically expanded?
// ....
}
public interface XMLBuilder {
// I've included a DocumentPosition for each item that has content.
// This is for full-text indexers, and the like.
public void start_element(String name); // an element began
public void attribute(String name, String value,
AttributeDeclarationInfo attinfo); // attinfo may be null if the DTD
// was not parsed, or the parser was requested to
// discard such information.
public void internal_entity_reference(String name, String value,
String type);
// Some applications will need to know this for XML->XML
// transformation. It's also useful since we no
// longer have SDATA
public boolean external_entity_reference(String name, String value,
String type, String notation_name);
// The boolean return could be used to allow case-by-case
// decisions on whether or not to expand the entity in line.
// This is the alternative to making it just a global
// option.
// If an XMLDocument gets a request to parse an unparsed
// external entity, it should create and invoke a new parser
// with the options that it was originally created with, and
// then resume traversing the new items (added to its tree).
/* ... etc. ... */
}
Just a sketch of the kind of API that I'd like to integrate with.
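The "dummy starter class" mentioned at the top of the declarations could be sketched as follows. This is an illustration under assumptions: XMLBuilderSketch is a cut-down stand-in for XMLBuilder, end_element and character_data are guessed method names not given in the message, and NullBuilder and ElementCounter are invented.

```java
/** Cut-down stand-in for the XMLBuilder interface. */
interface XMLBuilderSketch {
    void start_element(String name);
    void end_element(String name);      // assumed; not spelled out in the message
    void character_data(String data);   // assumed; not spelled out in the message
}

/** Starter class: every event is a null operation. */
class NullBuilder implements XMLBuilderSketch {
    public void start_element(String name) { }
    public void end_element(String name) { }
    public void character_data(String data) { }
}

/** An application overrides only the events it cares about. */
class ElementCounter extends NullBuilder {
    int count;

    @Override
    public void start_element(String name) {
        count++;
    }
}
```

An application that only wants element counts writes one method; the events it ignores cost it no typing.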
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From peter at techno.com Tue Mar 4 19:56:12 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:57:28 2004
Subject: API thoughts...
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703041952.OAA11769@exocomp.techno.com>
> Date: Tue, 4 Mar 1997 12:44:09 -0500
> From: dgd@cs.bu.edu (David Durand)
>
> I see two main approaches to creating the data access methods:
>
> 1. to create a bunch of particular objects Element, Attribute, etc. and
> allow looking at them directly. This does make for a rather fat interface,
> and a lot of objects. In some contexts this is good (low object coupling),
> in others, bad (currently applets pay a high price for using many classes,
> and this will take at least a year to improve).
>
> 2. Create a general node object that can represent an element or
> attribute or entity, etc, and use a general protocol to explicitly test and
> act on node types, and to traverse. This is essentially the grove model, as
> I understand it. The disadvantage is that it's not very concrete, and so
> it's harder to understand. You also lose the ability to use type-based
> dispatching if your programming style favors it -- you have to test the
> generic nodes yourself.
>
> Either of these models is good, but we need to examine the tradeoffs
> much more carefully and explicitly.
I think that it is important to define the data access API in such a
way as to facilitate the use of either or both models: If (1) is
implemented as a bunch of classes for element, attribute, etc., but
each of those is derived from a general node class as described in
(2), then the best of both worlds is available, since applications (or
language implementations) that cannot handle the overhead of (1) can
use just the generic node interface, and applications that can take
advantage of strong typing are not restricted to (2). Also, since the
two schemes would be interoperable at the generic node level,
different modules of the same application can use either model without
regard for what model other modules use.
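A short Java sketch of the combined scheme, with specific classes derived from a general node class. The class and method names (DocNode, ElementDocNode, TextDocNode, nodeType, render, HybridDemo) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** General node, model (2): callers test the type and walk children. */
abstract class DocNode {
    abstract String nodeType();

    List<DocNode> children() {
        return Collections.emptyList();
    }

    /** Model (1): behaviour chosen by the concrete class. */
    abstract String render();
}

final class ElementDocNode extends DocNode {
    final String name;
    private final List<DocNode> kids = new ArrayList<>();

    ElementDocNode(String name) {
        this.name = name;
    }

    ElementDocNode add(DocNode child) {
        kids.add(child);
        return this;
    }

    @Override String nodeType() { return "element"; }
    @Override List<DocNode> children() { return kids; }

    @Override
    String render() {
        StringBuilder out = new StringBuilder("<" + name + ">");
        for (DocNode kid : kids) {
            out.append(kid.render());
        }
        return out.append("</").append(name).append(">").toString();
    }
}

final class TextDocNode extends DocNode {
    final String data;

    TextDocNode(String data) {
        this.data = data;
    }

    @Override String nodeType() { return "text"; }
    @Override String render() { return data; }
}

final class HybridDemo {
    /** Written against the generic interface only. */
    static int countElements(DocNode node) {
        int count = "element".equals(node.nodeType()) ? 1 : 0;
        for (DocNode kid : node.children()) {
            count += countElements(kid);
        }
        return count;
    }
}
```

countElements never mentions a concrete class, while render relies on virtual dispatch; both operate on the same tree, which is the interoperability at the generic node level described above.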
-peter
--
Peter Newcomb TechnoTeacher, Inc.
233 Spruce Avenue P.O. Box 23795
Rochester, NY 14611-4041 USA Rochester, New York 14692-3795 USA
+1 716 464 8696 (home) +1 716 464 8696 (direct)
+1 716 755 8698 (cell) +1 716 271 0796 (main)
+1 716 529 4304 (fax) +1 716 271 0129 (fax)
peter@petes-house.rochester.ny.us peter@techno.com
http://www.petes-house.rochester.ny.us http://www.techno.com
From gtn at ebt.com Tue Mar 4 20:06:50 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: API thoughts...
In-Reply-To: (dgd@cs.bu.edu)
Message-ID: <199703042004.PAA04417@nathaniel.ebt>
While I think we all tend toward similar designs I have a
problem with this:
> Parser(XMLBuilder builder, XMLOptions options);
> // here we also set the options
>
> public void parse(String url_start); // start parsing a document
I would rather pass the event *handler* into the parse() call,
and for that matter, I would probably be even happier if I
could also pass in the document to parse.
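As a sketch of that calling convention in Java: the parser object holds no handler or document, and both arrive with each parse() call. SimpleHandler, StatelessParser and LogHandler are invented names, and the tokenizer is a toy that understands only start tags, end tags and character data.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

interface SimpleHandler {
    void startElement(String name);
    void endElement(String name);
    void text(String data);
}

final class StatelessParser {
    /** Toy tokenizer: handles only <name>, </name> and character data. */
    void parse(Reader document, SimpleHandler handler) {
        try {
            StringBuilder data = new StringBuilder();
            int c;
            while ((c = document.read()) != -1) {
                if (c != '<') {
                    data.append((char) c);
                    continue;
                }
                // Flush pending character data before reporting the tag.
                if (data.length() > 0) {
                    handler.text(data.toString());
                    data.setLength(0);
                }
                StringBuilder tag = new StringBuilder();
                while ((c = document.read()) != -1 && c != '>') {
                    tag.append((char) c);
                }
                if (tag.length() > 0 && tag.charAt(0) == '/') {
                    handler.endElement(tag.substring(1));
                } else {
                    handler.startElement(tag.toString());
                }
            }
            if (data.length() > 0) {
                handler.text(data.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Records events as a compact trace. */
final class LogHandler implements SimpleHandler {
    final StringBuilder log = new StringBuilder();

    public void startElement(String name) { log.append("S:").append(name).append(' '); }
    public void endElement(String name) { log.append("E:").append(name).append(' '); }
    public void text(String data) { log.append("T:").append(data).append(' '); }
}
```

One parser instance can then serve any number of documents and handlers, since nothing is fixed at construction time.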
From gtn at ebt.com Tue Mar 4 20:36:57 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:28 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703041745.MAA11686@exocomp.techno.com> (message from Peter Newcomb on Tue, 4 Mar 1997 12:45:47 -0500)
Message-ID: <199703042033.PAA04428@nathaniel.ebt>
> First I want an entry point that allows the application to query
> the parser implementation what property set modules it supports
> (i.e., what's the richest grove plan available to users), and whether
> or not validation is available.
I think that this is all stuff that should occur *above* the parser.
Given an event-driven *parser* API, you can add validation
and grove building services on top, for little or no overhead
beyond what such a system would normally incur.
Let's first focus on a *parser* API.
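A sketch of what "on top" could mean in Java: a checking layer that is itself an event handler and forwards to the next one. The names EventSink, CheckingFilter and CountingSink are invented, and the checks (declared element names, matching end tags) are a deliberately tiny stand-in for real DTD validation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

interface EventSink {
    void startElement(String name);
    void endElement(String name);
}

/** Sits between the parser and the application; the parser is unaware of it. */
final class CheckingFilter implements EventSink {
    private final EventSink next;
    private final Set<String> declared;
    private final Deque<String> open = new ArrayDeque<>();

    CheckingFilter(EventSink next, Set<String> declared) {
        this.next = next;
        this.declared = declared;
    }

    public void startElement(String name) {
        if (!declared.contains(name)) {
            throw new IllegalStateException("element not declared: " + name);
        }
        open.push(name);
        next.startElement(name);
    }

    public void endElement(String name) {
        if (open.isEmpty() || !open.pop().equals(name)) {
            throw new IllegalStateException("mismatched end tag: " + name);
        }
        next.endElement(name);
    }
}

/** Example downstream handler. */
final class CountingSink implements EventSink {
    int starts;

    public void startElement(String name) { starts++; }
    public void endElement(String name) { }
}
```

An application that does not want checking hands its own handler straight to the parser and pays nothing for the layer.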
>Agreed. However I do not think that an API specification should
>dictate whether the grove is built from the event stream or the event
>stream from the grove; I would regard that as an implementation issue
>since some applications may choose to store documents as character
>streams and others as groves (or collections of objects similar to
>groves). The important thing is that both APIs (event stream and
>grove) be provided.
This amounts to reflective APIs: i.e. a grove can build an event stream
and an event stream can build a grove. I have no problem with this as a
general *document interface* API, and it's exactly what I have built in my
various projects over the years.
However, this is fundamentally *different* to the *parser* API. What
is an XML parser? What does it consume? What does it produce?
>I'm not sure exactly what you mean by "mapping documents of various
>MIME types onto XML documents" though I would be interested to know.
Something like what I said before: given a certain level of
abstraction, syntax becomes irrelevant. XML would be just one of a
number of different syntaxes for the same underlying representation
(hey, anyone remember LISP?).
From dgd at cs.bu.edu Wed Mar 5 04:47:51 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
In-Reply-To: <199703041952.OAA11769@exocomp.techno.com>
References: (dgd@cs.bu.edu)
Message-ID:
At 2:52 PM -0500 3/4/97, Peter Newcomb wrote:
>> From: dgd@cs.bu.edu (David Durand)
>>
>> I see two main approaches to creating the data access methods:
>> 1. specific node types for each construct
>> 2. generic node types, with special methods to test type and values.
>> Either of these models is good, but we need to examine the tradeoffs
>> much more carefully and explicitly.
>
>I think that it is important to define the data access API in such a
>way as to facilitate the use of either or both models: If (1) is
>implemented as a bunch of classes for element, attribute, etc., but
>each of those is derived from a general node class as described in
>(2), then the best of both worlds is available, since applications (or
>language implementations) that cannot handle the overhead of (1) can
>use just the generic node interface, and applications that can take
>advantage of strong typing are not restricted to (2). Also, since the
>two schemes would be interoperable at the generic node level,
>different modules of the same application can use either model without
>regard for what model other modules use.
Your description of the tradeoffs confuses me. It seems that a set of
specific classes corresponding to node types is the _easy_ solution for a
programmer. You need a little dynamic typing in the content of elements,
but you can let virtual methods do most of the work for you. With generic
elements you need to test explicitly for everything.
I can see that the "one-type" model may make some kinds of transformation
engine (mainly ones that use the grove model already) and low level
operations like serialization easier, but actually it seems that most
applications _need_ to do something different for an attribute than an
element, most of the time.
The practical reason to have one type currently is just time on the wire
for applets (if there are going to _be_ XML applets). One thing to think
about is what interface would be easier for an applet author to deal with.
I can see applets that do custom rendering based on the markup of a subtree
of the document, and can get their input pre-parsed (even validated by a
DTD, if they want to cut some error code out). What interface makes
implementing such an applet easiest?
Anyhow, I don't yet have a very clear sense that the generic node
approach offers much more than a trimmed class tree. Am I missing something?
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From dgd at cs.bu.edu Wed Mar 5 04:48:00 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:29 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703042033.PAA04428@nathaniel.ebt>
References: <199703041745.MAA11686@exocomp.techno.com> (message from Peter
Newcomb on Tue, 4 Mar 1997 12:45:47 -0500)
Message-ID:
At 3:33 PM -0500 3/4/97, Gavin Nicol wrote:
>> First I want an entry point that allows the application to query
>> the parser implementation what property set modules it supports
>> (i.e., what's the richest grove plan available to users), and whether
>> or not validation is available.
>
>I think that this is all stuff that should occur *above* the parser.
>Given an event-driven *parser* API, you can add validation
>and grove building services on top, for little or no overhead
>beyond what such a system would normally incur.
Yes, exactly. We may decide that some of these are so important that they
should be required from all implementations, and then we may not. But I
think it's easier and more sensible to define the event interface first.
Anyway, once we have an event-handler object, a structure holder can
generate events for it, just as easily as the event handler can create a
structure object.
>>Agreed. However I do not think that an API specification should
>>dictate whether the grove is built from the event stream or the event
>>stream from the grove; I would regard that as an implementation issue
>>since some applications may choose to store documents as character
>>streams and others as groves (or collections of objects similar to
>>groves). The important thing is that both APIs (event stream and
>>grove) be provided.
The event handler is an object with a lot of methods of the form:
handle_some_event(here's the data);
etc.
The grove object is the kind of thing that has methods like:
Type what_kind_of_node_am_i();
Node what's_my_ancestor();
String what's_my_name();
etc.
I think we should provide both interfaces, and for some language binding (I
like Java), we should implement one interface in terms of the other. But
this actually only really makes sense one way....
>This amounts to reflective APIs: i.e. a grove can build an event stream
>and an event stream can build a grove. I have no problem with this as a
>general *document interface* API, and it's exactly what I have built in my
>various projects over the years.
And I hope we all see how trivial it would be to even provide an
implementation of both these operations.
>However, this is fundamentally *different* to the *parser* API. What
>is an XML parser? What does it consume? What does it produce?
Exactly. And here the event model is _in some sense_ the foundation.
Because parsers (even those that build trees automatically) typically
recognize events before they create structures. So if we want to maximize
the mix and match of our interfaces, we have a parser that sends events.
And a document representation (structure) that accepts queries.
We also provide _implementations_ of an event-handling object that builds a
document representation, and the representation has a method to accept an
event handler and pass it the events corresponding to the document's
structure -- and we're there.
>Something like what I said before: given a certain level of
>abstraction, syntax becomes irrelevant. XML would be just one of a
>number of different syntaxes for the same underlying representation
>(hey, anyone remember LISP?).
Just in case that's not clear, you can parse any old format and then decide
to send XML events to represent whatever it was that the parse tree of the
foreign format had in it... presto-change-o a GIF->XML translator is born!
(now the question is how to kill it off again...)
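A toy Java version of such a translator, using comma-separated text as the foreign format because it is easier to show than GIF. All names here (XmlEvents, CsvToEvents, MarkupWriter) and the element names table, row and cell are invented for the sketch.

```java
interface XmlEvents {
    void startElement(String name);
    void endElement(String name);
    void text(String data);
}

final class CsvToEvents {
    /** Reads comma-separated text and reports it as if an XML parse were under way. */
    static void translate(String csv, XmlEvents handler) {
        handler.startElement("table");
        for (String line : csv.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            handler.startElement("row");
            // The -1 limit keeps trailing empty cells.
            for (String cell : line.split(",", -1)) {
                handler.startElement("cell");
                handler.text(cell);
                handler.endElement("cell");
            }
            handler.endElement("row");
        }
        handler.endElement("table");
    }
}

/** Any ordinary handler works downstream; this one serializes the events. */
final class MarkupWriter implements XmlEvents {
    final StringBuilder out = new StringBuilder();

    public void startElement(String name) { out.append('<').append(name).append('>'); }
    public void endElement(String name) { out.append("</").append(name).append('>'); }

    public void text(String data) {
        // Minimal escaping so the output stays well-formed.
        out.append(data.replace("&", "&amp;").replace("<", "&lt;"));
    }
}
```

The handler cannot tell that no XML was ever parsed, so tree builders and other event consumers work on the foreign format unchanged.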
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From nmikula at edu.uni-klu.ac.at Wed Mar 5 08:06:06 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
References:
Message-ID: <331DA7CE.ECF@edu.uni-klu.ac.at>
I just want to describe briefly how the current API of NXP is
designed.
There is a Java interface declaration that I have called ESIS.
ESIS defines a set of callback routines which, I believe, is
pretty close to what ESIS is supposed to deliver to an application.
I have designed this interface for my SGML parser Cappuccino.
This was done quite a while ago and I never really got feedback
on it. So it might not be complete.
Applications can make use of ESIS by implementing that interface.
One example, which is in the distribution, is to send the output
to stdout.
Yet another example, and I was using it that way with my SGML parser
Cappuccino, is to built a grove. This grove can then be traversed
and worked with. Also for the grove API there would need to be
some discussion. I think Alex Milowski has already done some work
on that.
The treebuiler/tree-traversal idea also sounds good to me.
Grove vs. Esis : I, as others, do believe that the grove builder can be
seen as a layer above the Esis interface. However, I guess we will need
to increase the number of callbacks and hence probably find another name
for the interface.
For many applications creating a grove would probably be an overkill
solution.
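
A minimal sketch of the layering described above, assuming a much smaller
event set than the real interface would have. The names EsisHandler,
EsisPrinter, GroveNode and GroveBuilder are hypothetical and are not taken
from NXP or Cappuccino; the replay() method merely stands in for a parser.
One handler writes events out as text, the other builds a tree from the
very same events.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical ESIS-style event interface; method names are illustrative only.
interface EsisHandler {
    void startElement(String name);
    void data(String text);
    void endElement(String name);
}

// First use: record every event as a line of text (the "send to stdout" case).
class EsisPrinter implements EsisHandler {
    final StringBuilder out = new StringBuilder();
    public void startElement(String name) { out.append('(').append(name).append('\n'); }
    public void data(String text) { out.append('-').append(text).append('\n'); }
    public void endElement(String name) { out.append(')').append(name).append('\n'); }
}

// Minimal tree node standing in for a grove node.
class GroveNode {
    final String name;   // element name, or null for a text node
    final String text;   // character data, or null for an element
    final List<GroveNode> children = new ArrayList<>();
    GroveNode(String name, String text) { this.name = name; this.text = text; }
}

// Second use: a tree builder layered on top of the same event interface.
class GroveBuilder implements EsisHandler {
    private final Deque<GroveNode> open = new ArrayDeque<>();
    GroveNode root;
    public void startElement(String name) {
        GroveNode node = new GroveNode(name, null);
        if (open.isEmpty()) root = node; else open.peek().children.add(node);
        open.push(node);
    }
    public void data(String text) {
        // Assumes data only occurs inside an open element.
        open.peek().children.add(new GroveNode(null, text));
    }
    public void endElement(String name) { open.pop(); }
}

class EsisDemo {
    // Stand-in for a parser: replays the events for <doc><p>hi</p></doc>.
    static void replay(EsisHandler h) {
        h.startElement("doc");
        h.startElement("p");
        h.data("hi");
        h.endElement("p");
        h.endElement("doc");
    }
    public static void main(String[] args) {
        EsisPrinter printer = new EsisPrinter();
        replay(printer);
        System.out.print(printer.out);
        GroveBuilder builder = new GroveBuilder();
        replay(builder);
        System.out.println("root element: " + builder.root.name);
    }
}
```

An application that only needs the events never pays for the tree, which
is the point about the grove being overkill for many uses.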
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From Bill.Smith at Eng.Sun.COM Wed Mar 5 18:02:35 1997
From: Bill.Smith at Eng.Sun.COM (Bill Smith)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
Message-ID:
A small point, but one I think important.
The term "callback" doesn't make much sense in Java since (if I remember
correctly) you can't pass function pointers in Java - there are no pointers.
If we are language-independent but object-oriented, callback is still the
wrong term.
Abstract class or interface would be more accurate.
From gtn at ebt.com Wed Mar 5 19:55:32 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
In-Reply-To: (message from Bill Smith on Wed, 5 Mar 1997 10:00:44 -0800 (PST))
Message-ID: <199703051952.OAA06206@nathaniel.ebt>
>The term "callback" doesn't make much sense in Java since (if I remember
>correctly) you can't pass function pointers in Java - there are no pointers.
>If we are language-independent but object-oriented, callback is still the
>wrong term.
>
>Abstract class or interface would be more accurate.
While I agree with everything, I should note that you can call
a method by name in Java (i.e. you can fake callbacks).
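
A minimal sketch of what calling a method by name can look like, assuming
the java.lang.reflect package is what is meant here. The Listener class
and the fire() helper are hypothetical names used only for illustration.

```java
import java.lang.reflect.Method;

// Hypothetical event receiver with one public method.
class Listener {
    final StringBuilder log = new StringBuilder();
    public void onComment(String text) { log.append("comment:").append(text); }
}

class NamedCallback {
    // Looks up a public method taking one String by its name and invokes it.
    static void fire(Object target, String methodName, String arg) {
        try {
            Method m = target.getClass().getMethod(methodName, String.class);
            m.invoke(target, arg);
        } catch (ReflectiveOperationException e) {
            // No such method, or the call itself failed.
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Listener listener = new Listener();
        fire(listener, "onComment", "hello");
        System.out.println(listener.log);
    }
}
```

The lookup happens at run time, so a misspelled method name is only
detected when the call is attempted; an interface catches that at compile
time.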
From tbray at textuality.com Wed Mar 5 20:27:22 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
Message-ID: <3.0.32.19970305122554.007d55c0@pop.intergate.bc.ca>
At 10:00 AM 3/5/97 -0800, Bill Smith wrote:
>The term "callback" doesn't make much sense in Java since (if I remember
>correctly) you can't pass function pointers in Java
Well, the Lark event-stream API sure looks & feels like a bunch of
callbacks. You make a Lark object, call its readXML() method with one
argument being a Handler object; Handler being a data-less class that
just has a bunch of methods called things like doPI() and doStartTag() and
doEntityReference() and doText() and so on; you'd normally subclass Handler
replacing the methods for the events you wanted to see, and pass in
that kind of object. Lark calls these upon recognizing
the constructs in the input stream, passing the byte offset info, the
element & entity stack (*if* you're treebuilding), and other currently
relevant info. These methods are all booleans; if any returns true,
Lark stops and returns control to whoever called readXML().
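
A rough sketch of the Handler pattern just described, not Lark's actual
code: the method signatures are simplified assumptions (the real methods
are said to receive offsets, stacks and other info), and ToyParser is a
hypothetical stand-in that walks a pre-split token list instead of
parsing characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Data-less base class: every event method defaults to "keep going" (false).
class Handler {
    public boolean doStartTag(String name) { return false; }
    public boolean doText(String text) { return false; }
    public boolean doEndTag(String name) { return false; }
}

// Toy stand-in for the parser; tokens look like "<a>", "</a>" or plain text.
class ToyParser {
    // Returns how many tokens were consumed before the handler asked to stop.
    int readXML(List<String> tokens, Handler h) {
        int consumed = 0;
        for (String t : tokens) {
            consumed++;
            boolean stop;
            if (t.startsWith("</")) stop = h.doEndTag(t.substring(2, t.length() - 1));
            else if (t.startsWith("<")) stop = h.doStartTag(t.substring(1, t.length() - 1));
            else stop = h.doText(t);
            if (stop) break;   // a true return hands control back to the caller
        }
        return consumed;
    }
}

// Subclass replacing only the methods for the events of interest.
class StopAtTitle extends Handler {
    final List<String> seen = new ArrayList<>();
    @Override public boolean doStartTag(String name) { seen.add(name); return false; }
    @Override public boolean doEndTag(String name) { return name.equals("title"); }
}

class HandlerDemo {
    public static void main(String[] args) {
        List<String> doc = Arrays.asList(
            "<doc>", "<title>", "Hi", "</title>", "<p>", "x", "</p>", "</doc>");
        StopAtTitle h = new StopAtTitle();
        int n = new ToyParser().readXML(doc, h);
        System.out.println(n + " tokens read, start tags seen: " + h.seen);
    }
}
```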
Surely the GUI experience has taught us by now that a callback interface
is the way to go... anyone remember [shudder] XNextEvent()?
I am somewhat amused by all the Java propagandists saying "Java is so
much safer because we don't have pointers"... of course most variables
are in fact object pointers, and every object is in fact an Object, and
every array is in fact an Object, and you sure can wreak some good
old-fashioned C-style destruction on yourself when you accidentally
treat a pointer to a "byte[] foo" (oops, an object not a pointer)
as an oops-an-object-not-a-pointer-to-a "char foo[]". Still,
java is appealingly clean.
Note for XML developers... I just finished putting correct attribute
defaulting (internal subset decls only, sorry) into Lark (new version
soon) - it nearly doubled the number of parsing states and class file
sizes... sigh.
On another subject, I really have trouble with trying to pretend
that Element and Attribute and Entity and so on are just flavors
of some abstract Node thingie - the idea of having separate
classes/objects for these things just feels natural at a really deep
level. One of the *nice* things about SGML and XML is that even
if the markup is complicated, the number of underlying objects is
pretty limited and maps neatly into a class framework. - Tim
From Bill.Smith at Eng.Sun.COM Wed Mar 5 22:16:17 1997
From: Bill.Smith at Eng.Sun.COM (Bill Smith)
Date: Mon Jun 7 16:57:29 2004
Subject: API thoughts...
Message-ID:
> Well, the Lark event-stream API sure looks & feels like a bunch of
> callbacks. You make a Lark object, call its readXML() method with one
> argument being a Handler object; Handler being a data-less class that
> just has a bunch of methods called things like doPI() and doStartTag() and
> doEntityReference() and doText() and so on; you'd normally subclass Handler
> replacing the methods for the events you wanted to see, and pass in
> that kind of object. Lark calls these upon recognizing
> the constructs in the input stream, passing the byte offset info, the
> element & entity stack (*if* you're treebuilding), and other currently
> relevant info. These methods are all booleans; if any returns true,
> Lark stops and returns control to whoever called readXML().
Another way to do this is to have the Lark object (or interface) define the
event methods rather than have a separate Handler object. When it's time to
parse something, create a subclass that overrides the (standard) event methods
for the Lark object.
A possible advantage to this method is that it makes clear the inheritance
relationship between the "standard" parser and something more specific. It
is also "easier" to create a more specific parser from an existing parser
object - simply subclass the existing parser and override the methods
required to provide the desired new functionality.
The Lark model "hides" the inheritance relationship in the Handler object
making it necessary to look inside a Lark object to determine the type of
a given parser (something you might need to do when debugging). An
alternative is to create a new parser object that contains a subclassed event
handler. This makes it possible to distinguish the type of parser at the
"outer" level but requires two new objects instead of one to perform the
subclass.
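
A small sketch of the subclassing alternative, under the assumption that
the parser itself declares overridable event methods. BaseParser,
TextCollectingParser and their method names are hypothetical, and the
token loop only stands in for real parsing.

```java
// Parser whose event methods are meant to be overridden by subclasses.
class BaseParser {
    protected void startTag(String name) { }   // default: ignore the event
    protected void text(String chars) { }      // default: ignore the event

    // Toy scanner over pre-split tokens; a real parser would read characters.
    public final void parse(String[] tokens) {
        for (String t : tokens) {
            if (t.startsWith("</")) continue;   // end tags are ignored in this sketch
            if (t.startsWith("<")) startTag(t.substring(1, t.length() - 1));
            else text(t);
        }
    }
}

// A more specific parser: one new class, one overridden method.
class TextCollectingParser extends BaseParser {
    final StringBuilder collected = new StringBuilder();
    @Override protected void text(String chars) { collected.append(chars); }
}

class SubclassDemo {
    public static void main(String[] args) {
        BaseParser p = new TextCollectingParser();
        p.parse(new String[] {"<p>", "Hello, ", "<b>", "world", "</b>", "</p>"});
        // The kind of parser is visible on the outer object itself.
        System.out.println(p.getClass().getSimpleName());
        System.out.println(((TextCollectingParser) p).collected);
    }
}
```

The trade-off against a separate Handler is that the event logic is tied
to one parser class hierarchy and cannot be handed to a different parser.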
I'm not a parser expert so the subclass model may not make any sense but this
is a mechanism I have successfully used building other object-oriented
(including GUI-based) systems. I have also used callbacks but find them most
useful when forced to use C or other non-object-based languages.
From bosak at atlantic-83.Eng.Sun.COM Thu Mar 6 00:19:01 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:29 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <199703060017.QAA00505@boethius.eng.sun.com>
Within the next week or so, I expect to announce the availability of a
Web server that can respond to certain kinds of URLs with an XML data
stream (eventually, a variety of XML data streams). For our own
design purposes, and also for the purposes of experimenters working
with combined XML/DSSSL applications, I would like to see this group
come up with an unofficial method or methods by which to associate a
DSSSL style sheet with a particular chunk of XML. Such methods would
be far in advance of the sgml-wg specification effort and subject to
later revision, but given the influence of experimental
implementations, I think that it's appropriate to put a little bit of
thought into the design.
One possible method suggested by James Clark (thank you, James) is to
adopt the convention used by Jade in the absence of the -d option:
replace the extension of the document entity's URL or file name with
.dsl and fetch that. Thus, if a browser fetches
http://docs.sun.com/foo/bar.html
then it should also look for
http://docs.sun.com/foo/bar.dsl
and apply it to bar.html if found.
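
The extension-replacement rule can be sketched as a small string
operation. This is an illustration only, not Jade's implementation: the
class and method names are hypothetical, and the sketch assumes the URL
has a path component and no query string or fragment.

```java
class StylesheetLocator {
    // Replaces the extension of the last path segment with the given one,
    // or appends the extension when the last segment has none.
    static String withExtension(String url, String ext) {
        int slash = url.lastIndexOf('/');
        int dot = url.lastIndexOf('.');
        // A dot before the last slash belongs to the host or a directory name.
        String base = (dot > slash) ? url.substring(0, dot) : url;
        return base + "." + ext;
    }

    public static void main(String[] args) {
        System.out.println(withExtension("http://docs.sun.com/foo/bar.html", "dsl"));
        System.out.println(withExtension("http://docs.sun.com/foo/bar", "dsl"));
    }
}
```

A browser following this convention would then try to fetch the derived
URL and silently carry on without a stylesheet if the fetch fails.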
This is appealingly straightforward, but I wonder how well it
accommodates multiple stylesheets and stylesheets that use other
notations (CSS, for example). Of course, we could deal with the
second concern by saying that DSSSL is the default stylesheet language
for XML experimentation and that we will figure out some way to
accommodate other stylesheet languages later.
James lists some other possibilities:
| - a processing instruction somewhere in the prolog
|
| - a catalog entry that says unconditionally to use some DSSSL style
| sheet
|
| - a catalog entry that associates a DSSSL style sheet with the public
| identifier of a DTD
|
| - make the document serve also as a style sheet by making it conform
| to the DSSSL architecture (this will work with Jade too)
Any thoughts on this? I am, of course, particularly interested in
hearing from those of you who are actively building DSSSL
applications.
Jon
From gtn at ebt.com Thu Mar 6 00:40:23 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:29 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <199703060017.QAA00505@boethius.eng.sun.com> (bosak@atlantic-83.Eng.Sun.COM)
Message-ID: <199703060037.TAA06393@nathaniel.ebt>
| - a catalog entry that says unconditionally to use some DSSSL style
| sheet
|
| - a catalog entry that associates a DSSSL style sheet with the public
| identifier of a DTD
When MIME-SGML was still doing something useful, a proposal to
add a SEMANTIC catalog entry was floated. This should be the
preferred method, I think.
(I've not looked at TR 9401 for some time, so it may have been
supplanted).
My next favourite would be to have some explicit link in the
document itself (either PI, or element).
From cbullard at hiwaay.net Thu Mar 6 01:04:10 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:29 2004
Subject: Associating DSSSL style sheets with documents
References: <199703060037.TAA06393@nathaniel.ebt>
Message-ID: <331E17FA.7EA3@hiwaay.net>
Gavin Nicol wrote:
A catalog entry that associates per type is good, but it
does tie you to the DTD.
> My next favourite would be to have some explicit link in the
> document itself (either PI, or element).
We use that technique. It works but the user better keep up
with the piece parts per instance. What works better but is
even more hassle is to enable styles to be called from different
parts of the document when needed. This should be something
implementors can experiment with. A monolithic stylesheet
that handles multiple DTDs is also an option and works when
one is careful with the namespace. Having styles that are
local to parts of the document is useful, as you know, when
one does not want to write a complex stylesheet for documents
that have lots of context conditions.
len
From gtn at ebt.com Thu Mar 6 01:12:36 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <331E17FA.7EA3@hiwaay.net> (message from len bullard on Wed, 05 Mar 1997 19:03:54 -0600)
Message-ID: <199703060110.UAA06491@nathaniel.ebt>
>A catalog entry that associates per type is good, but it
>does tie you to the DTD.
What do you mean "per type"? In DynaText, we actually use something
like the proposal:
SEMANTICS "popup" "ebt-fulltext-stylesheet" "Pop-Up Graphics" "grphpop.v"
SEMANTICS "serif" "ebt-fulltext-stylesheet" "Serif Font" "serif.v"
>Having styles that are local to parts of the document are useful, as
>you know, when one does not want to write a complex stylesheet for
>documents that have lots of context conditions.
Yes. Multiple stylesheets could be easier than styles qualified by
context in some cases. It really amounts to the same thing, though
the binding mechanism is different....
From edz at bsn.com Thu Mar 6 03:09:30 1997
From: edz at bsn.com (Edward C. Zimmermann)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <199703060017.QAA00505@boethius.eng.sun.com> from "Jon Bosak" at Mar 5, 97 04:17:45 pm
Message-ID: <199703060310.EAA06638@hampton.bsn.com>
>
> Within the next week or so, I expect to announce the availability of a
> Web server that can respond to certain kinds of URLs with an XML data
> stream (eventually, a variety of XML data streams). For our own
> design purposes, and also for the purposes of experimenters working
> with combined XML/DSSSL applications, I would like to see this group
> come up with an unofficial method or methods by which to associate a
> DSSSL style sheet with a particular chunk of XML. Such methods would
> be far in advance of the sgml-wg specification effort and subject to
> later revision, but given the influence of experimental
> implementations, I think that it's appropriate to put a little bit of
> thought into the design.
>
> One possible method suggested by James Clark (thank you, James) is to
> adopt the convention used by Jade in the absence of the -d option:
> replace the extension of the document entity's URL or file name with
> .dsl and fetch that. Thus, if a browser fetches
>
> http://docs.sun.com/foo/bar.html
>
> then it should also look for
>
> http://docs.sun.com/foo/bar.dsl
Since no public DTDs must have the DTD, viz a URL to DTD.. and
from the name/path/URL to DTD then one can use the extension
.dsl or whatever for the DSSSL.
The problem with using .dsl as the map from the URL .extension is
that if one has a class of documents built around a DTD and that
has a DSSSL "style sheet" then one will either need to have a
front-end server to manage this whole bit (why) or fill the place
with symbolic links. The problem with both is that proxy caches
will get filled with redundant bits... Why not use the DTD URL as
base? Either this or I don't understand what your aims are, or
I'm totally lost:-)
The other alternative, of course, would be to extend HTTP to
return a request for the association of a URL to the .dsl from
a file.. That is probably the better and more flexible way but
it won't work with popular off-the-shelf browser and thus is
ill-suited to experiments......
>
> and apply it to bar.html if found.
>
> This is appealingly straightforward, but I wonder how well it
> accommodates multiple stylesheets and stylesheets that use other
> notations (CSS, for example). Of course, we could deal with the
> second concern by saying that DSSSL is the default stylesheet language
> for XML experimentation and that we will figure out some way to
> accommodate other stylesheet languages later.
>
> James lists some other possibilities:
>
> | - a processing instruction somewhere in the prolog
> |
> | - a catalog entry that says unconditionally to use some DSSSL style
> | sheet
> |
> | - a catalog entry that associates a DSSSL style sheet with the public
> | identifier of a DTD
> |
> | - make the document serve also as a style sheet by making it conform
> | to the DSSSL architecture (this will work with Jade too)
>
> Any thoughts on this? I am, of course, particularly interested in
> hearing from those of you who are actively building DSSSL
> applications.
>
> Jon
--
______________________
Edward C. Zimmermann
Basis Systeme netzwerk/Munich
From cbullard at hiwaay.net Thu Mar 6 03:32:02 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
References: <199703060110.UAA06491@nathaniel.ebt>
Message-ID: <331E3AAC.1A9F@hiwaay.net>
Gavin Nicol wrote:
>
> >A catalog entry that associates per type is good, but it
> >does tie you to the DTD.
>
> What do you mean "per type"? In DynaText, we actually use something
> like the proposal:
>
> SEMANTICS "popup" "ebt-fulltext-stylesheet" "Pop-Up Graphics" "grphpop.v"
> SEMANTICS "serif" "ebt-fulltext-stylesheet" "Serif Font" "serif.v"
I thought you meant, per DTD.
> >Having styles that are local to parts of the document are useful, as
> >you know, when one does not want to write a complex stylesheet for
> >documents that have lots of context conditions.
>
> Yes. Multiple stylesheet could be easier than styles qualified by
> context in some cases. It really amounts to the same thing though
> the binding mechanism is different....
Yes and no. The problem with the FOSI was that even though it
worked, it was hard to specify style on elements in context
when the contexts were complex. We combine context and
local stylesheets. So, a parentage can be used, but a local
stylesheet can introduce a new one, so the complexity is
localized as well. Conservation of complexity: we have
more stylesheets to manage per instance.
len
From jjc at jclark.com Thu Mar 6 03:34:36 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306032432.00a75998@jclark.com>
At 19:37 05/03/97 -0500, Gavin Nicol wrote:
>| - a catalog entry that says unconditionally to use some DSSSL style
>| sheet
>|
>| - a catalog entry that associates a DSSSL style sheet with the public
>| identifier of a DTD
>
>When MIME-SGML was still doing something useful, a proposal to
>add a SEMANTIC catalog entry was floated. This should be the
>preferred method, I think.
As far as I remember the SEMANTIC catalog entry proposal had several arguments:
a) the type of processing spec (DSSSL or EBT style sheets)
b) the system identifier for the processing spec
c) some sort of description you could display in a menu
I think there was something else, but I don't remember.
Requiring (c) is not appropriate for DSSSL, since DSSSL specification
documents can contain multiple separate stylesheets each with their own
description, which is specified inside the DSSSL specification document (the
DESC attribute on the style-specification form). This seems like a general
problem: different kinds of processing specification may require different
sets of arguments to invoke them.
Since vendors and users can add their own SGML Open entry types, I see no
advantage to having a general SEMANTIC entry with a type attribute, rather
than a separate entry type for each type of processing spec.
So if we are going to use a catalog entry (and I'm not yet convinced this is
the best solution) I would suggest having a simple DSSSL entry which looks like:
DSSSL spec.dsl
One complication is that a DSSSL spec is itself an SGML document, which may
require a different catalog for parsing. This probably doesn't matter in
the context of XML, but it does in SGML: the DSSSL spec may well need a
different implied SGML declaration. I'm not sure what the best way to
handle this is.
Of course this doesn't completely solve the problem: we now have to figure
out how to associate a catalog with an SGML document. The latest SGML Open
draft requires (amongst other things) trying the URL/filename of the
document with any extension replaced by .soc.
James
From jjc at jclark.com Thu Mar 6 03:34:46 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306032436.00aa635c@jclark.com>
At 16:17 05/03/97 -0800, Jon Bosak wrote:
>One possible method suggested by James Clark (thank you, James) is to
>adopt the convention used by Jade in the absence of the -d option:
>replace the extension of the document entity's URL or file name with
>.dsl and fetch that. Thus, if a browser fetches
>
> http://docs.sun.com/foo/bar.html
>
>then it should also look for
>
> http://docs.sun.com/foo/bar.dsl
>
>and apply it to bar.html if found.
>
>This is appealingly straightforward, but I wonder how well it
>accommodates multiple stylesheets
A DSSSL specification document can contain any number of distinct style
specifications: it can also contain links to other DSSSL specification
documents.
>and stylesheets that use other
>notations (CSS, for example).
Use another extension.
>James lists some other possibilities:
>
>| - a processing instruction somewhere in the prolog
>|
>| - a catalog entry that says unconditionally to use some DSSSL style
>| sheet
>|
>| - a catalog entry that associates a DSSSL style sheet with the public
>| identifier of a DTD
>|
>| - make the document serve also as a style sheet by making it conform
>| to the DSSSL architecture (this will work with Jade too)
Another possibility I forgot to mention is to have a parameter on the
Content-Type header field:
Content-Type: text/xml; stylesheet=foo.dsl
This is only going to work in the context of HTTP. The type of the
stylesheet could be indicated by its Content-Type, and the client could use
content-type negotiation to ensure it gets the kind of stylesheet it can
handle. Somebody should probably register a MIME content-type for DSSSL
style sheets as well.
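
A minimal sketch of how a client might pull such a parameter out of the
header value. The stylesheet parameter is only the proposal made above,
not a registered one, and the class and method names are hypothetical.
The sketch does not handle semicolons inside quoted parameter values.

```java
class ContentTypeParam {
    // Returns the value of the named parameter, or null when it is absent.
    static String parameter(String headerValue, String name) {
        String[] parts = headerValue.split(";");
        for (int i = 1; i < parts.length; i++) {   // parts[0] is the media type
            String p = parts[i].trim();
            int eq = p.indexOf('=');
            if (eq < 0) continue;                  // not a name=value pair
            if (p.substring(0, eq).trim().equalsIgnoreCase(name)) {
                String v = p.substring(eq + 1).trim();
                // Strip one pair of surrounding double quotes, if present.
                if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                    v = v.substring(1, v.length() - 1);
                }
                return v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parameter("text/xml; stylesheet=foo.dsl", "stylesheet"));
    }
}
```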
James
From housel at ms7.hinet.net Thu Mar 6 04:34:54 1997
From: housel at ms7.hinet.net (Peter S. Housel)
Date: Mon Jun 7 16:57:30 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703060427.MAA29885@ms7.hinet.net>
Gavin Nicol (gtn@ebt.com) wrote:
>I would tend toward an event-driven interface, and an
>option-setting interface as the core parser API. For example:
>
>class XMLEventHandler {
>public boolean OnComment(String comment);
>public boolean OnElementStart(...)
>....
>}
>
>class XMLParser {
>...
>parser(XMLEventHandler handler);
>...
>}
That's one way of doing things. The main problem I see with this interface
is that there are quite a few possible methods (I count 71 classdefs in
the SGML property set, though of course not all of those are applicable to
XML), and it becomes difficult to expand the set of events.
There's also the issue of "who's in charge?" This is actually a tough
issue. I like the way P.J. Plauger put it in Programming on Purpose:
when you're designing the program's architecture, first you draw a graph
of nodes, with arrows showing the flow of information from subsystem to
subsystem. Then you grab a node and shake the graph. What you get is
your call graph, with the main processing loop located in the node you
shook, making requests to the other subsystems.
As much as possible, a good reusable component should not force the
user's hand when choosing what node to grab onto. As an example,
YACC is pretty bad about this. You supply it with a lexer (with a
fixed name) and a set of handlers to be called when productions are
reduced. The YACC-generated parser insists on being in charge.
If all of today's popular languages had coroutines, we wouldn't have
this problem. Every component could be written as if it were in
charge. Unfortunately, most languages don't have a portable coroutine
facility.
For an XML document parsing system, the components we need to
consider are:
1. An external entity manager, responsible for obtaining document
instances (the "start" document and others), DTD's, etc. from
local storage, the web, some database, etc. This should probably
be user-customizable.
2. An encoding manager, responsible for mapping one of the possible
XML document encodings (Latin-n, UTF-7, UTF-8, UCS-2, UTF-16, whatever)
onto ISO10646 characters.
3. The parser itself, responsible for turning characters into XML events,
and possibly into grove structures.
4. The user's application.
As far as I can see, we have the following scenarios:
* [Browser] If you're building a web browser, you want the network
interrupt to be in charge. That is, when a packet's worth of
document/DTD/whatever data comes in from the net, the parser should use
that to parse as much of the document as it can, and pass as many events
on to the application as possible. This gives optimal user response,
provided you don't need the whole document to start displaying it. The
external entity manager would have a callback for requesting additional
external entities, that would add the request to an internal queue and
return immediately to the parser.
In this architecture, the user would create a parser object by specifying
an external entity manager callback, a set of parser options (grove plan,
validate or not, etc.), and an XMLEventHandler like the one shown above.
Then your external entity manager would send a message to the parser
object giving it a buffer full of bytes and an indication of which entity
they belong to.
* [YACC] You may want the parser to be in charge, like YACC. In this case
you would call the parser, specifying the external entity manager object
(written using the Strategy pattern), list of options, and an
XMLEventHandler object (which corresponds to the Builder pattern).
* [XMLEventStream] You want some part or another of your application to be
in charge, and you want a stream of XMLEvent objects. In this case, you
create a parser object (XMLEventStream), specifying an external entity
manager object, a start document, and a list of options. You send a
message to this object whenever you want another event from the stream.
* [Grove] You want to access nodes in a grove. So, you pass in your
start document and your options, and you get a root node back. The parser
might construct the whole grove, or do it lazily when you ask for a
property that hasn't been computed yet.
These scenarios assume that the document(s) are stored in ordinary files or
on the web. As Peter Newcombe pointed out, another scenario is when the
document is stored in a database, possibly in grove form. In this case
being able to specify an entity manager probably isn't desirable, and the
[Browser] scenario probably doesn't fit at all.
So, which of these scenarios do we want to specify for an XML API? Should
all of them be? Should [Browser] be one of the ones included?
[Browser] gives the most complicated parser, since it has to asynchronously
handle information from several different documents.
[YACC] is the easiest to write, but it's less flexible. Given [Browser],
it's easy to write [YACC]. (Given [XMLEventStream] you can also derive
[YACC], but with greater overhead.)
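
The derivation of [YACC] from [XMLEventStream] can be sketched as a loop
that owns control, pulls events and pushes them at a handler. All names
below are hypothetical, the handler is reduced to a single method for
brevity, and CannedStream stands in for a real parser by replaying a
prepared list of events.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// One parsing event; only three kinds in this sketch.
final class XMLEvent {
    enum Kind { START, TEXT, END }
    final Kind kind;
    final String value;
    XMLEvent(Kind kind, String value) { this.kind = kind; this.value = value; }
}

// Pull side: the application asks for the next event when it wants one.
interface XMLEventStream {
    XMLEvent next();   // returns null at the end of the document
}

// Push side: the parser (or a driver loop) calls the application.
interface XMLEventHandler {
    boolean onEvent(XMLEvent e);   // returning true asks the driver to stop
}

// Stand-in for a pull parser: replays a fixed list of events.
class CannedStream implements XMLEventStream {
    private final Iterator<XMLEvent> it;
    CannedStream(List<XMLEvent> events) { this.it = events.iterator(); }
    public XMLEvent next() { return it.hasNext() ? it.next() : null; }
}

class PushDriver {
    // Pulls events from the stream and pushes them to the handler.
    // Returns the number of events delivered.
    static int run(XMLEventStream in, XMLEventHandler h) {
        int delivered = 0;
        for (XMLEvent e = in.next(); e != null; e = in.next()) {
            delivered++;
            if (h.onEvent(e)) break;
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<XMLEvent> events = Arrays.asList(
            new XMLEvent(XMLEvent.Kind.START, "doc"),
            new XMLEvent(XMLEvent.Kind.TEXT, "hi"),
            new XMLEvent(XMLEvent.Kind.END, "doc"));
        int n = run(new CannedStream(events), e -> {
            System.out.println(e.kind + " " + e.value);
            return false;
        });
        System.out.println(n + " events delivered");
    }
}
```

Going the other way, from a push parser to a pull stream, is the case
that needs coroutines or a second thread, since the parser's call stack
has to be suspended between events.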
[XMLEventStream] and [Grove] give you the most flexibility with respect to
the grove plan.
Hope this helps to clarify the issues a little. I've been thinking about
this for a while, in the context of reusable parser components for
programming languages, but the only firm conclusion I've come to is that
I really wish I could use coroutines.
-Peter-
From bosak at atlantic-83.Eng.Sun.COM Thu Mar 6 05:02:26 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <199703060310.EAA06638@hampton.bsn.com> (edz@bsn.com)
Message-ID: <199703060501.VAA00624@boethius.eng.sun.com>
[Edward C. Zimmermann:]
| The problem with using .dsl as the map from the URL .extension is that
| if one has a class of documents built around a DTD and that has a
| DSSSL "style sheet" then one will either need to have a front-end
| server to manage this whole bit (why) or fill the place with symbolic
| links.. The problem with both are that proxy caches will get filled
| with redundant bits... Why not use the DTD URL as base?
Although we're in the habit of thinking this way, there is in fact no
one-to-one correspondence between stylesheets and DTDs. It is
possible to write a catch-all stylesheet that will work with documents
written to a number of DTDs, and conversely it's not only possible but
often desirable to create a number of stylesheets that are all
designed to work with documents written to a single DTD.
And then there's the fact that DTDs are optional in XML...
Jon
From bosak at atlantic-83.Eng.Sun.COM Thu Mar 6 05:21:05 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970306032436.00aa635c@jclark.com> (message from James Clark on Thu, 06 Mar 1997 10:24:36 +0700)
Message-ID: <199703060519.VAA00633@boethius.eng.sun.com>
[James Clark:]
| >This is appealingly straightforward, but I wonder how well it
| >accommodates multiple stylesheets
|
| A DSSSL specification document can contain any number of distinct style
| specifications: it can also contain links to other DSSSL specification
| documents.
That's what I thought, but I haven't tried it yet.
| >and stylesheets that use other
| >notations (CSS, for example).
|
| Use another extension.
Yes, but then how do you determine precedence if both foo.dsl and
foo.css are found? That's why I said that a way out (admittedly not a
very good one) would be to default to DSSSL for the time being.
| Another possibility I forgot to mention is to have a parameter on the
| Content-Type header field:
|
| Content-Type: text/xml; stylesheet=foo.dsl
|
| This is only going to work in the context of HTTP.
Well, HTTP delivery is what I'm trying to get set up.
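For what it's worth, reading such a parameter on the client side takes only a few lines. The Java sketch below is purely illustrative: `stylesheet` is the parameter proposed in this thread, not a registered MIME parameter, and the class and method names are invented for the example. It splits naively on semicolons, so it assumes no semicolon appears inside a quoted value.

```java
/** Hypothetical helper; the "stylesheet" parameter is this thread's proposal only. */
final class ContentTypeParams {
    private ContentTypeParams() {}

    /** Returns the named parameter of a Content-Type value, or null if absent. */
    static String parameter(String headerValue, String name) {
        String[] parts = headerValue.split(";");
        // parts[0] is the media type itself (e.g. "text/xml"); parameters follow.
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair, skip it
            }
            String key = part.substring(0, eq).trim();
            String value = part.substring(eq + 1).trim();
            if (key.equalsIgnoreCase(name)) {
                // Strip one pair of surrounding double quotes, if present.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }
}
```

Calling `ContentTypeParams.parameter("text/xml; stylesheet=foo.dsl", "stylesheet")` yields `foo.dsl`; a header without the parameter yields `null`, at which point a client would fall back to some other association mechanism.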
| The type of the stylesheet could be indicated by its Content-Type, and
| the client could use content-type negotiation to ensure it gets the
| kind of stylesheet it can handle. Somebody should probably register a
| MIME content-type for DSSSL style sheets as well.
Both the xml and dsssl content-type registrations are hanging right
now for lack of time to deal with the IANA paperwork. I'd be glad to
hand these off to any individual or ad hoc working group with
experience in MIME type registration.
Jon
From jjc at jclark.com Thu Mar 6 06:28:34 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306061830.00a94b70@jclark.com>
At 21:19 05/03/97 -0800, Jon Bosak wrote:
>| >and stylesheets that use other
>| >notations (CSS, for example).
>|
>| Use another extension.
>
>Yes, but then how do you determine precedence if both foo.dsl and
>foo.css are found?
That's only a problem if there's both DSSSL and CSS style sheets available
and the client can handle both DSSSL and CSS. In that case, I would leave
it up to the client to choose which it prefers. The content provider isn't
really in a position to make that decision: if the client has a very
complete CSS implementation but only a rather limited DSSSL implementation,
the CSS style sheet may be preferable; but if the client has a complete
DSSSL implementation, the DSSSL style sheet may be preferable.
You're going to have this issue whatever mechanism you use for associating
style sheets.
James
From nmikula at edu.uni-klu.ac.at Thu Mar 6 08:03:20 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:30 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
References: <199703060017.QAA00505@boethius.eng.sun.com>
Message-ID: <331EF2DA.4165@edu.uni-klu.ac.at>
If we talk about the problems of associations, we should also
talk about the problem of what style, out of a list of style-specs.,
a DSSSL engine is supposed to pick out.
DSSSL supports the definition of multiple style-sheets in
one style-document. I do believe that this is a very important
feature and we need to discuss how to employ it to the fullest.
If I remember correctly, Jade picks out the first one it finds,
and, I have to admit I don't have the latest Jade version
installed, doesn't do anything with the others.
My DSSSL engine, YADE, reads all the specs. and then picks
out the first one that it has read and does the first
rendering with it. All the other styles are put in a
list which the user, via a menu, can choose from.
I see the following problem. I am working on a DSSSL document
that has one style for hardcopy and a variety of styles for
online rendering. How are we going to tell the DSSSL engine
what style to start with?
The document/style author does not know what kind of DSSSL engine
is going to work with his document. Thus the stylespec. itself
must provide means to communicate to the DSSSL engine
what kind of DSSSL engine i.e. online, hardcopy etc. a style
is suitable for.
Any ideas ?
What about adding an attribute with a list of categories
of DSSSL engines as possible attribute values.
For instance : output (hardcopy|online|....?) hardcopy
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From nmikula at edu.uni-klu.ac.at Thu Mar 6 08:03:36 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
References: <199703060017.QAA00505@boethius.eng.sun.com>
Message-ID: <331EED8D.7312@edu.uni-klu.ac.at>
Jon Bosak wrote:
> One possible method suggested by James Clark (thank you, James) is to
> adopt the convention used by Jade in the absence of the -d option:
> replace the extension of the document entity's URL or file name with
> .dsl and fetch that. Thus, if a browser fetches
>
> http://docs.sun.com/foo/bar.html
>
> then it should also look for
>
> http://docs.sun.com/foo/bar.dsl
>
> and apply it to bar.html if found.
That is the way I have been doing it with my Cappuccino/Yade/PSC_EDB
system.
My NXP/Yade/PSC_EDB system, at least for now, will also do it like this.
If you have only a few document instances it works fine. If you have
hundreds of them you will probably get into trouble (for the reason
already pointed out by others).
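The extension-replacement convention quoted above is simple enough to sketch in Java. The class and method names here are made up for illustration, and the code assumes the URL has a path component and carries no query string or fragment.

```java
/** Hypothetical helper for the "replace the extension with .dsl" convention. */
final class StylesheetLocator {
    private StylesheetLocator() {}

    /** Replaces the extension of the last path segment with ".dsl". */
    static String dsslUrlFor(String documentUrl) {
        int slash = documentUrl.lastIndexOf('/');
        int dot = documentUrl.lastIndexOf('.');
        // Only treat the dot as an extension separator if it sits in the
        // last path segment; otherwise there is no extension to replace.
        String stem = (dot > slash) ? documentUrl.substring(0, dot) : documentUrl;
        return stem + ".dsl";
    }
}
```

With this, `http://docs.sun.com/foo/bar.html` maps to `http://docs.sun.com/foo/bar.dsl`. The mapping is strictly one stylesheet URL per document URL, which is where the trouble with hundreds of instances comes from.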
> James lists some other possibilities:
>
> | - a processing instruction somewhere in the prolog
I think, at least for XML, we don't want to use PIs too often.
> | - a catalog entry that says unconditionally to use some DSSSL style
> | sheet
Maybe as a fallback if other association mechanisms failed....
> | - a catalog entry that associates a DSSSL style sheet with the public
> | identifier of a DTD
Hmm, I think I would like that.
> | - make the document serve also as a style sheet by making it conform
> | to the DSSSL architecture (this will work with Jade too)
Do I understand correctly that you want to include the style in the
actual document instance ? Would work of course, but I guess only
if we assume that DSSSL is the only style-spec. mechanism.
Of course I would support that ;-) No, to be serious, I think
we should separate style and instance as much as we can.
------
I do also think the idea with the mime header is worth a try.
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From tms at ansa.co.uk Thu Mar 6 11:00:56 1997
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: bosak@atlantic-83.Eng.Sun.COM's message of Wed, 5 Mar 1997 16:17:45 -0800
References: <199703060017.QAA00505@boethius.eng.sun.com>
Message-ID:
A non-text attachment was scrubbed...
Name: not available
Type: text/plain (pgp signed)
Size: 2191 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19970306/0b73020c/attachment.bin
From tms at ansa.co.uk Thu Mar 6 11:20:40 1997
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: Toby Speight's message of 06 Mar 1997 10:59:48 +0000
References: <199703060017.QAA00505@boethius.eng.sun.com>
Message-ID:
A non-text attachment was scrubbed...
Name: not available
Type: text/plain (pgp signed)
Size: 956 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19970306/5622930a/attachment.bin
From jjc at jclark.com Thu Mar 6 12:38:09 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306122748.00ad4c24@jclark.com>
At 08:37 06/03/97 -0800, Norbert H. Mikula wrote:
>DSSSL supports the definition of multiple style-sheets in
>one style-document. I do believe that this is a very important
>feature and we need to discuss how to employ it to the fullest.
>
>If I remember correctly, Jade picks out the first one it finds,
>and, I have to admit I don't have the latest Jade version
>installed, doesn't do anything with the others.
Actually if you use -d style.dsl#hardcopy, it will use the spec called
hardcopy in the style.dsl document. (I forgot to mention this in the docs.)
>My DSSSL engine, YADE, reads all the specs. and then picks
>out the first one that it has read and does the first
>rendering with it. All the other styles are put in a
>list which the user, via a menu, can choose from.
>
>I see the following problem. I am working on a DSSSL document
>that has one style for hardcopy and a variety of styles for
>online rendering. How are we going to tell the DSSSL engine
>what style to start with?
Where the content provider knows which style they want, they can use a
fragment spec in the URL to pick out a particular spec from the document.
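As a rough Java sketch of that selection rule: take the part after `#` as the name of the wanted spec, and fall back to the first spec in the document when no fragment is given. The class and method names are hypothetical, and the list of spec names is assumed to have been collected from the style document already, in document order.

```java
/** Hypothetical selector; not taken from Jade or any other engine. */
final class SpecSelector {
    private SpecSelector() {}

    /**
     * Picks a style specification by name.
     * reference: e.g. "style.dsl#hardcopy" or just "style.dsl".
     * specIds:   names of the specs in the style document, in document order.
     */
    static String choose(String reference, java.util.List<String> specIds) {
        if (specIds.isEmpty()) {
            throw new IllegalArgumentException("no style specifications");
        }
        int hash = reference.indexOf('#');
        if (hash < 0) {
            // No fragment: default to the first spec in the document.
            return specIds.get(0);
        }
        String wanted = reference.substring(hash + 1);
        if (!specIds.contains(wanted)) {
            throw new IllegalArgumentException("no spec named " + wanted);
        }
        return wanted;
    }
}
```

Given specs named `online` and `hardcopy` in that order, `style.dsl#hardcopy` selects `hardcopy` and plain `style.dsl` selects `online`.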
>The document/style author does not know what kind of DSSSL engine
>is going to work with his document. Thus the stylespec. itself
>must provide means to communicate to the DSSSL engine
>what kind of DSSSL engine i.e. online, hardcopy etc. a style
>is suitable for.
>
>Any ideas ?
>
>What about adding an attribute with a list of categories
>of DSSSL engines as possible attribute values.
>
>For instance : output (hardcopy|online|....?) hardcopy
What other categories are there? If a style sheet is for online use, then
it has to use the scroll flow object, which means it ought to list the
online feature in the features element type form. Maybe a DSSSL engine
could use this, or maybe it could look to see which flow object classes the
spec uses.
James
From jjc at jclark.com Thu Mar 6 12:39:05 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970306122841.009c9094@jclark.com>
At 08:15 06/03/97 -0800, Norbert H. Mikula wrote:
>> | - a processing instruction somewhere in the prolog
>
>I think, at least for XML, we don't want to use PIs too often.
In general I agree. But when you specify a style sheet, you surely are
giving an instruction about the processing of the document. If PIs aren't
appropriate for this, what are they good for? Since we have them, why not
use them?
If we allow the PI to occur anywhere in the prolog, then if a user has a
style sheet for some DTD, they can simply add this PI to the DTD, and all
documents conforming to the DTD will automatically use the style sheet.
If you have an SGML system that supports FSIs, you could even have something
like:
PUBLIC "-//...//DTD Docbook//EN" "docbook.dtd"
and associate the DTD with the stylesheet without changing the DTD.
Other advantages:
- it doesn't require a catalog, so you don't have the problem of finding that;
- it works even for dynamic XML (eg generated on the fly in response to a
query);
- it works both locally and over HTTP.
>> | - a catalog entry that associates a DSSSL style sheet with the public
>> | identifier of a DTD
>
>Hmm, I think I would like that.
I think this is handy for simple not very customizable DTDs (eg HTML). But I
don't think it's enough just to key off the public id in the doctype
declaration, because sometimes you need to add declarations *after the DTD*,
which means you have to reference the DTD with an entity declaration rather
than in the doctype declaration, eg
%dtd;
]>
I think you need to have a scheme that considers the public ids of all
external parameter entities referenced in the DTD.
I don't think any single method is adequate by itself. I think we need two
or three methods that complement each other. I would also like to have
something that will work equally for SGML and XML.
James
From housel at ms7.hinet.net Thu Mar 6 12:50:32 1997
From: housel at ms7.hinet.net (Peter S. Housel)
Date: Mon Jun 7 16:57:30 2004
Subject: Simple approaches to XML implementation
Message-ID: <199703061242.UAA29908@ms7.hinet.net>
One more scenario:
* [YACC+] is like [YACC], except that it returns control
before parsing the whole instance. For instance, it
might parse an element and then pass on events for element-start
and all of the attributes, then return control.
NXP uses the [YACC] scenario.
Lark is a flexible version of [YACC+] that allows the
handlers to determine on an event-by-event basis when
the parser should return. [Grove] and [XMLEventStream]
could be built in Lark, as could [YACC], but [Browser]
could not.
-Peter S. Housel- housel@ms7.hinet.net
From gtn at ebt.com Thu Mar 6 13:39:37 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <331E3AAC.1A9F@hiwaay.net> (message from len bullard on Wed, 05 Mar 1997 21:31:56 -0600)
Message-ID: <199703061336.IAA06872@nathaniel.ebt>
>> Yes. Multiple stylesheet could be easier than styles qualified by
>> context in some cases. It really amounts to the same thing though
>> the binding mechanism is different....
>
>Yes and no. The problem with the FOSI was that even though it
>worked, it was hard to specify style on elements in context
>when the contexts were complex. We combine context and
>local stylesheets. So, a parentage can be used, but a local
>stylesheet can introduce a new one, so the complexity is
>localized as well. Conservation of complexity: we have
>more stylesheets to manage per instance.
Ahh. Trading complexity in specification for complexity in management.
From gtn at ebt.com Thu Mar 6 13:42:34 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:30 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970306032432.00a75998@jclark.com> (message from James Clark on Thu, 06 Mar 1997 10:24:32 +0700)
Message-ID: <199703061339.IAA06874@nathaniel.ebt>
>Of course this doesn't completely solve the problem: we now have to figure
>out how to associate a catalog with an SGML document. The latest SGML Open
>draft requires (amongst other things) trying the URL/filename of the
>document with any extension replaced by .soc.
Reverse the problem as the MIME-SGML proposal did: send the catalog
first, and all is well (we have this same discussion on XML-WG, and
I agree that catalogs are insufficient in the general sense, but
in the simple cases we're likely to see initially, they're fine).
From gtn at ebt.com Thu Mar 6 13:44:03 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:31 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970306032432.00a75998@jclark.com> (message from James Clark on Thu, 06 Mar 1997 10:24:32 +0700)
Message-ID: <199703061341.IAA06880@nathaniel.ebt>
>So if we are going to use a catalog entry (and I'm not yet convinced
>this is the best solution) I would suggest having a simple DSSSL entry
>which looks like:
>
>DSSSL spec.dsl
The problem with this is that it is application specific. How do I
tell a browser that DSSSL === DSSSL Stylesheet?
From gtn at ebt.com Thu Mar 6 13:54:54 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:31 2004
Subject: Simple approaches to XML implementation
In-Reply-To: <199703060427.MAA29885@ms7.hinet.net> (housel@ms7.hinet.net)
Message-ID: <199703061352.IAA06888@nathaniel.ebt>
>>class XMLParser {
>>...
>>parser(XMLEventHandler handler);
>>...
>>}
>
>That's one way of doing things. The main problem I see with this interface
>is that there are quite a few possible methods (I count 71 classdefs in
>the SGML property set, though of course not all of those are applicable to
>XML), and it becomes difficult to expand the set of events.
I use about 8 event handlers for most of my API's...
>As much as possible, a good reusable component should not force the
>user's hand when choosing what node to grab onto. As an example,
>YACC is pretty bad about this. You supply it with a lexer (with a
>fixed name) and a set of handlers to be called when productions are
>reduced. The YACC-generated parser insists on being in charge.
Sure. The important thing is that if you want to query into
a document, you have to have parsed at least as far as the nodes you
want to access, and that having a tree representation for such cases
makes it a *lot* easier. For cases where you "want to be in control",
I would have the event handler be a grove constructor, and have the
application work upon the grove. Note that accessing a grove, or
querying a document is *different* to *parsing* a document.
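To make the "event handler as grove constructor" idea concrete, here is a small Java sketch. Everything in it is hypothetical: the three-method event interface is invented for the example, and the tree it builds is a plain element/text tree standing in for a grove, which in reality carries many more node classes and properties.

```java
/** Hypothetical minimal event interface; not taken from any real parser. */
interface GroveEvents {
    void startElement(String name);
    void text(String data);
    void endElement(String name);
}

/** A node in a very small tree, standing in for a grove node. */
final class GroveNode {
    final String name; // element name, or null for a text node
    final String text; // character data, or null for an element node
    final java.util.List<GroveNode> children = new java.util.ArrayList<GroveNode>();

    GroveNode(String name, String text) {
        this.name = name;
        this.text = text;
    }
}

/** Event handler that does nothing but build the tree. */
final class GroveBuilder implements GroveEvents {
    // Stack of currently open elements; the innermost one is on top.
    private final java.util.ArrayDeque<GroveNode> open = new java.util.ArrayDeque<GroveNode>();
    private GroveNode root;

    public void startElement(String name) {
        GroveNode node = new GroveNode(name, null);
        if (open.isEmpty()) {
            root = node; // first element seen becomes the root
        } else {
            open.peek().children.add(node);
        }
        open.push(node);
    }

    public void text(String data) {
        // Text outside any element is ignored in this sketch.
        if (!open.isEmpty()) {
            open.peek().children.add(new GroveNode(null, data));
        }
    }

    public void endElement(String name) {
        open.pop();
    }

    GroveNode root() {
        return root;
    }
}
```

The parser only ever sees `GroveEvents`; the application works on the `GroveNode` tree afterwards, which keeps parsing and querying as separate steps.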
>1. An external entity manager, responsible for obtaining document
> instances (the "start" document and others), DTD's, etc. from
> local storage, the web, some database, etc. This should probably
> be user-customizable.
I'm not sure about this. In some ways, I cannot see the reason for
*exposing* an entity manager, but then again, I cannot imagine an
implementation without one either....
>2. An encoding manager, responsible for mapping one of the possible
> XML document encodings (Latin-n, UTF-7, UTF-8, UCS-2, UTF-16, whatever)
> onto ISO10646 characters.
Streams...
>3. The parser itself, responsible for turning characters into XML events,
> and possibly into grove structures.
Push grove building off to later stages.
>[Browser] gives the most complicated parser, since it has to asynchronously
>handle information from several different documents.
>
>[YACC] is the easiest to write, but it's less flexible. Given [Browser],
>it's easy to write [YACC]. (Given [XMLEventStream] you can also derive
>[YACC], but with greater overhead.)
>
>[XMLEventStream] and [Grove] give you the most flexibility with respect to
>the grove plan.
I think these conflate many different processing layers.
>languages, but the only firm conclusion I've come to is that I really wish
>I could use coroutines.
Amen to that sentiment.
From gtn at ebt.com Thu Mar 6 14:00:48 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:31 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <199703060519.VAA00633@boethius.eng.sun.com> (bosak@atlantic-83.Eng.Sun.COM)
Message-ID: <199703061358.IAA06890@nathaniel.ebt>
| Content-Type: text/xml; stylesheet=foo.dsl
|
| This is only going to work in the context of HTTP.
This is not going to work if you have multiple stylesheets associated
with a document unless you use multipart MIME bodies, and then we're
right back to MIME-SGML. I favor Don's proposal because it had
the right semantics for both HTTP and email (i.e. it could pull or
push).
| The type of the stylesheet could be indicated by its Content-Type, and
| the client could use content-type negotiation to ensure it gets the
| kind of stylesheet it can handle. Somebody should probably register a
| MIME content-type for DSSSL style sheets as well.
Content negotiation only takes you so far, and is HTTP specific *and*
spottily implemented. It also only allows you to negotiate on the
*type* of the resource, and a few other things. It does not help if
you have multiple resources each of which is of equivalent quality
*and* that the user may like to choose between.
From gtn at ebt.com Thu Mar 6 14:09:34 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:31 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
In-Reply-To: <331EF2DA.4165@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at)
Message-ID: <199703061406.JAA06898@nathaniel.ebt>
>I see the following problem. I am working on a DSSSL document
>that has one style for hardcopy and a variety of styles for
>online rendering. How are we going to tell the DSSSL engine
>what style to start with?
Precisely what the SEMANTIC proposal for catalogs was supposed to
take care of. We use this to good effect in DynaText...
From gtn at ebt.com Thu Mar 6 14:10:49 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:31 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <331EED8D.7312@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at)
Message-ID: <199703061408.JAA06900@nathaniel.ebt>
>I do also think the idea with the mime header is worth a try.
MIME is overkill for this, and in the context of HTTP, it
is severely limited. Forget it.
From cbullard at hiwaay.net Thu Mar 6 15:21:34 1997
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 16:57:31 2004
Subject: Associating DSSSL style sheets with documents
References: <199703061336.IAA06872@nathaniel.ebt>
Message-ID: <331EDE4D.166@hiwaay.net>
Gavin Nicol wrote:
>
> >Yes and no. The problem with the FOSI was that even though it
> >worked, it was hard to specify style on elements in context
> >when the contexts were complex. We combine context and
> >local stylesheets. So, a parentage can be used, but a local
> >stylesheet can introduce a new one, so the complexity is
> >localized as well. Conservation of complexity: we have
> >more stylesheets to manage per instance.
>
> Ahh. Trading complexity in specification for complexity in management.
Yes. Optional realities based on what requirement is most compelling
in a given production management scenario. One can write a stylesheet
for a complex DTD and get a complex stylesheet. One can write a
stylesheet for a set of related DTDs and only have to write some
exceptions. One can write multiple stylesheets that are called at
different parts of the document as we do in IDE/AS and IADS
and the styles are flexible. The compromise is keeping and
managing multiple stylesheets per document class if one is
smart enough to use DTDs for systems that don't require them.
We've been through the "well-formed" approach down here.
Good for light stuff, but for classes of documents used
over lifecycles, nyet. But I'm content to let others bump
their heads against that problem until they understand it.
Back to stylesheets, as in any compound class (one stylesheet,
several DTDs) that varies like this, if it is also dynamic
(that is, some part of the DTD is always changing), then
that complex/compound stylesheet can become a real bear. This
is particularly true in systems where one must share the stylesheet
across organizational boundaries. The longer one looks at this,
the more one starts to favor delivery of encapsulated objects
and view the separation of process and data as a weirdly religious
approach favored by those who do not manage large dynamic document
collections for distributed presentation. This is why dual path,
lobster-trap delivery systems are preferred by some organizations.
For example, SGML for archival, PDF for presentation.
So where we specify association of processing specifications to
the document instance, if we make the programmer's job easy, we
may make the information manager's job hard. Guess who buys the system?
Encapsulated objects give them both what they need on the
front end. The price is paid ten years later when one has to
rehost, recover, or repurpose. OTOH, for information that
does not live long, the encapsulated object is a good idea
which is why I thought it strange that HTML is SGML. Since
this is a management production problem, it can be solved by
a production approach (e.g., the lobster trap).
Look for the middle way. Catalogs and menu selections look
promising.
len
From ht at cogsci.ed.ac.uk Thu Mar 6 15:54:32 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:57:31 2004
Subject: BNF
Message-ID: <1440.199703061554@grogan.cogsci.ed.ac.uk>
Has anyone extracted the XML BNF into a usable form?
ht
From tbray at textuality.com Thu Mar 6 16:13:47 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:31 2004
Subject: BNF
Message-ID: <3.0.32.19970306081217.00980960@pop.intergate.bc.ca>
At 03:54 PM 3/6/97 GMT, Henry S. Thompson wrote:
>Has anyone extracted the XML BNF into a usable form?
Should take about 45 seconds, working from the XML source for the
spec... there are a few pointed-out-but-as-yet-unfixed holes though. -T.
From ht at cogsci.ed.ac.uk Thu Mar 6 16:33:03 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:57:31 2004
Subject: BNF
In-Reply-To: Tim Bray's message of Thu, 06 Mar 1997 08:12:26 -0800
References: <3.0.32.19970306081217.00980960@pop.intergate.bc.ca>
Message-ID: <1457.199703061632@grogan.cogsci.ed.ac.uk>
Tim writes:
> >Has anyone extracted the XML BNF into a usable form?
>
> Should take about 45 seconds, working from the XML source for the
> spec... there are a few pointed-out-but-as-yet-unfixed holes though. -T.
Is that an offer? Or should I know where the XML source is?
ht
From dgd at cs.bu.edu Thu Mar 6 16:57:29 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:31 2004
Subject: API thoughts...
In-Reply-To:
Message-ID:
At 2:14 PM -0800 3/5/97, Bill Smith wrote:
>> Well, the Lark event-stream API sure looks & feels like a bunch of
>> callbacks. You make a Lark object, call its readXML() method with one
>> argument being a Handler object; Handler being a data-less class that
>> just has a bunch of methods called things like doPI() and doStartTag() and
>> doEntityReference() and doText() and so on; you'd normally subclass Handler
>> replacing the methods for the events you wanted to see, and pass in
>> that kind of object. Lark calls these upon recognizing
>> the constructs in the input stream, passing the byte offset info, the
>> element & entity stack (*if* you're treebuilding), and other currently
>> relevant info. These methods are all booleans; if any returns true,
>> Lark stops and returns control to whoever called readXML().
I like this boolean approach -- it lets the Handler object take back the
flow of control, pretty easily. If Lark could break PCDATA up when the
buffer stalls, you could easily implement a Browser-style application.
>Another way to do this is to have the Lark object (or interface) define the
>event methods rather than have a separate Handler object. When it's time to
>parse something, create a subclass that overrides the (standard) event methods
>for the Lark object.
I don't like this quite as well for a generic API as I can see the use of
Handler objects that don't know how to parse -- they can be glued to other
event sources to run off of DB engines -- or even broken across a network
to provide an XML event-stream mechanism...
>A possible advantage to this method is that it makes clear the inheritance
>relationship between the "standard" parser and something more specific. It
>is also "easier" to create a more specific parser from an existing parser
>object - simply subclass the existing parser and override the methods
>required to provide the desired new functionality.
If the methods in the standard parser don't do something you are interested
in, you still have to override them all -- and I don't see what default
behavior would make sense other than "do nothing". It seems that you could
get the benefits of having that simply by supplying a predefined Handler
object that has null implementations for its methods.
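A sketch of what that could look like in Java follows. It is not Lark's actual API: only the method names `doStartTag` and `doText` and the "return true to stop" rule come from the description quoted above; the `NullHandler`, `EventReplayer` and `StopAtTitle` names are invented, and the event source just replays a prepared list instead of parsing XML.

```java
/** Hypothetical callbacks; a handler returns true to take back control. */
interface Handler {
    boolean doStartTag(String name);
    boolean doText(String text);
}

/** Predefined handler whose methods do nothing and never ask to stop. */
class NullHandler implements Handler {
    public boolean doStartTag(String name) { return false; }
    public boolean doText(String text) { return false; }
}

/** Stand-in event source: replays prepared events instead of parsing. */
final class EventReplayer {
    private final java.util.List<String[]> events; // each entry: {kind, value}
    private int next = 0;

    EventReplayer(java.util.List<String[]> events) {
        this.events = events;
    }

    /**
     * Delivers events until a handler method returns true or the input
     * is exhausted. Returns true if the handler asked to stop. Calling
     * run() again resumes with the next undelivered event.
     */
    boolean run(Handler handler) {
        while (next < events.size()) {
            String[] event = events.get(next++);
            boolean stop = "start".equals(event[0])
                    ? handler.doStartTag(event[1])
                    : handler.doText(event[1]);
            if (stop) {
                return true;
            }
        }
        return false;
    }
}

/** Overrides only the event it cares about: stops at a "title" start tag. */
final class StopAtTitle extends NullHandler {
    int tagsSeen = 0;

    @Override
    public boolean doStartTag(String name) {
        tagsSeen++;
        return "title".equals(name);
    }
}
```

Because `Handler` knows nothing about parsing, the same `StopAtTitle` object could be driven by a parser, a database query, or a network stream; the replayer here is just the simplest such source.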
>The Lark model "hides" the inheritance relationship in the Handler object
>making it necessary to look inside a Lark object to determine the type of
>a given parser (something you might need to do when debugging). An
>alternative is to create a new parser object that contains a subclassed event
>handler. This makes it possible to distinguish the type of parser at the
>"outer" level but requires two new objects instead of one to perform the
>subclass.
The debugging issue is certainly a bit inconvenient. If we use interfaces
rather than classes for the API (almost certainly a good idea), then we can
certainly create a Parser that implements Handler.
>I'm not a parser expert so the subclass model may not make any sense but this
>is a mechanism I have successfully used building other object-oriented
>(including GUI-based) systems. I have also used callbacks but find them most
>useful when forced to use C or other non-object-based languages.
I think that subclassing here just means that I might be forced to pull in
parser baggage (or null methods) when I want to implement a parser-free
event handler or event generator.
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
From jjc at jclark.com Fri Mar 7 05:10:44 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:31 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970307050036.00a8a6f8@jclark.com>
At 08:41 06/03/97 -0500, Gavin Nicol wrote:
>>So if we are going to use a catalog entry (and I'm not yet convinced
>>this is the best solution) I would suggest having a simple DSSSL entry
>>which looks like:
>>
>>DSSSL spec.dsl
>
>The problem with this is that it is application specific. How do I
>tell a browser that DSSSL === DSSSL Stylesheet?
I don't see your point at all. Why do you need to tell the browser? The
browser knows that the DSSSL keyword in the catalog designates a DSSSL
specification (it could include transformation specs as well as stylesheets)
the same way it knows that the SGMLDECL entry designates an SGML
specification, and the same way it would know that a type of
dsssl-specification in a SEMANTICS entry
SEMANTICS "dsssl-specification" "Stylesheet title" foo.dsl
means DSSSL specification.
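
As a rough illustration of how an application might recognise the two kinds of entry discussed above, here is a small Python sketch. It assumes a much simplified catalog syntax (one entry per line, no comments), and both the DSSSL keyword and the SEMANTICS entry are only proposals from this thread; the function name dsssl_specs is invented for the example.

```python
import shlex

def dsssl_specs(catalog_text):
    """Return (title, system_id) pairs for entries naming a DSSSL spec.

    Assumes one entry per line; catalog comments are not handled.
    """
    specs = []
    for line in catalog_text.splitlines():
        tokens = shlex.split(line)  # honours "quoted strings"
        if not tokens:
            continue
        keyword, args = tokens[0].upper(), tokens[1:]
        if keyword == "DSSSL" and len(args) == 1:
            # Proposed simple form: DSSSL spec.dsl
            specs.append((None, args[0]))
        elif (keyword == "SEMANTICS" and len(args) == 3
              and args[0] == "dsssl-specification"):
            # Proposed general form: SEMANTICS type title system-id
            specs.append((args[1], args[2]))
    return specs
```

Either way the application learns the meaning from the keyword or the type string; nothing else has to be told to the browser.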
James
From nmikula at edu.uni-klu.ac.at Fri Mar 7 08:05:36 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:31 2004
Subject: Which style first ? Re: Associating DSSSL style sheets
with documents
References: <2.2.32.19970306122748.00ad4c24@jclark.com>
Message-ID: <33204BB2.C96@edu.uni-klu.ac.at>
James Clark wrote:
> Where the content provider knows which style they want, they can use a
> fragment spec in the URL to pick out a particular spec from the document.
The scenario I have in mind is putting documents onto a WWW server.
In this case the content provider can not know whether the user (-agent)
downloads for online-rendering, hardcopy or a mixture thereof.
> >What about adding an attribute with a list of categories
> >of DSSSL engines as possible attribute values.
> >
> >For instance : output (hardcopy|online|....?) hardcopy
>
> What other categories are there? If a style sheet is for online use, then
> it has to use the scroll flow object, which means it ought to list the
> online feature in the features element type form.
Yes, that is an interesting approach. However, if we have multiple
style-specs. in one document, for instance hardcopy and online,
would it not be the case that you would find the online feature
in the features element type form regardless of which style-spec. the
user agent is really interested in?
I don't have the DSSSL specs at hand right now. So I hope it makes sense
what I am saying. If not -> (element my_comment (empty-sosofo)) ;-)
> Maybe a DSSSL engine
> could use this, or maybe it could look to see which flow object classes the
> spec uses.
I don't think that this is a practical approach. If a style-spec. is
supposed to be used for online rendering, is it really a must to use
scroll-fo? What about simple-page-seq.? I know that this is not something
somebody would intuitively do, but why not? I could envision a browser that
uses simple-page-seq. For large document instances the browser takes
advantage of the explicit information in the document instances combined
with the style-spec. In other words, a chapter starts a new page. The browser
would have to render only the active page.
I guess it might also foster more reusable style modules.
It remains to be discussed how the document should be divided into
more Internet-suitable sizes, how the user agent would/should deal
with this, and to what extent it has/should have/must have an influence
on the design of stylesheets. I especially have in mind the case
of XML entity treatment.
Makes sense ?
However, I can see no point in scanning the whole document just to deduce
somehow what class of style-spec. it belongs to. It sounds to me like trying
to figure out the semantics of a paragraph without using markup.
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From ksaito at flab.fujitsu.co.jp Fri Mar 7 11:26:37 1997
From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp)
Date: Mon Jun 7 16:57:31 2004
Subject: Which style first ? Re: Associating DSSSL style sheets
with documents
In-Reply-To: <33204BB2.C96@edu.uni-klu.ac.at>
Message-ID: <9703071123.AA01303@sanma.flab.fujitsu.co.jp>
We have an idea about association too. We are now implementing
it in our DSSSL system and testing it.
Please comment on this idea if you can. A description
follows.
-- Part.1 Basic concept
The idea is to declare the style-sheet as an external entity and never
reference it in the SGML document, and to give that entity a notation that
indicates a style-sheet.
ex) In SGML document prolog.
In this example, the style-sheet is described with the DSSSL notation
and is identified by the system identifier "style-sheet.dsl"
(this DSSSL notation identifier is virtual).
The application recognizes a style-sheet by the following steps.
1) It checks the declared entities.
2) It checks the notation of these external entities.
3) If some entities have a notation which means DSSSL style-sheet,
then the application uses these external entities as style-sheets.
The SGML document does not reference the style-sheet external entities.
Therefore, an old application never expands the content of the style-sheet
in the document. I think this way will not violate the SGML standard.
-- Part.2 Associating multiple style-sheets.
To relate more than one style-sheet to a single SGML document,
prepare each style-sheet as a separate external entity,
and declare these entities as in the following example.
ex.1)
For DSSSL, to decide which style-sheet is used, the application refers
to the "desc" attribute of the transformation-specification or
style-specification element.
Otherwise, an entity attribute is useful for selecting the style-sheet.
ex.2)
Since DSSSL can hold multiple style-sheets in a single file, DSSSL
does not need this method. But it is useful for other style-sheet
languages.
-- Part.3 Associating style-sheets in different languages.
To relate more than one style-sheet, described with
different notations, to a single document, prepare each style-sheet
as a separate external entity, and declare them as in the following
example.
ex)
In this example, to decide which style-sheet is used, the application
refers to the entities' notations. If the application supports DSSSL only,
it uses "style-sheet.dsl"; if it supports CSS only, it uses
"style-sheet.css". The application can select a style-sheet by its notation.
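
The selection step described above might look roughly like this in Python. It is only a sketch: it assumes the entity declarations have already been read into records, and the record type ExternalEntity, the function pick_style_sheet, and the entity names in the usage note are invented for illustration. Only the system identifiers and the idea of selecting by notation come from the description above.

```python
from typing import NamedTuple, Optional

class ExternalEntity(NamedTuple):
    """One declared external entity (hypothetical record layout)."""
    name: str
    system_id: str
    notation: Optional[str]

def pick_style_sheet(entities, supported):
    """Return the first declared entity whose notation the application
    supports, trying the supported notations in order of preference."""
    for notation in supported:
        for entity in entities:
            if entity.notation == notation:
                return entity
    return None
```

For example, with entities declared for "style-sheet.dsl" (notation DSSSL) and "style-sheet.css" (notation CSS), a DSSSL-only application would call pick_style_sheet(declared, ["DSSSL"]) and get the first one, while entities with other notations, such as figures, are simply skipped.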
---
KAZUMI Saito
Fujitsu Laboratories Ltd.
Information Service Architecture Lab.
From gtn at ebt.com Fri Mar 7 13:52:03 1997
From: gtn at ebt.com (Gavin Nicol)
Date: Mon Jun 7 16:57:31 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970307050036.00a8a6f8@jclark.com> (message from James Clark on Fri, 07 Mar 1997 12:00:36 +0700)
Message-ID: <199703071349.IAA07589@nathaniel.ebt>
>>>So if we are going to use a catalog entry (and I'm not yet convinced
>>>this is the best solution) I would suggest having a simple DSSSL entry
>>>which looks like:
>>>
>>>DSSSL spec.dsl
>>
>>The problem with this is that it is application specific. How do I
>>tell a browser that DSSSL === DSSSL Stylesheet?
>
>I don't see your point at all. Why do you need to tell the browser? The
>browser knows that the DSSSL keyword in the catalog designates a DSSSL
>specification (it could include transformation specs as well as stylesheets)
>the same way it knows that the SGMLDECL entry designates an SGML
>specification, and the same way it would know that a type of
>dsssl-specification in a SEMANTICS entry
My point is that you are proposing adding an extension that is not
standardised or generalised. Given that we all agree that a DSSSL
keyword should be supported, there isn't much of a problem, though
I would prefer adding some descriptive label to it.
SEMANTICS is a more general solution: you could use it for
CSS or whatever else you wanted/supported.
From ht at cogsci.ed.ac.uk Fri Mar 7 16:41:05 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:57:32 2004
Subject: XML BNF
Message-ID: <1670.199703071640@grogan.cogsci.ed.ac.uk>
This was taken semi-automatically from the XML source for the spec
document treated as SGML.
Tim's version thereof had some bogus PC = codes in it, which I THINK
I've removed. There is at least one real (i.e. in the original) bug
I've noticed so far -- there should be a * after the second-last ) in
the DOCTYPE disjunct of the Markup production.
Use at your own risk.
ht
-------------
S ::= (#x0020 | #x000a | #x000d | #x0009 | #x3000)+
Character ::= #x09 | #x0A | #x0D | [#x20-#xFFFD] | [#x00010000-#x7FFFFFFF] /* any ISO 10646 31-bit code, FFFE and FFFF excluded */
BaseChar ::= [#x41-#x5A] | [#x61-#x7A] /* Latin 1 upper and lowercase */
| #xAA | #xB5 | #xBA | [#xC0-#xD6] | [#xD8-#xF6] /* Latin 1 supplementary */
| [#xF8-#xFF] /* Latin 1 supplementary */
| [#x0100-#x017F] /* Extended Latin-A */
| [#x0180-#x01F5] | [#x01FA-#x0217] /* Extended Latin-B */
| [#x0250-#x02A8] /* IPA Extensions */
| [#x02B0-#x02B8] | [#x02BB-#x02C1] | [#x02E0-#x02E4] /* Spacing Modifiers */
| #x037A | #x0386 | [#x0388-#x038A] | #x038C | [#x038E-#x03A1] | [#x03A3-#x03CE] | [#x03D0-#x03D6] | #x03DA | #x03DC | #x03DE | #x03E0 | [#x03E2-#x03F3] /* Greek and Coptic */
| [#x0401-#x040C] | [#x040E-#x044F] | [#x0451-#x045C] | [#x045E-#x0481] | [#x0490-#x04C4] | [#x04C7-#x04C8] | [#x04CB-#x04CC] | [#x04D0-#x04EB] | [#x04EE-#x04F5] | [#x04F8-#x04F9] /* Cyrillic */
| [#x0531-#x0556] | [#x0559-#x055A] | [#x0561-#x0587] /* Armenian */
| [#x05D0-#x05EA] | [#x05F0-#x05F2] /* Hebrew */
| [#x0621-#x063A] | [#x0641-#x064A] | [#x0671-#x06B7] | [#x06BA-#x06BE] | [#x06C0-#x06CE] | [#x06D0-#x06D3] | [#x06D5-#x06D6] | [#x06E5-#x06E6] /* Arabic */
| [#x0905-#x0939] | #x093D | [#x0958-#x0961] /* Devanagari */
| #x0981 | [#x0985-#x098C] | [#x098F-#x0990] | [#x0993-#x09A8] | [#x09AA-#x09B0] | #x09B2 | [#x09B6-#x09B9] | [#x09DC-#x09DD] | [#x09DF-#x09E1] | [#x09F0-#x09F1] /* Bengali */
| [#x0A05-#x0A0A] | [#x0A0F-#x0A10] | [#x0A13-#x0A28] | [#x0A2A-#x0A30] | [#x0A32-#x0A33] | [#x0A35-#x0A36] | [#x0A38-#x0A39] /* Gurmukhi */
| [#x0A8F-#x0A91] | [#x0A93-#x0AA8] | [#x0AAA-#x0AB0] | [#x0AB2-#x0AB3] | [#x0AB5-#x0AB9] | #x0AE0 /* Gujarati */
| [#x0B05-#x0B0C] | [#x0B0F-#x0B10] | [#x0B13-#x0B28] | [#x0B2A-#x0B30] | [#x0B32-#x0B33] | [#x0B36-#x0B39] | #x0B3D | [#x0B5C-#x0B5D] | [#x0B5F-#x0B61] /* Oriya */
| [#x0B85-#x0B8A] | [#x0B8E-#x0B90] | [#x0B92-#x0B95] | [#x0B99-#x0B9A] | #x0B9C | [#x0B9E-#x0B9F] | [#x0BA3-#x0BA4] | [#x0BA8-#x0BAA] | [#x0BAE-#x0BB5] | [#x0BB7-#x0BB9] /* Tamil */
| [#x0C05-#x0C0C] | [#x0C0E-#x0C10] | [#x0C12-#x0C28] | [#x0C2A-#x0C33] | [#x0C35-#x0C39] | [#x0C60-#x0C61] /* Telugu */
| [#x0C85-#x0C8C] | [#x0C8E-#x0C90] | [#x0C92-#x0CA8] | [#x0CAA-#x0CB3] | [#x0CB5-#x0CB9] | #x0CDE | [#x0CE0-#x0CE1] /* Kannada */
| [#x0D05-#x0D0C] | [#x0D0E-#x0D10] | [#x0D12-#x0D28] | [#x0D2A-#x0D39] | [#x0D60-#x0D61] /* Malayalam */
| [#x0E01-#x0E2E] | #x0E30 | [#x0E32-#x0E33] | [#x0E40-#x0E45] /* Thai */
| [#x0E81-#x0E82] | #x0E84 | [#x0E87-#x0E88] | #x0E8A | #x0E8D | [#x0E94-#x0E97] | [#x0E99-#x0E9F] | [#x0EA1-#x0EA3] | #x0EA5 | #x0EA7 | [#x0EAA-#x0EAB] | [#x0EAD-#x0EAE] | #x0EB0 | [#x0EB2-#x0EB3] | #x0EBD | [#x0EC0-#x0EC4] | [#x0EDC-#x0EDD] /* Lao */
| [#x0F40-#x0F47] | [#x0F49-#x0F69] /* Tibetan */
| [#x10A0-#x10C5] | [#x10D0-#x10F6] /* Georgian */
| [#x1100-#x1159] | [#x115F-#x11A2] | [#x11A8-#x11F9] /* Hangul Jamo */
| [#x1E00-#x1E9B] | [#x1EA0-#x1EF9] /* Add'l Extended Latin */
| [#x1F00-#x1F15] | [#x1F18-#x1F1D] | [#x1F20-#x1F45] | [#x1F48-#x1F4D] | [#x1F50-#x1F57] | #x1F59 | #x1F5B | #x1F5D | [#x1F5F-#x1F7D] | [#x1F80-#x1FB4] | [#x1FB6-#x1FBC] | #x1FBE | [#x1FC2-#x1FC4] | [#x1FC6-#x1FCC] | [#x1FD0-#x1FD3] | [#x1FD6-#x1FDB] | [#x1FE0-#x1FEC] | [#x1FF2-#x1FF4] | [#x1FF6-#x1FFC] /* Greek Extensions */
| #x207F /* Super-, subscripts */
| #x2102 | #x2107 | [#x210A-#x2113] | #x2115 | [#x2118-#x211D] | #x2124 | #x2126 | #x2128 | [#x212A-#x212D] | [#x212F-#x2131] | [#x2133-#x2138] /* Letterlike Symbols */
| [#x2160-#x2182] /* Number forms */
| [#x3041-#x3094] /* Hiragana */
| [#x30A1-#x30FA] /* Katakana */
| [#x3105-#x312C] /* Bopomofo */
| [#x3131-#x318E] /* Hangul Jamo */
| [#xAC00-#xD7A3]
| [#xFB00-#xFB06] | [#xFB13-#xFB17] | [#xFB1F-#xFB28] | [#xFB2A-#xFB36] | [#xFB38-#xFB3C] | #xFB3E | [#xFB40-#xFB41] | [#xFB43-#xFB44] | [#xFB46-#xFB4F] /* Alphabetic presentation forms */
| [#xFB50-#xFBB1] | [#xFBD3-#xFD3D] | [#xFD50-#xFD8F] | [#xFD92-#xFDC7] | [#xFDF0-#xFDF8] | [#xFE70-#xFE72] | #xFE74 | [#xFE76-#xFEFC] /* Arabic presentation forms */
| [#xFF21-#xFF3A] | [#xFF41-#xFF5A] | [#xFF66-#xFF6F] | [#xFE71-#xFF9D] | [#xFFA0-#xFFBE] | [#xFFC2-#xFFC7] | [#xFFCA-#xFFCF] | [#xFFD2-#xFFD7] | [#xFFDA-#xFFDC] /* Half- and fullwidth forms */
Ideographic ::= [#x4E00-#x9FA5] | [#xF900-#xFA2D] | #x3007 | [#x3021-#x3029]
CombiningChar ::= [#x0300-#x0361] | [#x0483-#x0486] | [#x0591-#x05C4] | [#x064B-#x0652] | #x0670 | [#x06D7-#x06DC] | [#x06DD-#x06DF] | [#x06E0-#x06E4] | [#x06E7-#x06E8] | [#x06EA-#x06ED] | [#x0901-#x0903] | [#x093E-#x094C] | #x094D | [#x0951-#x0954] | [#x0962-#x0963] | [#x0981-#x0983] | #x09BC | #x09BE | #x09BF | [#x09C0-#x09C4] | [#x09C7-#x09C8] | [#x09CB-#x09CD] | #x09D7 | [#x09E2-#x09E3] | #x0A02 | #x0A3C | #x0A3E | #x0A3F | [#x0A40-#x0A42] | [#x0A47-#x0A48] | [#x0A4B-#x0A4D] | [#x0A70-#x0A71] | [#x0A81-#x0A83] | #x0ABC | [#x0ABE-#x0AC5] | [#x0AC7-#x0AC9] | #x0ACB | #x0ACC | [#x0B01-#x0B03] | #x0B3C | [#x0B3E-#x0B43] | [#x0B47-#x0B48] | [#x0B4B-#x0B4C] | [#x0B56-#x0B57] | [#x0B82-#x0B83] | [#x0BBE-#x0BC2] | [#x0BC6-#x0BC8] | [#x0BCA-#x0BCC] | #x0BD7 | [#x0C01-#x0C03] | [#x0C3E-#x0C44] | [#x0C46-#x0C48] | [#x0C4A-#x0C4D] | [#x0C55-#x0C56] | [#x0C82-#x0C83] | [#x0CBE-#x0CC4] | [#x0CC6-#x0CC8] | [#x0CCA-#x0CCC] | [#x0CD5-#x0CD6] | [#x0D02-#x0D03] | [#x0D3E-#x0D43] | [#x0D46-#x0D48] | [#x0D4A-#x0D4C] | #x0D57 | #x0E31 | [#x0E34-#x0E3A] | [#x0E47-#x0E4E] | #x0EB1 | [#x0EB4-#x0EB9] | [#x0EBB-#x0EBC] | [#x0EC8-#x0ECD] | [#x0F18-#x0F19] | #x0F35 | #x0F37 | #x0F39 | #x0F3E | #x0F3F | [#x0F71-#x0F84] | [#x0F86-#x0F8B] | [#x0F90-#x0F95] | #x0F97 | [#x0F99-#x0FAD] | [#x0FB1-#x0FB7] | #x0FB9 | [#x20D0-#x20DC] | #x20E1 | [#x302A-#x302F] | #x3099 | #x309A | #xFB1E | [#xFE20-#xFE23]
Letter ::= (BaseChar CombiningChar*) | Ideographic
Digit ::= [#x0030-#x0039] /* ISO 646 digits */
| [#x0660-#x0669] /* Arabic-Indic digits */
| [#x06F0-#x06F9] /* Eastern Arabic-Indic digits */
| [#x0966-#x096F] /* Devanagari digits */
| [#x09E6-#x09EF] /* Bengali digits */
| [#x0A66-#x0A6F] /* Gurmukhi digits */
| [#x0AE6-#x0AEF] /* Gujarati digits */
| [#x0B66-#x0B6F] /* Oriya digits */
| [#x0BE7-#x0BEF] /* Tamil digits (no zero) */
| [#x0C66-#x0C6F] /* Telugu digits */
| [#x0CE6-#x0CEF] /* Kannada digits */
| [#x0D66-#x0D6F] /* Malayalam digits */
| [#x0E50-#x0E59] /* Thai digits */
| [#x0ED0-#x0ED9] /* Lao digits */
| [#x0F20-#x0F29] /* Tibetan digits */
| [#xFF10-#xFF19] /* Fullwidth digits */
Ignorable ::= [#x200C-#x200F] /* zw layout */
| [#x202A-#x202E] /* bidi formatting */
| [#x206A-#x206F] /* alt formatting */
| #xFEFF /* zw nonbreak space */
Extender ::= #x00B7 | #x02D0 | #x02D1 | #x0387 | #x0640 | #x0E46 | #x0EC6 | #x3005 | [#x3031-#x3035] | [#x309B-#x309E] | [#x30FC-#x30FE] | #xFF70 | #xFF9E | #xFF9F
MiscName ::= '.' | Ignorable | Extender
NameChar ::= Letter | Digit | MiscName
Name ::= (Letter | '-') (NameChar)*
Nmtoken ::= (NameChar)+
Nmtokens ::= Nmtoken (S Nmtoken)*
Literal ::= '"' ([^"] | PEReference | CharRef)* '"' | "'" ([^'] | PEReference | CharRef)* "'"
QuotedCData ::= '"' ([^"<] | Reference)* '"' | "'" ([^'<] | Reference)* "'"
Trivial ::= (PCData | Markup)*
Eq ::= S? '=' S?
Markup ::= '<' Name (S Name Eq QuotedCData)* S? '>' /* start-tags */
| '</' Name S? '>' /* end-tags */
| '<' Name (S Name Eq QuotedCData)* S? '/>' /* empty elements */
| '&' Name ';' /* entity references */
| '&#' [0-9]+ ';' /* character references */
| '&u-' Hex4 ';' /* character references */
| '' /* comments */
| '' /* CDATA sections */
| '') | ('"' [^"]* '"') | ("'" [^']* "'") | conditionalSect | [^]]* ) ']')? '>' /* doc type declaration */
| '<?' [^?]* ('?' [^>]+)* '?>' /* processing instructions */
PCData ::= [^<&]*
Comment ::= ''
PI ::= '<?' Name S [^?]* ('?' [^>]+)* '?>'
CDSect ::= CDStart CData CDEnd
CDStart ::= '])) [^]]*)*
CDEnd ::= ']]>'
document ::= Prolog element Misc*
Prolog ::= XMLDecl Misc* doctypedecl? Misc*
XMLDecl ::= ''
VersionInfo ::= S 'version' Eq ('"1.0"' | "'1.0'")
Misc ::= Comment | PI | S
doctypedecl ::= ''
internalsubset ::= elementdecl | AttlistDecl | EntityDecl | NotationDecl | PEReference | conditionalSect | PI | S | Comment
RMDecl ::= 'RMD' Eq ('NONE' | 'INTERNAL' | 'ALL')
STag ::= '<' Name (S Attribute)* S? '>'
Attribute ::= Name Eq QuotedCData
ETag ::= '</' Name S? '>'
EmptyElement ::= '<' Name (S Attribute)* S? '/>';
content ::= (element | PCData | Reference | CDSect | PI | Comment)*
element ::= EmptyElement /* empty elements */
| STag content ETag
elementdecl ::= ''
Mixed ::= '(' S? '#PCDATA' ( S? '|' S? Name )* S? ')*'
| '(' S? '#PCDATA' S? ')'
elements ::= (choice | seq) ('?' | '*' | '+')?
cp ::= (Name | choice | seq) ('?' | '*' | '+')?
cps ::= S? cp S?
choice ::= '(' cps ('|' cps)+ ')'
seq ::= '(' cps (',' cps)* ')'
AttlistDecl ::= ''
AttDef ::= S Name S AttType S Default
AttType ::= StringType | TokenizedType | EnumeratedType
StringType ::= 'CDATA'
TokenizedType ::= 'ID'
EnumeratedType ::= NotationType | Enumeration
NotationType ::= 'NOTATION' S '(' S? Name (S? '|' S? Name)* S? ')'
Enumeration ::= '(' S? Nmtoken (S? '|' S? Nmtoken)* S? ')'
Default ::= '#REQUIRED' | '#IMPLIED' | ('#FIXED'? QuotedCData)
conditionalSect ::= ''
CSKey ::= PEReference | 'INCLUDE' | 'IGNORE'
csdata ::= internalsubset
Hex ::= [0-9a-fA-F]
Hex4 ::= Hex Hex Hex Hex
CharRef ::= '&#' [0-9]+ ';' | '&u-' Hex4 ';'
Reference ::= EntityRef | CharRef
EntityRef ::= '&' Name ';'
PEReference ::= '%' Name ';'
EntityDecl ::= '' /* General entities */
| '' /* Parameter entities */
EntityDef ::= Literal | ExternalDef;
ExternalDef ::= ExternalID NDataDecl?
ExternalID ::= 'SYSTEM' S SystemLiteral
SystemLiteral ::= '"' [^"]* '"' | "'" [^']* "'"
NDataDecl ::= S 'NDATA' S Name
EncodingDecl ::= S 'encoding' Eq QEncoding
EncodingPI ::= ''
QEncoding ::= '"' Encoding '"' | "'" Encoding "'"
Encoding ::= LatinName
LatinName ::= [A-Za-z] ([A-Za-z0-9] | '-' | '.')* /* Name comprising only Latin characters */
NotationDecl ::= ''
From tbray at textuality.com Fri Mar 7 17:23:20 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:32 2004
Subject: XML BNF
Message-ID: <3.0.32.19970307092116.0098bd70@pop.intergate.bc.ca>
At 04:40 PM 3/7/97 GMT, Henry S. Thompson wrote:
>This was taken semi-automatically from the XML source for the spec
>document treated as SGML.
Thanks Henry - should be useful. Don't throw those scripts away, because
I expect this current WG8 process (ANSI is meeting as I type this)
to add to the already significant list of changes built-up for
a new rev.
What Henry called "bogus PC codes" are perfectly legit ISO
nonbreaking spaces I believe, put in by Jon when he was wrestling
with Jade to pretty up the productions. Anyhow, they'll go in the
next rev. -Tim
From bosak at atlantic-83.Eng.Sun.COM Sat Mar 8 04:28:29 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:32 2004
Subject: WWW6 demo change
Message-ID: <199703080427.UAA02628@boethius.eng.sun.com>
In light of the fact that there will be an all-day Developer's Day
session on XML/SGML/DSSSL at the World Wide Web Conference, Tim Bray
and I (who are in effect coordinating this event) have decided to move
the demo session formerly scheduled for Thursday evening, April 10, to
Friday afternoon, April 11. The subject of the demos fits in very
nicely with the rest of the session and will give the people
presenting demos a larger audience.
As before, I encourage anyone who has a Web-related XML or DSSSL tool
to demonstrate to get in touch with me if you haven't done so already.
Jon
----------------------------------------------------------------------
Jon Bosak, Online Information Technology Architect, Sun Microsystems
----------------------------------------------------------------------
2550 Garcia Ave., MPK17-101, Mountain View, California 94043
Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML
If a man look sharply and attentively, he shall see Fortune; for
though she be blind, yet she is not invisible. -- Francis Bacon
----------------------------------------------------------------------
From nmikula at edu.uni-klu.ac.at Sat Mar 8 08:57:15 1997
From: nmikula at edu.uni-klu.ac.at (Norbert Mikula)
Date: Mon Jun 7 16:57:32 2004
Subject: NXP : New Beta (public identifiers, catalog, more parameter entities)
Message-ID:
To all (potential) users of NXP
I have put a new beta release onto my WWW server.
Please have a look at :
http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta
Release Notes :
* Includes Public Identifiers
* Includes catalogs (incl. DELEGATE and CATALOG)
(see http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta/catalog.html)
* Parameter Entities
More places where parameter entities can be used in the
internal subset (please look at entities.xml and entities.dtd
in the same directory). I'm still working on this, but *please*
do some torture testing with it and send me the results. I am
especially interested in the cases that I think should work,
but where one additional whitespace or so makes the parser
fail.
* Name conflicts
Conflicts of SGML keywords and SGML names should be all solved
now. I guess I will rewrite the handling of all this by
introducing more lexical states.
* Attribute defaults should be handled correctly now
Have fun :-)
!!!!! Special thanks to all of you who have been testing and
contributing to NXP so far. !!!!!!
FYI: To re-compile you need the latest JavaCC.
Could someone take the burden and do some testing
on the validation feature ( -v ). Please ....
(see also : http://www.edu.uni-klu.ac.at/~nmikula/NXP/README.html)
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From Peter at ursus.demon.co.uk Mon Mar 10 01:19:13 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:32 2004
Subject: NXP : New Beta (public identifiers, catalog, more parameter entities)
Message-ID: <4452@ursus.demon.co.uk>
In message Norbert Mikula writes:
> To all (potential) users of NXP
>
> I have put a new beta release onto my WWW server.
>
> Please have a look at :
>
> http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta
>
> Release Notes :
>
> * Includes Public Identifiers
>
> * Includes catalogs (incl. DELEGATE and CATALOG)
> (see http://www.edu.uni-klu.ac.at/~nmikula/NXP/beta/catalog.html)
>
> * Parameter Entities
>
> More places where parameter entities can be used in the
> internal subset (please look at entities.xml and entities.dtd
> in the same directory). I'm still working on this, but *please*
> do some torture testing with it and send me the results. I am
Norbert,
I have been trying to get HTML2.0 to 'compile' under NXP. I have
changed the required things:
- -- -- to --* *--
-
%bar;
This appears to be OK under sgmls, but in NXP I think I have to write
(But I confess this syntax confuses me terribly and I may have got this wrong
:-)
I am sorry I can't send you examples, but the basic problem is to get HTML2.0
to work with as little editing as possible.
[...]
> Could someone take the burden and do some testing
> on the validation feature ( -v ). Please ....
I am aiming to do this as soon as the DTDs read in OK :-)
[...]
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Mon Mar 10 01:51:48 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:32 2004
Subject: XML QuotedCData question
Message-ID: <4455@ursus.demon.co.uk>
This article was posted on comp.text.sgml. I have replied there in general
terms, but think there is sufficient ambiguity in the draft that discussion
here may be useful. (I am struggling over parameter substitution at present).
P.
In article <3322BBE5.1E5C@east.sun.com>
eric.baatz@east.sun.com
"Eric Baatz - Sun Microsystems Labs BOS" writes:
> I've been looking over the 14-Nov-96 XML working draft and
>
> 1. I don't understand why XML's QuotedCData seems to allow
> constructs that look like references but aren't. (I am
> assuming that such constructs would make life difficult
> for parsers.) For example:
>
> "&fooref;"
>
> seems to be legal by applying the [^"<] part of the following
> eight times.
>
> QuotedCData := '"' ([^"<] | Reference)* '"' ...
>
> Is the XML draft not stating some restriction, such as "if it
> looks like a reference, it must be a reference"?
>
> 2. Is there a better place (perhaps more specific to XML) for me
> to post XML queries such as #1?
>
> Thank you in advance for any help.
>
>
> Eric Baatz
> Sun Microsystems Laboratories
> 2 Elizabeth Drive, MS UCHL03-207 (508) 442-0257
> Chelmsford, MA 01824 fax: (508) 250-5067
> USA Internet: eric.baatz@east.sun.com
>
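
Eric's first point can be checked mechanically. The Python sketch below transcribes the double-quoted half of the QuotedCData production into a regular expression, next to a stricter variant in which '&' may only begin a Reference. The stricter variant is just one possible reading of "if it looks like a reference, it must be a reference", not something the draft states, and NAME is an ASCII-only stand-in for the Name production.

```python
import re

# ASCII-only stand-in for Name ::= (Letter | '-') (NameChar)*
NAME = r"[A-Za-z-][A-Za-z0-9.-]*"
# Reference ::= EntityRef | CharRef, with both CharRef forms
REFERENCE = rf"&{NAME};|&#[0-9]+;|&u-[0-9a-fA-F]{{4}};"

# The production as quoted: '"' ([^"<] | Reference)* '"'
AS_WRITTEN = re.compile(rf'"(?:[^"<]|{REFERENCE})*"')

# Assumed stricter variant: '&' is excluded from the plain-character set,
# so it can appear only as the start of a well-formed Reference.
STRICT = re.compile(rf'"(?:[^"<&]|{REFERENCE})*"')
```

With these, AS_WRITTEN accepts "&fooref" (no semicolon) purely through the [^"<] branch, which is the looseness Eric describes, whereas STRICT rejects it and still accepts "&fooref;".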
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From jjc at jclark.com Mon Mar 10 03:22:58 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:32 2004
Subject: Which style first ? Re: Associating DSSSL style sheets
with documents
Message-ID: <2.2.32.19970310031233.00a601c0@jclark.com>
At 20:23 07/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote:
>
> We have an idea about association too. Now we are implementing
>it in our DSSSL System and testing.
>
> Please comment on this idea if you can. A description
>follows.
>
>
>-- Part.1 Basic concept
>
> The idea is to declare the style-sheet as an external entity and never
>reference it in the SGML document, and to give that entity a notation that
>indicates a style-sheet.
>
>ex) In SGML document prolog.
>
> "ISO 10179-1996//NOTATION
> Document Style Semantics and Specification Language//EN">
>
> In this example, the style-sheet is described with the DSSSL notation
>and is identified by the system identifier "style-sheet.dsl"
>(this DSSSL notation identifier is virtual).
>
> The application recognizes a style-sheet by the following steps.
> 1) It checks the declared entities.
> 2) It checks the notation of these external entities.
> 3) If some entities have a notation which means DSSSL style-sheet,
> then the application uses these external entities as style-sheets.
I don't think it's safe to assume that an entity is intended to be used
as a style sheet for some document simply because it is declared in the
document with a style sheet notation. Suppose, for example, somebody was
writing a book about DSSSL: they might declare each of their example style
sheets as being entities with the DSSSL notation. Another possibility is
that they might be declaring the entity with DSSSL notation so that they
could specify it as the style sheet to be used for some other document that
they refer to.
James
From lee at sq.com Mon Mar 10 03:30:14 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:32 2004
Subject: XML QuotedCData question
Message-ID: <9703100330.AA04839@sqrex.sq.com>
The question about how to expand entities may arise, I think, because
XML, like SGML, is not layered.
Most programming languages talk explicitly about tokenisation,
or tokenization if you prefer :-), and in doing so explain how
the sequence of tokens that a compiler (say) sees is derived from
an input stream. Usually, comments are stripped at this stage,
and in languages such as C or SGML that have (in effect) macros,
the macros are expanded at input time.
I'd personally like to see a version of the XML spec in which there
was no S production, but rather a list of things that are self-delimiting
(such as <) and don't require whitespace; the explanation about
entities would then be clearer.
SGML entities can't all be expanded at input time, since some
of them are of differing types (e.g. external files) and must be
treated differently. I'm not sure whether this applies to XML
general entities or not, but it probably does -- do we have
NDATA entities?
Maybe when the syntax settles down finally I'll do that.
Lee
From lee at sq.com Mon Mar 10 03:38:39 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:32 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
Message-ID: <9703100338.AA04880@sqrex.sq.com>
James wrote:
> I don't think it's safe to assume that an entity is intended to be used
> as a style sheet for some document simply because it is declared in the
> document with a style sheet notation. Suppose for example, somebody was
> writing a book about DSSSL: they might declare each of their example style
> sheets as being entities with the DSSSL notation. [...]
This highlights a weakness, I think, present in XML; it's also present
in a different way in the WWW.
We don't clearly distinguish between the type of an object (e.g.
its data format, as determined by NOTATION or Mime Media Type) and
our intended use of it.
In HTML the link context -- IMG, A, META, LINK -- determines to
some extent how the "remote" resource is to be used, but if a
browser follows an <A>-style link, the resulting action is
determined almost entirely by the MIME media type that's discovered.
I say almost, since Netscape Navigator has the target=(window/frame name)
mechanism to give some attempt at control.
Presumably (I admit I'm not up to date on the latest link draft!)
in XML the same sort of situation applies.
If so, one way to deal with it is to declare multiple NOTATIONs, one
for each action you want to use. There ought to be a #DEFAULT
notation for representing content negotiation -- what if a particular
link might return a JPEG or GIF or PNG image or even descriptive text,
but you can't in advance tell which?
Sorry, a longish message for a simple point.
Lee
--
Liam Quin, lee@sq.com | lq-text freely available Unix text retrieval
Senior Technical Consultant | FAQs: Metafont fonts, OPEN LOOK UI, OpenWindows
SoftQuad Inc. +1 416 544-9000 | xfonttool (Unix xfontsel in XView)
http://www.softquad.com/ | the barefoot programmer
From ksaito at flab.fujitsu.co.jp Mon Mar 10 05:00:15 1997
From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp)
Date: Mon Jun 7 16:57:32 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970310031233.00a601c0@jclark.com>
Message-ID: <9703100458.AA01313@sanma.flab.fujitsu.co.jp>
James Clark wrote...
>I don't think it's safe to assume that an entity is intended to be used
>as a style sheet for some document simply because it is declared in the
>document with a style sheet notation. Suppose for example, somebody was
>writing a book about DSSSL: they might declare each of their example style
>sheets as being entities with the DSSSL notation.
Thank you for your suggestion.
In your first example, the author of the DSSSL book does not use the example
as a DSSSL style sheet. What he wants to say with the NOTATION is "This is a
DSSSL file". In our idea, I think, the style-sheet entity's notation should mean
"THIS IS A STYLE SHEET, described in DSSSL". This idea needs a STYLE-SHEET
notation.
About this,
lee@sq.com wrote (thank you for your suggestion),
>We don't clearly distinguish between the type of an object (e.g.
>its data format, as determined by NOTATION or Mime Media Type) and
>our intended use of it.
Is using NOTATION to indicate the intended use of an entity not so good in SGML?
I recognize our idea's weak points from your suggestions. What I like about
my idea is that no SGML extensions are needed. All of this idea is an
application matter.
From lee at sq.com Mon Mar 10 05:37:56 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:32 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
Message-ID: <9703100537.AA05622@sqrex.sq.com>
> lee@sq.com wrote (thank you for your suggestion),
> >We don't clearly distinguish between the type of an object (e.g.
> >its data format, as determined by NOTATION or Mime Media Type) and
> >our intended use of it.
>
> Is using NOTATION to indicate the intended use of an entity not so good in SGML?
I did not mean to imply that NOTATION was not a good thing to use
in conjunction with an entity in SGML.
I wanted to point out that it is used for two different purposes.
First, it is used to identify the type of the entity.
Second, it is often used to specify how to process the entity.
This assumes that all objects of the same type are always
processed in the same way. But this is not the case.
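The distinction between what an entity is and what it is to be used for can be sketched in a few lines. This is only an illustration in Python, not anything proposed on the list: the role names ("stylesheet", "example"), the handler functions and `process_entity` are all hypothetical. The point is that the handler is chosen from the pair (notation, role) rather than from the notation alone.

```python
from typing import Callable, Dict, Tuple


def apply_as_stylesheet(system_id: str) -> str:
    # Hypothetical handler: the entity is used to format the document.
    return f"format the document using {system_id}"


def show_as_listing(system_id: str) -> str:
    # Hypothetical handler: the entity is only displayed, e.g. in a book about DSSSL.
    return f"print {system_id} verbatim as an example"


# Key: (notation, role). The notation says what the data is;
# the role says what we intend to do with it.
HANDLERS: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("DSSSL", "stylesheet"): apply_as_stylesheet,
    ("DSSSL", "example"): show_as_listing,
}


def process_entity(notation: str, role: str, system_id: str) -> str:
    handler = HANDLERS.get((notation, role))
    if handler is None:
        raise LookupError(f"no handler for notation {notation!r} used as {role!r}")
    return handler(system_id)
```

With a table keyed on the notation alone, the two DSSSL entries above would collide, which is the problem James' book example illustrates.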
In any case, I would not expect an arbitrary style sheet to be
referred to as an SGML entity unless, as in James' example, you
want to refer to it explicitly in a document.
A style sheet is not normally part of the actual document itself.
If you want to link from a document to a style sheet, you could
reasonably (I think) use processing instructions.
I don't yet know how XML will decide to do this.
From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp)
Message-ID: <9703100716.AA01315@sanma.flab.fujitsu.co.jp>
lee@sq.com wrote...
>I did not mean to imply that NOTATION was not a good thing to use
>in conjunction with an entity in SGML.
I'm sorry, I misunderstood your previous mail.
>I wanted to point out that it is used for two different purposes.
>First, it is used to identify the type of the entity.
>Second, it is often used to specify how to process the entity.
My idea uses NOTATION for the second purpose.
>In any case, I would not expect an arbitrary style sheet to be
>referred to as an SGML entity unless, as in James' example, you
>want to refer to it explicitly in a document.
No, no, I don't want to refer to it in the document. I want to refer to it
from the DTD. Many DSSSL style sheets will depend on the DTD, I think. I want
to put the style-sheet entity declaration in the DTD.
>A style sheet is not normally part of the actual document itself.
About a well-formed XML document which has no DTD: it is not so bad
for such a document to have a style sheet or a pointer to a style sheet,
for portability, I think.
>If you want to link from a document to a style sheet, you could
>reasonably (I think) use processing instructions.
I think a PI is not a good solution. When I write a style sheet as a PI,
I can use only one style-sheet type. And since a PI has no attribute or
notation, the application can't recognize the notation of the PI.
---------------------------------------------
KAZUMI Saito
Fujitsu Laboratories Ltd. ISA Lab.
From jjc at jclark.com Mon Mar 10 07:40:31 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:32 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970310072958.00addf08@jclark.com>
At 16:16 10/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote:
>I think a PI is not a good solution. When I write a style sheet as a PI,
>I can use only one style-sheet type. And since a PI has no attribute or
>notation, the application can't recognize the notation of the PI.
It depends on how the PI is designed. We could have a PI that looked like this:
James
From ksaito at flab.fujitsu.co.jp Mon Mar 10 07:55:54 1997
From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp)
Date: Mon Jun 7 16:57:32 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970310072958.00addf08@jclark.com>
Message-ID: <9703100754.AA01316@sanma.flab.fujitsu.co.jp>
James Clark wrote...
>It depends on how the PI is designed. We could have a PI that looked like this:
>
>
>
If my understanding is correct, "type" in the above example is not an
SGML attribute, and an SGML parser will not recognize it as an attribute.
Is this correct?
---------------------------------------------
KAZUMI Saito
Fujitsu Laboratories Ltd. ISA Lab.
From jjc at jclark.com Mon Mar 10 08:28:10 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:32 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970310081739.00abd628@jclark.com>
At 16:54 10/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote:
>James Clark wrote...
>>It depends on how the PI is designed. We could have a PI that looked like this:
>>
>>
>>
>
>If my understanding is correct, "type" in the above example is not an
>SGML attribute, and an SGML parser will not recognize it as an attribute.
>Is this correct?
Right. Just like "version" in the XML PI that starts every SGML document.
James
From ksaito at flab.fujitsu.co.jp Mon Mar 10 08:58:28 1997
From: ksaito at flab.fujitsu.co.jp (ksaito@flab.fujitsu.co.jp)
Date: Mon Jun 7 16:57:32 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970310081739.00abd628@jclark.com>
Message-ID: <9703100856.AA01317@sanma.flab.fujitsu.co.jp>
James Clark wrote...
>>If my understanding is correct, "type" in the above example is not an
>>SGML attribute, and an SGML parser will not recognize it as an attribute.
>>Is this correct?
>
>Right. Just like "version" in the XML PI that starts every SGML document.
Then we need a PI parser in addition to an SGML parser, don't we?
---------------------------------------------
KAZUMI Saito
Fujitsu Laboratories Ltd. ISA Lab.
From Peter at ursus.demon.co.uk Mon Mar 10 10:19:17 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:32 2004
Subject: XML QuotedCData question
Message-ID: <4473@ursus.demon.co.uk>
It seems that there is enough ambiguity or possible misinterpretation
that this is a problem unless tackled. If WG or ERB members are reading this
then they might wish to take note.
In message <9703100330.AA04839@sqrex.sq.com> lee@sq.com writes:
> The question about how to expand entities may arise, I think, because
> XML, like SGML, is not layered.
>
> Most programming languages talk explicitly about tokenisation,
> or tokenization if you prefer :-), and in doing so explain how
> the sequence of tokens that a compiler (say) sees is derived from
> an input stream. Usually, comments are stripped at this stage,
> and in languages such as C or SGML that have (in effect) macros,
> the macros are expanded at input time.
Agreed. And having come from C I think in those terms.
>
> I'd personally like to see a version of the XML spec in which there
> was no S production, but rather a list of things that are self-delimiting
> (such as <) and don't require whitespace; the explanation about
> entities would then be clearer.
I hadn't realised this (S) was the problem :-)
>
> SGML entities can't all be expanded at input time, since some
> of them are of differing types (e.g. external files) and must be
> treated differently. I'm not sure whether this applies to XML
> general entities or not, but it probably does -- do we have
> NDATA entities?
Entity substitution is very briefly defined in the draft. I don't know
what it's like in 8879 (and I'm not going to find out!).
I see the following problems:
- it is *possible* (though I think unlikely) that not everyone on the
ERB agrees as to what is meant to happen during substitution
- parser implementers may:
* find the spec not well-enough defined
* interpret it in different ways
- DTD implementers (i.e. those using PEs) may:
* find the spec not well-enough defined
* interpret it in 'incorrect' ways
I have found 'programming' in SGML one of the most tedious and
counter-intuitive things I have had to do. The primary problem has been
entities, though RE hasn't helped. I had only two ways of proceeding:
- if it failed with sgmls it was my fault
- Joe English helped a great deal by answering 'simple' questions
by e-mail.
I finally ended up with a complex, hairy, and (to non-SGML folk) totally
non-intuitive set of DTDs and 'include' files. sgmls was the only
way that I could tell whether it was 'right'.
The only way that we can expect people to develop applications for XML
using entities is to:
- be absolutely clear what we are doing
- be as consistent as possible with past practice in SGML and
provide guidance on conversion
- have 100% accurate parsers
- have very clear examples and torture tests
- have tutorials
My starting point would be to take HTML2.0 (or 3.2 or whatever), and make sure
that the spec is capable of 100% accuracy in deciding what should happen.
If not it needs revising.
At present the immediate problem arises for Norbert (since his is the only
validating parser we are working with) and those who are working with it.
However PEs are used for other things than validation - I used them to
'add directory names' to a 'list of files' (i.e. manipulation of the
location of general entities).
Above all, of course, the XML documents must be valid SGML documents and
they must give the same 'result' as when processed by sgmls.
P.
>
> Maybe when the syntax settles down finally I'll do that.
In a sense this is mainly the interpretation of the syntax and therefore
the documentation rather than the productions (have I got that right?)
P.
>
> Lee
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From richard at light.demon.co.uk Mon Mar 10 13:47:37 1997
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:57:33 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
In-Reply-To: <9703100537.AA05622@sqrex.sq.com>
Message-ID:
In message <9703100537.AA05622@sqrex.sq.com>, lee@sq.com writes
>
>I wanted to point out that it is used for two different purposes.
>First, it is used to identify the type of the entity.
>Second, it is often used to specify how to process the entity.
>
>This assumes that all objects of the same type are always
>processed in the same way. But this is not the case.
>
>If you want to link from a document to a style sheet, you could
>reasonably (I think) use processing instructions.
>I don't yet know how XML will decide to do this.
Just a thought. XML is planning to give us a more powerful
[hyper]linking mechanism, with semantics on the links (yes?). Could we
use that mechanism to differentiate the differing roles of external
entities?
Richard Light
SGML and Museum Information Consultancy
richard@light.demon.co.uk
3 Midfields Walk
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067
From jjc at jclark.com Mon Mar 10 14:33:36 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:33 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970310142302.00a767b8@jclark.com>
At 17:56 10/03/97 +0900, ksaito@flab.fujitsu.co.jp wrote:
>James Clark wrote...
>>>If my understanding is correct, "type" in the above example is not an
>>>SGML attribute, and an SGML parser will not recognize it as an attribute.
>>>Is this correct?
>>
>>Right. Just like "version" in the XML PI that starts every SGML document.
>
>Then we need a PI parser in addition to an SGML parser, don't we?
Any use of PIs requires that the application interpret the contents of the
PI. If you make the syntax of the PI the same as XML start-tags, then you
may be able to use your XML parser to parse them. (It would be handy if XML
parsers provided a function that parsed a string like the contents of a
start-tag.)
James
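One way to reuse an XML parser for PI contents that follow start-tag syntax, as suggested above, is to wrap the PI data in a dummy empty element and read its attributes back. The following is a minimal sketch in Python using the standard library's ElementTree; the function name `parse_pi_data`, the dummy element name `pi` and the sample `href`/`type` values are assumptions made for illustration, not part of any draft.

```python
import xml.etree.ElementTree as ET


def parse_pi_data(data: str) -> dict:
    """Parse PI contents written like start-tag attributes, e.g. 'href="a" type="b"'.

    The data is wrapped in a dummy empty element (hypothetical name 'pi') so that
    an ordinary XML parser does the attribute parsing for us.
    Raises xml.etree.ElementTree.ParseError if the data is not well-formed.
    """
    element = ET.fromstring(f"<pi {data}/>")
    return dict(element.attrib)


if __name__ == "__main__":
    # Hypothetical PI contents; the real PI syntax was still under discussion.
    print(parse_pi_data('href="book.dsl" type="text/dsssl"'))
```

The application still has to decide what the names mean; the parser only saves it from writing its own quoting and whitespace rules.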
From jjc at jclark.com Mon Mar 10 14:33:49 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:33 2004
Subject: XML QuotedCData question
Message-ID: <2.2.32.19970310142309.00ae0294@jclark.com>
At 09:24 10/03/97 GMT, Peter Murray-Rust wrote:
>Above all, of course, the XML documents must be valid SGML documents and
>they must give the same 'result' as when processed by sgmls.
You won't in general be able to parse valid XML documents with sgmls. Two
obvious reasons are that it doesn't support Unicode and it doesn't allow you
to change delimiters.
James
From bosak at atlantic-83.Eng.Sun.COM Mon Mar 10 15:22:11 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:33 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
In-Reply-To: <9703100716.AA01315@sanma.flab.fujitsu.co.jp> (ksaito@flab.fujitsu.co.jp)
Message-ID: <199703101520.HAA04606@boethius.eng.sun.com>
[Kazumi Saito:]
| Many DSSSL style sheets will depend on the DTD, I think. I want to
| put the style-sheet entity declaration in the DTD.
I think that it's a mistake to think of stylesheets as having a
one-to-one relationship to DTDs. It is possible to make one
stylesheet work with many DTDs (though this is probably not a good
practice), and it is not only possible but also very useful to make a
number of different stylesheets work with one DTD.
| About a well-formed XML document which has no DTD: it is not so bad
| for such a document to have a style sheet or a pointer to a style sheet,
| for portability, I think.
For XML this will be the typical case. Typical XML documents
transmitted over the Web will not have DTDs and will have to point to
(or include) one or more stylesheets that are intended to be used with
them.
Jon
From bosak at atlantic-83.Eng.Sun.COM Mon Mar 10 15:28:57 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:33 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <2.2.32.19970310072958.00addf08@jclark.com> (message from James Clark on Mon, 10 Mar 1997 14:29:58 +0700)
Message-ID: <199703101525.HAA04608@boethius.eng.sun.com>
[James Clark:]
| >I think a PI is not a good solution. When I write a style sheet as a PI,
| >I can use only one style-sheet type. And since a PI has no attribute or
| >notation, the application can't recognize the notation of the PI.
|
| It depends on how the PI is designed. We could have a PI that looked
| like this:
|
|
|
As the naive content producer, I like this approach. (I assume that
the value of the href can be any URL and that a browser that
understood this syntax would cache the stylesheet just as it would any
other recently retrieved resource.)
Jon
From nmikula at edu.uni-klu.ac.at Mon Mar 10 15:30:31 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:33 2004
Subject: XML QuotedCData question
References: <9703100330.AA04839@sqrex.sq.com>
Message-ID: <33249550.2CA9@edu.uni-klu.ac.at>
lee@sq.com wrote:
> Most programming languages talk explicitly about tokenisation,
> or tokenization if you prefer :-), and in doing so explain how
> the sequence of tokens that a compiler (say) sees is derived from
> an input stream. Usually, comments are stripped at this stage,
> and in languages such as C or SGML that have (in effect) macros,
> the macros are expanded at input time.
I don't think that C and SGML/XML use or rather can use the
same principle of includes/macros.
C uses a pre-processor that resolves includes. Then the actual
compiler gets started without having to worry about includes
anymore. (To my understanding of things..)
For practical reasons, at least for XML processors for online
browsers, I think, we don't want to first do the include and then do
the parsing, keeping all that stuff in memory while we do so.
Furthermore I see problems arise if we have the following scenario :
That is too much to do for a pre-processor, I guess: it can, or
at least should, include the appropriate external
entity only after it has parsed and resolved the content
of %Dos and %Unix.
I am not sure whether I have addressed what you had in mind,
but I do believe that XML is too smart for a pre-processor,
thus we need other ways to look at PE resolving.
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From tallen at sonic.net Mon Mar 10 15:40:43 1997
From: tallen at sonic.net (Terry Allen)
Date: Mon Jun 7 16:57:33 2004
Subject: Style and read-only [was: Which style first?]
Message-ID: <199703101540.HAA19301@bolt.sonic.net>
Jon responds to Kazumi Saito:
| | About a well-formed XML document which has no DTD: it is not so bad
| | for such a document to have a style sheet or a pointer to a style sheet,
| | for portability, I think.
|
| For XML this will be the typical case. Typical XML documents
| transmitted over the Web will not have DTDs and will have to point to
| (or include) one or more stylesheets that are intended to be used with
| them.
My question is perhaps off-topic here on xml-dev, and I know everyone
is busy preparing for WWW6, but I ask you all to reflect on it as
an issue that needs resolution later on: What do I do to associate
a style sheet with a read-only document, e.g., one that resides on
some other server than my own, or that has been digitally signed?
(And assume that this document has a doctype declaration already.)
Regards,
Terry Allen Electronic Publishing Consultant tallen[at]sonic.net
specializing in Web publishing, SGML, and the DocBook DTD
http://www.sonic.net/~tallen/
A Davenport Group Sponsor: http://www.ora.com/davenport/index.html
From cbullard at hiwaay.net Mon Mar 10 16:02:05 1997
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 16:57:33 2004
Subject: Which style first ? Re: Associating DSSSL style sheets with documents
References: <199703101520.HAA04606@boethius.eng.sun.com>
Message-ID: <33242DA5.7B78@hiwaay.net>
Jon Bosak wrote:
>
> [Kazumi Saito:]
>
> | Many DSSSL style sheets will depend on the DTD, I think. I want to
> | put the style-sheet entity declaration in the DTD.
>
> I think that it's a mistake to think of stylesheets as having a
> one-to-one relationship to DTDs. It is possible to make one
> stylesheet work with many DTDs (though this is probably not a good
> practice),
I don't see why this is the case. In practice, where stylesheets are
in use, the opposite has been the case. Many DTDs share a lot
of structures and can in fact, share stylesheets. It is an issue
of the degree of overlap. Where content tagging is practiced,
it is often convenient to take another format-oriented stylesheet
such as one might provide for HTML and use the style information
with different GIs. Where DTDs vary only in degree, the same
stylesheet is used.
One very useful side effect is organizations quickly
realize how many non-useful variations they have in their document
structures and start looking for ways to winnow these out of
their practices. One way to do that cheaply is to parse against a
DTD the organization provides. The DTD becomes a corporate
policy and a repository of corporate memory. This
is useful when attempting large rehosting or conversion projects.
The last thing I want when converting documents is a large
collection of well-formed but inscrutable markup.
> | About a well-formed XML document which has no DTD: it is not so bad
> | for such a document to have a style sheet or a pointer to a style sheet,
> | for portability, I think.
>
> For XML this will be the typical case. Typical XML documents
> transmitted over the Web will not have DTDs and will have to point to
> (or include) one or more stylesheets that are intended to be used with
> them.
I'm not sure the common HTML web experience to date will be the most
predictive model for sound practice with stylesheets or DTDs. The one
example we have, HTML, does have a DTD, for whatever use is made of it.
Our experience with DTDless processing was that people quickly found
it necessary or convenient to create them although they don't transmit
them often as you point out.
As people who did not formerly practice SGML will learn, unvalidated
markup is a nuisance. Having a DTD is the best way to find out
what was intended by the originator of a marked-up instance,
and it is a rigorous expression of policy.
len bullard
From Peter at ursus.demon.co.uk Mon Mar 10 16:57:50 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:33 2004
Subject: XML QuotedCData question
Message-ID: <4497@ursus.demon.co.uk>
Thanks James,
In message <2.2.32.19970310142309.00ae0294@jclark.com> James Clark writes:
> At 09:24 10/03/97 GMT, Peter Murray-Rust wrote:
>
> >Above all, of course, the XML documents must be valid SGML documents and
> >they must give the same 'result' as when processed by sgmls.
>
> You won't in general be able to parse valid XML documents with sgmls. Two
> obvious reasons are that it doesn't support Unicode and it doesn't allow you
> to change delimiters.
Understood. Is this also the case with NSGMLS? (I haven't moved on to these
simply for technical porting reasons - my UNIX machine is too old.) If I
rephrase the statement to '... by SP or NSGMLS', is it true?
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Mon Mar 10 16:57:57 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:33 2004
Subject: Transmission of multiple documents
Message-ID: <4499@ursus.demon.co.uk>
In message <199703101540.HAA19301@bolt.sonic.net> Terry Allen writes:
[...]
>
> My question is perhaps off-topic here on xml-dev, and I know everyone
I think this is on-topic and is a more general problem. Please excuse
me if I'm wrong :-)
> is busy preparing for WWW6, but I ask you all to reflect on it as
> an issue that needs resolution later on: What do I do to associate
> a style sheet with a read-only document, e.g., one that resides on
> some other server than my own, or that has been digitally signed?
> (And assume that this document has a doctype declaration already.)
I interpret this to be a specific case of a more general problem - how
do we associate multiple documents delivered over the WWW? (Among the
documents are DTDs, entities of many sorts, transcluded documents,
style sheets, and methods, e.g. Java classes.) I am no expert here,
but at least one person has suggested JAR files. Is this a generic solution
for this type of problem? (I agree it's probably only one of several
solutions.)
P.
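Since a JAR file is a ZIP archive, the packaging idea can be tried out with any ZIP library. The sketch below, in Python, bundles a document with its DTD and style sheet into one payload and unpacks it again; the function names and file names are hypothetical, and it deliberately says nothing about how the receiver learns which member plays which role.

```python
import io
import zipfile


def bundle(resources: dict) -> bytes:
    """Pack named resources (name -> bytes) into a single ZIP payload."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in resources.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def unbundle(payload: bytes) -> dict:
    """Recover the named resources from a payload produced by bundle()."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


if __name__ == "__main__":
    # Hypothetical file names and contents.
    package = bundle({
        "doc.xml": b"<doc/>",
        "doc.dtd": b"<!ELEMENT doc EMPTY>",
        "doc.dsl": b"; style sheet goes here",
    })
    print(sorted(unbundle(package)))
```

This covers delivering the pieces together; it does not by itself answer Terry's question about attaching a style sheet to a document one cannot modify.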
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From lee at sq.com Mon Mar 10 17:18:21 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:33 2004
Subject: XML QuotedCData question
Message-ID: <9703101717.AA18213@sqrex.sq.com>
Thanks for replying, Norbert. You are taking me a little more
literally than I meant -- you're right that macros in C are
a cleaner design than the SGML botch, and can be implemented in
a separate pass more easily.
However,
>
>
> %DosSpecifics;
> ]>
is very like
#define DOS 1
#ifdef DOS
# include DosSpecifics
#endif
except that CPP allows general expressions there.
It turns out that more robust programming languages avoid macros
altogether (e.g. C++) because there is insufficient compile-time
checking, but that doesn't really affect XML!
When I've looked at this in the past for SGML, it has seemed to me that
one could only do partial expansion with a pre-processor.
But really I was thinking of a conceptually separate pass rather
than a completely separate one -- you'd need to have some feedback and
a shared symbol table. It may also be appropriate to treat parameter
entities and text entities quite differently -- I'm not sure.
Lee
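A "conceptually separate pass" with feedback and a shared symbol table might look roughly like the sketch below (Python). It is a deliberately simplified model and not the behaviour of any XML draft or parser: it handles only internal parameter entities with double-quoted values, ignores conditional sections and external entities, and the names `expand_parameter_entities` and `MAX_EXPANSIONS` are made up for the example. Declarations update the symbol table as they are scanned, and replacement text is pushed back onto the input so that it is rescanned.

```python
import re

# Simplified patterns: internal parameter entities with double-quoted values only.
ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+(\w+)\s+"([^"]*)"\s*>')
PE_REF = re.compile(r'%(\w+);')
MAX_EXPANSIONS = 1000  # crude guard against recursive entities


def expand_parameter_entities(text: str) -> str:
    symbols = {}      # the shared symbol table: entity name -> replacement text
    output = []
    pos = 0
    expansions = 0
    while pos < len(text):
        decl = ENTITY_DECL.match(text, pos)
        if decl:
            # Feedback: a declaration changes what later references mean.
            # The first declaration of a name is the binding one.
            symbols.setdefault(decl.group(1), decl.group(2))
            pos = decl.end()
            continue
        ref = PE_REF.match(text, pos)
        if ref and ref.group(1) in symbols:
            expansions += 1
            if expansions > MAX_EXPANSIONS:
                raise ValueError("too many expansions; recursive entity?")
            # Splice the replacement text into the input so it is rescanned.
            text = text[:pos] + symbols[ref.group(1)] + text[ref.end():]
            continue
        output.append(text[pos])
        pos += 1
    return "".join(output)


if __name__ == "__main__":
    dtd = '<!ENTITY % os "DOS"><!ENTITY % os "Unix"><!ELEMENT %os; EMPTY>'
    print(expand_parameter_entities(dtd))
```

Because expansion and declaration scanning share one loop and one table, this is a single pass in implementation terms, even though the two activities can be described separately.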
From lee at sq.com Mon Mar 10 17:25:17 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:33 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <9703101724.AA18420@sqrex.sq.com>
Jon wrote:
> [James Clark:]
> |
> |
>
> As the naive content producer, I like this approach.
Yes. When I mentioned using a PI in my reply to the person from
Japan (I'm sorry, I don't have your name!), this was exactly the
sort of thing I had in mind.
But then, this is more or less what Panorama does.
Of course, it'd have to be
for XML, no?
> (I assume that the value of the href can be any URL and that a browser
> that understood this syntax would cache the stylesheet just as it
> would any other recently retrieved resource.)
Yes, most browsers cache all remote resources that they fetch through URLs.
I would expect the href to be a relative/partial URL, as per James'
example, so treated as relative to the document containing the PI --
normally either the DTD or the actual body.
For Panorama, I think the PI has to be in the DTD or subset, although
it might be OK if it's before the DOCTYPE line too.
Note that this is exactly the same problem as finding the DTD, and
the same mechanisms ought to apply. Ideally, one would be able to
fetch the first/main style sheet and the DTD at the same time, for
the earliest possible display; since the DTD is optional, clearly
the style sheet code can work without it.
Lee
From nmikula at edu.uni-klu.ac.at Mon Mar 10 19:46:54 1997
From: nmikula at edu.uni-klu.ac.at (Norbert Mikula)
Date: Mon Jun 7 16:57:33 2004
Subject: XML Parser API : complete grove 0.1
Message-ID:
I have, rather quick-hack like, put together a
Java interface for an API (to NXP) that should be able
to communicate a complete grove to an application.
In other words, an application, after parsing, should be able
to store back the document to the form it loaded it.
Please do comment on it! It is just another piece of something
that hopefully one day will be a complete reference API to
(Java based) XML parsers.
Please have a look at :
http://www.edu.uni-klu.ac.at/~nmikula/NXP/NXP.doc/
Please note that other classes that can be accessed from there
have a lot of public methods that later on will be "privatised".
(They have not been cleaned up either....)
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From jjc at jclark.com Tue Mar 11 04:08:07 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:33 2004
Subject: XML QuotedCData question
Message-ID: <2.2.32.19970311035738.00676784@jclark.com>
At 16:22 10/03/97 GMT, Peter Murray-Rust wrote:
>Thanks James,
>
>In message <2.2.32.19970310142309.00ae0294@jclark.com> James Clark writes:
>> At 09:24 10/03/97 GMT, Peter Murray-Rust wrote:
>>
>> >Above all, of course, the XML documents must be valid SGML documents and
>> >they must give the same 'result' as when processed by sgmls.
>>
>> You won't in general be able to parse valid XML documents with sgmls. Two
>> obvious reasons are that it doesn't support Unicode and it doesn't allow you
>> to change delimiters.
>
>Understood. Is this also the same with NSGMLS?
No.
> (I haven't moved on to these
>simply for technical porting reasons - my UNIX machine is too old). ? In which
>case if I rephrase the question to '... by SP or NSGMLS ' is this true?
Yes, with an appropriate SGML declaration. However you won't be able to
parse well-formed but invalid XML documents, and you won't be able to
validate XML documents for constraints that are in XML but not in SGML.
James
From jjc at jclark.com Tue Mar 11 04:08:31 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:33 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970311035745.00aecf3c@jclark.com>
At 12:24 10/03/97 EST, lee@sq.com wrote:
>Jon wrote:
>
>> [James Clark:]
>> |
>> |
>>
>> As the naive content producer, I like this approach.
>Yes. When I mentioned using a PI in my reply to the person from
>Japan (I'm sorry, I don't have your name!), this was exactly the
>sort of thing I had in mind.
>
>But then, this is more or less what Panorama does.
>
>Of course, it'd have to be
>
>for XML, no?
Well, this is something that is applicable to SGML in general not just to
XML. Since
>I would expect the href to be a relative/partial URL, as per James'
>example, so treated as relative to the document containing the PI --
>normally either the DTD or the actual body.
Right. It would be relative to the entity containing the PI.
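As a small illustration of that rule, here is a sketch in Python using the
standard library's urljoin. The function name and the example.org URLs are
made up for the example; the point is only that the base is the URL of
whichever entity holds the PI, so the same href resolves differently
depending on whether the PI sits in the DTD or in the document body.

```python
from urllib.parse import urljoin


def resolve_stylesheet_href(entity_url, href):
    """Resolve a stylesheet href against the entity that contains the PI."""
    return urljoin(entity_url, href)


# The same relative href, found in two different entities (made-up URLs).
in_body = resolve_stylesheet_href(
    "http://example.org/docs/report.xml", "styles/report.dsl")
in_dtd = resolve_stylesheet_href(
    "http://example.org/dtds/report.dtd", "styles/report.dsl")
```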
James
From ht at cogsci.ed.ac.uk Tue Mar 11 09:10:22 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:57:33 2004
Subject: Style and read-only [was: Which style first?]
In-Reply-To: Terry Allen's message of Mon, 10 Mar 1997 07:40:21 -0800
References: <199703101540.HAA19301@bolt.sonic.net>
Message-ID: <2068.199703110910@grogan.cogsci.ed.ac.uk>
> My question is perhaps off-topic here on xml-dev, and I know everyone
> is busy preparing for WWW6, but I ask you all to reflect on it as
> an issue that needs resolution later on: What do I do to associate
> a style sheet with a read-only document, e.g., one that resides on
> some other server than my own, or that has been digitally signed?
> (And assume that this document has a doctype declaration already.)
Create a stub document with the SAME DTD which has a single top-level
element which replaces itself (using XML-LINK) with the document you
care about.
Or if you don't like links, like this
]>
&rod;
and in either case associate the style sheet with your stub in
whatever way we end up agreeing on.
ht
From dgd at cs.bu.edu Tue Mar 11 15:46:45 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:33 2004
Subject: Style and read-only [was: Which style first?]
In-Reply-To: <2068.199703110910@grogan.cogsci.ed.ac.uk>
References: Terry Allen's message of Mon, 10 Mar 1997 07:40:21 -0800
<199703101540.HAA19301@bolt.sonic.net>
Message-ID:
At 9:10 AM +0000 3/11/97, Henry S. Thompson wrote:
And Terry Allen wrote:
>> My question is perhaps off-topic here on xml-dev, and I know everyone
>> is busy preparing for WWW6, but I ask you all to reflect on it as
>> an issue that needs resolution later on: What do I do to associate
>> a style sheet with a read-only document, e.g., one that resides on
>> some other server than my own, or that has been digitally signed?
>> (And assume that this document has a doctype declaration already.)
First, I want to observe that Terry's point is very important... So we
really need to address it. It cuts to the core of why stylesheet
information needs to be loosely bound to a document. While I think that
binding style information into documents at all is a short-sighted
practice, what is more important is the ability to bind _new_ style
information onto the document _later_. If you have that you can always
ignore old, useless, or unwanted styles that are packaged with a document.
>Create a stub document with the SAME DTD which has a single top-level
>element which replaces itself (using XML-LINK) with the document you
>care about.
This should not work, as linking should cause stylesheet replacement -- and
adding stylesheet semantics to links is worse.
>Or if you don't like links, like this
>
>
>
>]>
>
>&rod;
>
This doesn't work when &rod; contains a doctype declaration -- which was exactly
Terry's point.
I think that CATALOG-based proposals may be the best way to accommodate
such needs. Everything proposed for the style PI could fit as easily into
the catalog, and be more general, and less-tightly bound to the document.
>and in either case associate the style sheet with your stub in
>whatever way we end up agreeing on.
The problem is that you may not be able to create such a stub.
Here's the (practical) stylesheet problem that really bothers me:
HTTP 1.0 uses single connections per resource, and even HTTP 1.1 sends
resources serially down the wire, although it can re-use the connection.
This means that it will be hard to do incremental display of XML documents
unless we can get the stylesheet coming down the wire _before_ the document
itself. This seems problematic on several counts.
Since HTTP 1.1 is meant to make multiple connections to the same server
unnecessary, the easy fix is ruled out by good network citizenship.
This is another place where getting a CATALOG could tell you quickly what
resources need to be fetched, and would let you get them in the right
order. I know that we hope that many stylesheets will be cached at the
client, but we can't count on that, especially from what I think I remember
about cache coherence on the Web.
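To show what I mean by the catalog telling you the order, here is a rough
sketch in Python. The catalog layout (a dictionary keyed by document URL,
with "stylesheets" and "dtd" entries) and the function name are invented for
this example and do not correspond to any existing catalog format.

```python
def fetch_order(catalog, document_url):
    """Return the URLs to request, with stylesheets ahead of the document.

    `catalog` is a hypothetical mapping from a document URL to the
    resources it needs; a document with no entry is fetched on its own.
    """
    entry = catalog.get(document_url, {})
    order = list(entry.get("stylesheets", []))
    if "dtd" in entry:
        order.append(entry["dtd"])
    order.append(document_url)
    return order
```

With the order known up front, a client could send the requests serially
down a single HTTP 1.1 connection and still start incremental display as
soon as the document itself begins to arrive.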
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From tallen at sonic.net Tue Mar 11 15:48:14 1997
From: tallen at sonic.net (Terry Allen)
Date: Mon Jun 7 16:57:33 2004
Subject: Style and read-only [was: Which style first?]
Message-ID: <199703111548.HAA13791@bolt.sonic.net>
Henry Thompson writes in response to me:
| > My question is perhaps off-topic here on xml-dev, and I know everyone
| > is busy preparing for WWW6, but I ask you all to reflect on it as
| > an issue that needs resolution later on: What do I do to associate
| > a style sheet with a read-only document, e.g., one that resides on
| > some other server than my own, or that has been digitally signed?
| > (And assume that this document has a doctype declaration already.)
|
| Create a stub document with the SAME DTD which has a single top-level
| element which replaces itself (using XML-LINK) with the document you
| care about.
Then the top-level element has to be a linking element, which is
not true of most DTDs. But creating your own document is necessary,
I think; it may have to be an instance of a DTD that defines the
relations among the things pointed to. The other approach I can
think of is a MIME type constructed for the purpose.
| Or if you don't like links, like this
|
|
|
| ]>
|
| &rod;
|
|
| and in either case associate the style sheet with your stub in
| whatever way we end up agreeing on.
That won't work if the read-only document has a doctype declaration,
unless XML allows multiple doctype declarations (or I'm missing
something).
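A tool that builds such stubs could at least detect the problem case before
trying. The following is only a sketch in Python, with a hypothetical
function name; it is a plain text search, so it would also be triggered by
the string appearing inside a comment or a CDATA section.

```python
import re

# Deliberately naive: any occurrence of "<!DOCTYPE" counts.
DOCTYPE = re.compile(r"<!DOCTYPE\b")


def can_embed_as_entity(document_text):
    """True if the text carries no doctype declaration of its own."""
    return DOCTYPE.search(document_text) is None
```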
Regards,
Terry Allen Electronic Publishing Consultant tallen[at]sonic.net
specializing in Web publishing, SGML, and the DocBook DTD
http://www.sonic.net/~tallen/
A Davenport Group Sponsor: http://www.ora.com/davenport/index.html
From richard at cogsci.ed.ac.uk Tue Mar 11 17:22:47 1997
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 16:57:34 2004
Subject: XML QuotedCData question
Message-ID: <199703111722.RAA22137@deacon.cogsci.ed.ac.uk>
I have another couple of questions about quoted cdata.
(1) How should &foo!bar; be interpreted? According to the BNF it is
completely valid, and not a reference. This seems undesirable from
the point of view of human readability.
(2) Why is left angle bracket excluded from quoted cdata?
(3) Is the answer to (1) and (2) that it really should be ampersand
that is excluded?
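To make question (1) concrete, here is a sketch in Python of the reading I
describe, assuming a reference is an ampersand, a Name, and a semicolon. The
pattern is an ASCII-only approximation of Name written for this example, not
the production from the draft.

```python
import re

# ASCII-only approximation of "& Name ;" for illustration.
REFERENCE = re.compile(r"&([A-Za-z_:][A-Za-z0-9._:-]*);")


def find_entity_references(text):
    """Return the names of everything that parses as an entity reference."""
    return REFERENCE.findall(text)
```

Under this reading the scanner reports nothing for &foo!bar; because "!"
cannot occur in a Name, so the whole string is passed through as data even
though it looks like a reference to a human reader.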
-- Richard
From bosak at atlantic-83.Eng.Sun.COM Fri Mar 14 07:17:54 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:34 2004
Subject: XML parsers hit the big time
Message-ID: <199703140716.XAA07782@boethius.eng.sun.com>
Congratulations, XML implementors! You've just become strategic to a
big industry initiative!
From Microsoft's press release announcing the Channel Definition Format
(http://www.microsoft.com/corpinfo/press/1997/Mar97/Cdfrpr.htm):
CDF will be easy for Web developers to adopt because it is based
on XML, which has support among many third parties. XML has public
domain software written in Java and other languages available now
that can be used to parse CDF files. The CDF specification
submission extends XML and Web Collections work that the W3C has
in progress. These efforts will allow for open, HTML-based Web
broadcasting based on standards-based technologies that are
expected to have strong support among W3C members. Microsoft looks
forward to other leading Web developers joining in support of this
open standards effort.
Jon
From richard at cogsci.ed.ac.uk Fri Mar 14 18:05:00 1997
From: richard at cogsci.ed.ac.uk (richard@cogsci.ed.ac.uk)
Date: Mon Jun 7 16:57:34 2004
Subject: References in default attribute values
Message-ID: <29787.199703141804@pitcairn.cogsci.ed.ac.uk>
If a default for an attribute value contains an entity reference, must
the entity be declared before the attribute list declaration? I cannot
see such a requirement, and I find this surprising since there *is*
such a requirement for (parameter) references in entity declarations.
-- Richard
From elm at arbortext.com Fri Mar 14 18:14:28 1997
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <3.0.16.19970314131001.35af3f68@village.doctools.com>
At 10:57 AM 3/11/97 +0700, James Clark wrote:
>At 12:24 10/03/97 EST, lee@sq.com wrote:
>>Of course, it'd have to be
>>
>>for XML, no?
>
>Well, this is something that is applicable to SGML in general not just to
>XML. Since would rather use simply browser should probably make the keyword user configurable.
This is interesting: Should an XML effort determine a PI that should be
usable in general by SGML documents? I tend to think that the "authority"
that invents/maintains the format of the PI should be identified, and "XML"
sort of fits the bill, similarly to . This way, "
Message-ID: <3329A2AF.2906@hiwaay.net>
Eve L. Maler wrote:
>
> I've also been beating the drum on the WG list about how our PIs should
> have "GIs" as well as "attribute specs," so I'd prefer to see stylesheet att1="val1" att2="val2"... ?>. This way, " so that it will be processed by an XML-aware processor, and the rest
> identifies the semantics of the instruction.
This looks weirdly like DTD/instance built into the XML instance.
So, XML then defines an application inside the instance?
I understand it because this is how IADS and IDE/AS did links
originally. However, it created interoperability problems
and does to this day. What is the difference between this
and a tag bag of empty elements included at the top of a DTD?
len
From jenglish at crl.com Fri Mar 14 19:49:56 1997
From: jenglish at crl.com (Joe English)
Date: Mon Jun 7 16:57:34 2004
Subject: References in default attribute values
In-Reply-To: <29787.199703141804@pitcairn.cogsci.ed.ac.uk>
References: <29787.199703141804@pitcairn.cogsci.ed.ac.uk>
Message-ID: <199703141948.AA01527@mail.crl.com>
richard@cogsci.ed.ac.uk wrote:
> If a default for an attribute value contains an entity reference, must
> the entity be declared before attribute list declaration? I cannot
> see such a requirement, and I find this surprising since there *is*
> such a requirement for (parameter) references in entity declarations.
I'm not positive about the rules in XML, but in SGML
it _is_ necessary for the general entity declaration
to appear first, as near as I can tell. (SGMLS agrees)
By productions [143], [147], [33], and [34], the default
value in an attribute definition is parsed as replaceable
character data, which means that general entity references
are recognized and replaced, and the rule that entities
must be declared before they are referenced applies.
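The declare-before-use rule can be checked mechanically. Below is a rough
sketch in Python, assuming a simplified DTD in which no declaration contains
a ">" inside a literal; the function name and the regular expressions are
mine and are only an approximation of the real syntax.

```python
import re

# Simplified: a declaration runs from "<!ENTITY" or "<!ATTLIST" to the next ">".
DECLARATION = re.compile(r"<!(ENTITY|ATTLIST)\s+(.*?)>", re.DOTALL)
GENERAL_REF = re.compile(r"&([A-Za-z_:][A-Za-z0-9._:-]*);")


def undeclared_default_references(dtd_text):
    """List general entities referenced in an ATTLIST before being declared."""
    declared = set()
    problems = []
    for kind, body in DECLARATION.findall(dtd_text):
        if kind == "ENTITY":
            parts = body.split()
            if parts and parts[0] != "%":   # skip parameter entities
                declared.add(parts[0])
        else:
            for name in GENERAL_REF.findall(body):
                if name not in declared:
                    problems.append(name)
    return problems
```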
[ Another, erm, interesting fact is that parameter entity
references are _not_ replaced in attribute value literals
in ATTLIST declarations. E.g.:
A1 CDATA #FIXED %e1; -- and here --
A2 CDATA #FIXED "%e1;" -- but not here! --
>
I've been fooled by this more than once... ]
--Joe English
jenglish@crl.com
[143] attribute definition (11.3, 421:1) =
( attribute name [144],
+ps [65],
declared value [145],
+ps [65],
default value [147] )
[147] default value (11.3.4, 425:1) =
( ( ?( rni ("#"),
"FIXED",
+ps [65] ),
attribute value specification [33] )
| ( rni ("#"),
( "REQUIRED"
| "CURRENT"
| "CONREF"
| "IMPLIED" ) ) )
[33] attribute value specification (7.9.3, 331:1) =
( attribute value [35]
| attribute value literal [34] )
[34] attribute value literal (7.9.3, 331:4) =
( ( lit ("\""),
replaceable character data [46],
lit ("\"") )
| ( lita ("'"),
replaceable character data [46],
lita ("'") ) )
From elm at arbortext.com Fri Mar 14 20:21:59 1997
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <3.0.16.19970314151234.1c5f0bce@village.doctools.com>
At 01:10 PM 3/14/97 -0600, Len Bullard wrote:
>Eve L. Maler wrote:
>>
>> I've also been beating the drum on the WG list about how our PIs should
>> have "GIs" as well as "attribute specs," so I'd prefer to see > stylesheet att1="val1" att2="val2"... ?>. This way, "> so that it will be processed by an XML-aware processor, and the rest
>> identifies the semantics of the instruction.
>
>This looks weirdly like DTD/instance built into the XML instance.
>So, XML then defines an application inside the instance?
>
>I understand it because this is how IADS and IDE/AS did links
>originally. However, it created interoperability problems
>and does to this day. What is the difference between this
>and a tag bag of empty elements included at the top of a DTD?
>
>len
The difference is that, by convention, you're making PI markup available
that's available to every document and to every *location* in a document if
necessary, no matter what its DTD (and no matter whether it even has one).
It just happens to look suspiciously like a start-tag, which may be helpful
to any software that has to parse the PI string.
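As an illustration of why the start-tag shape helps, here is a sketch in
Python of how little code a processor would need to pull such a PI apart.
The function name, the "stylesheet" target, and the href/type values in the
usage line are assumptions made for the example, not an agreed syntax.

```python
import re

# name="value" or name='value', as in a start-tag.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*("([^"]*)"|'([^']*)')""")


def parse_pi(pi_text):
    """Split '<?target a="1" b="2"?>' into (target, {name: value})."""
    text = pi_text.strip()
    if not (text.startswith("<?") and text.endswith("?>")):
        raise ValueError("not a processing instruction")
    parts = text[2:-2].split(None, 1)
    if not parts:
        raise ValueError("processing instruction has no target")
    target = parts[0]
    data = parts[1] if len(parts) > 1 else ""
    attrs = {}
    for name, quoted, double, single in PSEUDO_ATTR.findall(data):
        attrs[name] = double if quoted.startswith('"') else single
    return target, attrs


# Hypothetical PI, for illustration only.
target, attrs = parse_pi('<?stylesheet href="report.dsl" type="text/dsssl"?>')
```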
I don't think links in general should be done this way, but I do believe in
PIs being used for, uh, instructions to processors. (In other words, I'm
not 100% against PIs, as some people are.) In particular, I'm starting to
get very fond of PIs for anything that has to be specified per entity.
Eve
From bosak at atlantic-83.Eng.Sun.COM Fri Mar 14 20:32:52 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:34 2004
Subject: Revised workshop: "XML: Where do we go from here?"
Message-ID: <199703142031.MAA08159@boethius.eng.sun.com>
The WWW6 workshop formerly titled "Delivery of Structured Documents
over the Web" has been reorganized in light of recent announcements.
Already driven by a rapidly growing set of implementations developed
by individual experimenters, XML reached critical mass with the
announcement this week of industry initiatives using XML as an
enabling technology [1,2,3]. Now that XML seems assured a place in
the pantheon of Internet standards, the question is, where do we go
from here?
This workshop will explore a variety of topics based on the interests
of people actively working with XML. Representative topics include:
APIs for XML parsers
The role of Java in XML
Is the grove concept helpful?
Enabling a new authoring experience
XML and Web objects
XML stylesheets: CSS, DSSSL, or both?
XML/HTML integration
The workshop format will be a series of short presentations, one per
participant, with a period of discussion following each presentation.
The workshop will begin with a review of recent developments and an
orientation to the larger picture that includes XML syntax, XML
linking, scripting languages, and stylesheets.
The purpose of this workshop is to explore future directions and offer
XML experimenters an opportunity to exchange ideas and experiences.
There are still places available in this workshop for qualified
participants. Note that you must register for the workshop separately
from the rest of the conference; for details, see
http://www6conf.slac.stanford.edu/
If you are interested in participating in the XML workshop, please
send a 1-2 paragraph summary of a topic that you would like to present
to
jon.bosak@sun.com
The workshop materials are due immediately, so a response by Monday
morning, March 16, is required for presentations that will be archived
on the conference CD.
[1] http://www.w3.org/pub/WWW/Submission/1997/2/Overview.html
[2] http://www.w3.org/pub/WWW/Submission/1997/3/Overview.html
[3] http://www.microsoft.com/corpinfo/press/1997/Mar97/Cdfrpr.htm
----------------------------------------------------------------------
Jon Bosak, Online Information Technology Architect, Sun Microsystems
----------------------------------------------------------------------
2550 Garcia Ave., MPK17-101, Mountain View, California 94043
Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML
If a man look sharply and attentively, he shall see Fortune; for
though she be blind, yet she is not invisible. -- Francis Bacon
----------------------------------------------------------------------
From cbullard at hiwaay.net Fri Mar 14 21:05:37 1997
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
References: <3.0.16.19970314151234.1c5f0bce@village.doctools.com>
Message-ID: <3329BAE0.3CC1@hiwaay.net>
Eve L. Maler wrote:
>
>
> The difference is that, by convention, you're making PI markup available
> that's available to every document and to every *location* in a document if
> necessary, no matter what its DTD (and no matter whether it even has one).
> It just happens to look suspiciously like a start-tag, which may be helpful
> to any software that has to parse the PI string.
By convention? You mean, by application.
An inclusion on root makes an empty element available
to every location. A PI is something every document has to have.
That isn't an improvement. If you use a DOCTYPE and know the DTD, don't
you get the same effect? XML goes out of its way to load up an
instance just to get around a DTD. I question the utility of that.
We tell them they are being freed of fixed markup, then add a
question mark and say, oh, that's OK, that's XML.
> I don't think links in general should be done this way, but I do believe in
> PIs being used for, uh, instructions to processors.
Ummm... sure. Sort of what links are.
> (In other words, I'm
> not 100% against PIs, as some people are.) In particular, I'm starting to
> get very fond of PIs for anything that has to be specified per entity.
No doubt.
len
From elm at arbortext.com Fri Mar 14 22:39:24 1997
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <3.0.16.19970314173449.1c972e2e@village.doctools.com>
At 02:53 PM 3/14/97 -0600, Len Bullard wrote:
>Eve L. Maler wrote:
>>
>>
>> The difference is that, by convention, you're making PI markup available
>> that's available to every document and to every *location* in a document if
>> necessary, no matter what its DTD (and no matter whether it even has one).
>> It just happens to look suspiciously like a start-tag, which may be helpful
>> to any software that has to parse the PI string.
>
>By convention? You mean, by application.
I'm not sure I catch your distinction. If we agree on a meaning and a
syntax for it, we've made a convention. (Like when everyone asks "How are
you?" and expects a short, positive answer. :-) Applications can now
predictably act on the usage of the convention. (Like when someone starts
to walk away after a moment, safe -- usually! -- in the assumption that the
other person just answered "I'm fine.")
>An inclusion on root makes an empty element available
>to every location. A PI is something every document has to have.
>That isn't an improvement. If you use a DOCTYPE and know the DTD, don't
>you get the same effect? XML goes out of its way to load up an
>instance just to get around a DTD. I question the utility of that.
>We tell them they are being freed of fixed markup, then add a
>question mark and say, oh, that's OK, that's XML.
But XML doesn't have inclusions, and any one document may not even have
DTDs. So your "ifs" sometimes don't come true. I agree that we don't want
to push legitimate DTD functions into PIs, which give you a lot less
validation power. But processing instructions (in the regular English
sense) don't belong in the normal markup scheme most of the time.
>> I don't think links in general should be done this way, but I do believe in
>> PIs being used for, uh, instructions to processors.
>
>Ummm... sure. Sort of what links are.
Well, a reference to a stylesheet is surely a link, but not all links are
references to stylesheets. Also, not all processing instructions are links
to something. Do you think PIs are never appropriate?
>> (In other words, I'm
>> not 100% against PIs, as some people are.) In particular, I'm starting to
>> get very fond of PIs for anything that has to be specified per entity.
>
>No doubt.
>
>len
Eve
From cbullard at hiwaay.net Fri Mar 14 23:26:21 1997
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
References: <3.0.16.19970314173449.1c972e2e@village.doctools.com>
Message-ID: <3329DBE3.787B@hiwaay.net>
Eve L. Maler wrote:
>
> At 02:53 PM 3/14/97 -0600, Len Bullard wrote:
> >Eve L. Maler wrote:
> >>
> >>
> >> The difference is that, by convention, you're making PI markup available
> >> that's available to every document and to every *location* in a document if
> >> necessary, no matter what its DTD (and no matter whether it even has one).
> >> It just happens to look suspiciously like a start-tag, which may be helpful
> >> to any software that has to parse the PI string.
> >
> >By convention? You mean, by application.
>
> I'm not sure I catch your distinction. If we agree on a meaning and a
> syntax for it, we've made a convention.
If we agree on a convention, one of us can break it at any time
without a serious penalty. If we make a contract, either can
enforce it. The PI is a contract. So is the DTD we're
trying to avoid with a hack.
> But XML doesn't have inclusions, and any one document may not even have
> DTDs. So your "ifs" sometimes don't come true. I agree that we don't want
> to push legitimate DTD functions into PIs, which give you a lot less
> validation power. But processing instructions (in the regular English
> sense) don't belong in the normal markup scheme most of the time.
Then why are they in the data? Why were they deprecated? What
is in the SGML Way that is being overlooked here? Why is it
being overlooked? Which is wrong: the SGML Way or the use of PIs?
IOW, what the PIs you suggest do is put metainformation inside
an instance. Why? What is it they will convey that an XML engine
will not already know by reading the specification or could know
by reading a DTD? Is the DTD not there simply because members
of the Working Group don't want them to be but now can't find
a way to get around the functionality they provided?
> Well, a reference to a stylesheet is surely a link, but not all links are
> references to stylesheets. Also, not all processing instructions are links
> to something. Do you think PIs are never appropriate?
I didn't say that. I'm wondering why they are suddenly a preferred
practice when they were formerly a deprecated practice? What is
worse, a DTD I send once and might be very small, or PIs I send
every time?
len
From tbray at textuality.com Fri Mar 14 23:36:28 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <3.0.32.19970314153455.009a2ae0@pop.intergate.bc.ca>
At 05:14 PM 3/14/97 -0600, Len Bullard wrote:
>IOW, what the PIs you suggest do is put metainformation inside
>an instance. Why? What is it they will convey that an XML engine
>will not already know by reading the specification or could know
>by reading a DTD?
Well, a DTD, considered as metadata, is pretty thin. It doesn't
contain any semantic information, nor much in the way of strong
data typing. I can't think of much that is useful for downstream
processing that would naturally live inside a DTD. The problem of
packaging, of tying the things that you *do* need (stylesheets,
topical metadata, typing rules) to documents is a real one and worth
spending time on. But there is no reason to believe that a DTD
is a very important part of such a solution.
Secondly, the distinction between data and metadata is, at a deep
level, bogus; totally in the eye of the beholder. For this reason,
it is always good and never bad to make what the author may
consider metadata available along with what the author
considers data. Because the author is usually wrong.
>[re PI's:] I'm wondering why they are suddenly a preferred
>practice when they were formerly a deprecated practice? What is
>worse, a DTD I send once and might be very small, or PIs I send
>every time?
Reasonable people may disagree. I have no trouble in saying that
I think that PIs are a useful thing, and a necessary part of real-world
document processing. Thus, yes (gasp) I disagree with the language
in the SGML standard deprecating PIs, and I see no reason for us
to consider ourselves bound by it.
As for once vs. many, I think that it is in general A Good Thing
for documents on the web to be self-contained whenever possible.
And while I think it is indeed smart to try to avoid retransmitting
fixed ancillary files (metadata, stylesheets, whatever), I don't
think that this class of files includes DTDs that often for the
downstream processing tasks I've seen. - Tim
From Peter at ursus.demon.co.uk Fri Mar 14 23:47:10 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:34 2004
Subject: XML parsers hit the big time
Message-ID: <4640@ursus.demon.co.uk>
In message <199703140716.XAA07782@boethius.eng.sun.com> bosak@atlantic-83.Eng.Sun.COM (Jon Bosak) writes:
> Congratulations, XML implementors! You've just become strategic to a
> big industry initiative!
[...notice of CDF release...]
I'd like to welcome the involvement of commercial developers and extend a
special welcome to any newcomers to contribute to the list. [We appreciate
that involvement in an XML-project may be sensitive and that you may not
wish to publicise this.] I think it's particularly important that fuzzy
areas of the spec are discussed, because none of us want 'very slightly
different versions of XML'.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Fri Mar 14 23:47:13 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:34 2004
Subject: Indexing of XML documents
Message-ID: <4641@ursus.demon.co.uk>
I hope I can express this problem clearly - I'm sure that you are
familiar with it.
When we need to resolve a TEI pointer like (id a23) we may have to scan
the whole document. In general we will wish to cache (index) IDs since
we don't wish to rescan for another search. One obvious place to do this
is when the document is first read in (admittedly there may never be a need
to scan the whole document).
When validating a document the IDs, GIs and ATTNAMEs all have to be scanned
since they occur in VC's. Presumably as a by-product of validation we can
at least expect a hashtable of IDs (and possibly GIs).
The question is, should we do both of these by default (or even others
that I haven't thought of)? Or should we do none and leave it to the app?
Or should the parser have a switch?
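As a rough illustration of the "switch" option, here is a minimal Python sketch of a loader that can build an ID table during its first pass over the tree, or skip it and leave lookups to a full scan. The names `load`, `find_by_id` and `index_ids` are made up for this example, and it assumes the ID attribute is literally called `id` (without a DTD a processor cannot know which attributes are of type ID).

```python
import xml.etree.ElementTree as ET

def load(text, index_ids=True):
    """Parse text; optionally build an ID -> element table in the same pass."""
    root = ET.fromstring(text)
    ids = {} if index_ids else None
    if index_ids:
        for elem in root.iter():
            value = elem.get("id")  # assumption: the ID attribute is named "id"
            if value is not None:
                # keep the first element seen for each ID
                ids.setdefault(value, elem)
    return root, ids

def find_by_id(root, ids, wanted):
    """Use the table when it exists, otherwise fall back to a full scan."""
    if ids is not None:
        return ids.get(wanted)
    for elem in root.iter():
        if elem.get("id") == wanted:
            return elem
    return None
```

With the switch off, every lookup pays for a scan; with it on, the cost is paid once at load time, which only matters for large documents or repeated searches.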
P.
[BTW a WF document can have multiple identical IDs, OK? Presumably the
behaviour of an app that has to reference them is 'undefined'?]
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From lee at sq.com Sat Mar 15 02:24:16 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:34 2004
Subject: Indexing of XML documents
Message-ID: <9703150224.AA05729@sqrex.sq.com>
> When we need to resolve a TEI pointer like (id a23) we may have to scan
> the whole document.
This all depends on who "we" is taken to be.
A web indexing robot doesn't need to resolve tei pointers at all,
except to identify the remote document -- it then indexes the whole thing.
> In general we will wish to cache (index) IDs since
> we don't wish to rescan for another search.
I don't follow this. Under what circumstances is searching a document for
an ID much more painful than using a cache? Is this for 100 MByte documents?
(which do exist, by the way, droves. No, like elephants, in herds)
> When validating a document the IDs, GIs and ATTNAMEs all have to be scanned
> since they occur in VC's.
Not sure what a VC is (validatable context??) but yes, they all have to
be validated.
> Presumably as a by-product of validation we can
> at least expect a hashtable of IDs (and possibly GIs).
I think that should be application-specific.
You might provide a hash table interface to make it easier, though.
Lee
From cbullard at hiwaay.net Sat Mar 15 04:11:02 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:57:34 2004
Subject: Associating DSSSL style sheets with documents
References: <3.0.32.19970314153455.009a2ae0@pop.intergate.bc.ca>
Message-ID: <332A2143.7AE3@hiwaay.net>
Tim Bray wrote:
>
> Well, a DTD, considered as metadata, is pretty thin. It doesn't
> contain any semantic information, nor much in the way of strong
> data typing. I can't think of much that is useful for downstream
> processing that would naturally live inside a DTD.
You must not do much conversion work. It is awfully handy the
first time one sees the document instance. It sure is the
cheap way (say, non-programming) to figure out what the
intended structure is.
> The problem of
> packaging, of tying the things that you *do* need (stylesheets,
> topical metadata, typing rules) to documents is a real one and worth
> spending time on. But there is no reason to believe that a DTD
> is a very important part of such a solution.
Tying stylesheets, no, you are right. Topical metadata, typing rules,
I'm not so sure. Most of SGML practice to date works something
like that.
> Secondly, the distinction between data and metadata is, at a deep
> level, bogus; totally in the eye of the beholder.
To some extent, that is true.
> For this reason,
> it is always good and never bad to make what the author may
> consider metadata available along with what the author
> considers data. Because the author is usually wrong.
If the source system/author is wrong, this whole XML thing
is suddenly bogus. It is a matter of how one wants to
package the data. I think using #FIXED attributes works
pretty well. There are some awfully good HyTime browsers
out there. Ask Fujitsu about the one they have. The CaPH
folks and Biezunski might disagree as well.
> >[re PI's:] I'm wondering why they are suddenly a preferred
> >practice when they were formerly a deprecated practice? What is
> >worse, a DTD I send once and might be very small, or PIs I send
> >every time?
>
> Reasonable people may disagree.
We are all reasonable people.
That doesn't answer the second question. Why send PIs every
time if what I need to know is in a public specification,
eg, a DTD?
> I have no trouble in saying that
> I think that PIs are a useful thing, and a necessary part of real-world
> document processing. Thus, yes (gasp) I disagree with the language
> in the SGML standard deprecating PIs, and I see no reason for us
> to consider ourselves bound by it.
Oh, that part I agree with. We've used PIs quite a bit
even before they were cool. So does Arbortext. Only
the religious among us don't.
> As for once vs. many, I think that it is in general A Good Thing
> for documents on the web to be self-contained whenever possible.
Sure. And when not possible, it's nice to validate.
> And while I think it is indeed smart to try to avoid retransmitting
> fixed ancillary files (metadata, stylesheets, whatever), I don't
> think that this class of files includes DTDs that often for the
> downstream processing tasks I've seen. - Tim
I like to have a generalized editor that works first time and
every time. I hate having to keep fifteen of them for chores
that overlap. They are easier to write than parsers, even
XML parsers.
Anyway, the DTD is easier to explain than the PIs, and I
get to build them as I need when I need. A set of XML
PIs and a set of ArborText PIs work out to be the same
thing: fixed process flags.
or
are both value-pair lists. I don't see the
difference except that for
I can always build
turn on a free parser and find out what I have
without hiring a CS grad to find out for me.
When the data is ten years old, that has
advantages... waaaaaay downstream
len
From bosak at atlantic-83.Eng.Sun.COM Sat Mar 15 04:49:26 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:35 2004
Subject: XML demos for Developer's Day
Message-ID: <199703150447.UAA08810@boethius.eng.sun.com>
Here (in no particular order) is the list of demos I have lined up for
the implementor's session in the XML track on Developer's Day at the
World Wide Web conference. Some of these are tentative, depending on
whether the project in question is actually running by Developer's Day
(Friday, April 11). Please let me know if I've gotten anything wrong
or left anyone out.
Sun Microsystems       XML Web site
ICL                    XML server
ArborText              XML editor
Inso                   XML converter, XML Web server, XML local browser
RivCom                 XML Netscape plug-in
Univ. of Edinburgh     XML tools, DSSSL syntax checker
Open Molecule Fndtn.   XML processor/renderer
Fujitsu Laboratories   XML/DSSSL browser
Kevin Grimes           XML processor
Tim Bray               XML parser
Norbert Mikula         XML parser, DSSSL engine
----------------------------------------------------------------------
Jon Bosak, Online Information Technology Architect, Sun Microsystems
----------------------------------------------------------------------
2550 Garcia Ave., MPK17-101, Mountain View, California 94043
Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML
If a man look sharply and attentively, he shall see Fortune; for
though she be blind, yet she is not invisible. -- Francis Bacon
----------------------------------------------------------------------
From jjc at jclark.com Sat Mar 15 08:51:39 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970315084050.0105f4f0@jclark.com>
At 13:10 14/03/97 -0500, Eve L. Maler wrote:
>At 10:57 AM 3/11/97 +0700, James Clark wrote:
>>At 12:24 10/03/97 EST, lee@sq.com wrote:
>>>Of course, it'd have to be
>>>
>>>for XML, no?
>>
>>Well, this is something that is applicable to SGML in general not just to
>>XML. Since >would rather use simply >browser should probably make the keyword user configurable.
>
>This is interesting: Should an XML effort determine a PI that should be
>usable in general by SGML documents?
I wasn't proposing that *XML* define such a PI. All I was just suggesting
was that people who have DSSSL engines implement it (preferably making the
name of the PI configurable).
>I tend to think that the "authority"
>that invents/maintains the format of the PI should be identified, and "XML"
>sort of fits the bill, similarly to token in a PI functions as a sort of notation. It would be weird for an
>XML spec to specify
>I've also been beating the drum on the WG list about how our PIs should
>have "GIs" as well as "attribute specs," so I'd prefer to see stylesheet att1="val1" att2="val2"... ?>. This way, "so that it will be processed by an XML-aware processor, and the rest
>identifies the semantics of the instruction.
I disagree. XML requires that all PIs start with a name, and says that this
name is normally the name of a declared notation. So I think PIs should
look like
(Note that the currently-defined XML PI fits this pattern not the one you
suggest.) The authority should come from the public identifier on the
notation declaration for name. Since XML reserves all names beginning with
XML-, I would think that an XML-defined PI should look like:
James
From Peter at ursus.demon.co.uk Sat Mar 15 18:32:18 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:35 2004
Subject: Indexing of XML documents
Message-ID: <4670@ursus.demon.co.uk>
In message <9703150224.AA05729@sqrex.sq.com> lee@sq.com writes:
> > When we need to resolve a TEI pointer like (id a23) we may have to scan
> > the whole document.
>
> This all depends on who "we" is taken to be.
>
> A web indexing robot doesn't need to resolve tei pointers at all,
> except to identify the remote document -- it then indexes the whole thing.
I am guilty of imprecision (sorry :-). I meant an internal indexing of the
document tree, not an index to locate the document.
>
> > In general we will wish to cache (index) IDs since
> > we don't wish to rescan for another search.
> I don't follow this. Under what circumstances is searching a document for
> an ID much more painful than using a cache? Is this for 100 MByte documents?
> (which do exist, by the way, droves. No, like elephants, in herds)
Yes - I was thinking of exactly that. Particularly if the document contains
thousands of elements (e.g. large chunks of HTML-like material).
>
> > When validating a document the IDs, GIs and ATTNAMEs all have to be scanned
> > since they occur in VC's.
> Not sure what a VC is (validatable context??) but yes, they all have to
> be validated.
VC = 'validity constraint' - see XML-draft 1.4; it is abbreviated this way in
later places. The point is that (say) in production 52 all IDs have to be
scanned for uniqueness. Therefore at this stage it could be useful to
hash them so that they could be extracted rapidly if they form part of a
later search, rather than going through the whole doc again.
It's no big deal - but since I found myself doing it for various
searches, it seemed worth thinking about in the API.
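The by-product idea can be sketched in a few lines of Python: the pass that checks ID uniqueness already touches every ID, so it can hand back the table it built along the way. This is only an illustration; `check_unique_ids` and the default attribute name `id` are assumptions of the sketch, not part of any draft API.

```python
import xml.etree.ElementTree as ET

def check_unique_ids(root, id_attr="id"):
    """Return (table, duplicates): table maps each ID to its first element."""
    table = {}
    duplicates = []
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if value in table:
            duplicates.append(value)  # validity error: ID used more than once
        else:
            table[value] = elem
    return table, duplicates
```

A validating caller would report anything in `duplicates` as an error; a caller that only needs well-formedness can ignore that list and still reuse `table` for later searches.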
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From sgmlsh at CAM.ORG Sat Mar 15 18:41:05 1997
From: sgmlsh at CAM.ORG (Sam Hunting)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <332A2143.7AE3@hiwaay.net>
Message-ID:
> Because the author is usually wrong.
All Cretans are liars?
From bosak at atlantic-83.Eng.Sun.COM Sun Mar 16 02:35:49 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:35 2004
Subject: XML demos for Developer's Day
In-Reply-To: <2.2.32.19970316010332.00734140@sover.net> (message from Liora Alschuler on Sat, 15 Mar 1997 20:03:32 -0500)
Message-ID: <199703160234.SAA09440@boethius.eng.sun.com>
[Liora Alschuler:]
| I would like to include this list in my coverage of the xml conf in
| terms of what was shown in San Diego and what will be in Santa
| Clara. Anyone object to the mention? Jon?
I would very much prefer not to see this publicized right now. As I
said, the list is tentative. One of the reasons I posted it was to
see where everyone is in their planning right now. Some of the
experimenters won't know until Thursday night whether they will have
something to show on Friday. I would hate to see us publicize a list
and then have some of the promised demos not occur. I would prefer to
just say that there will be a demo session and let the rest be a
surprise.
Jon
From Ingo.Macherius at tu-clausthal.de Sun Mar 16 03:44:16 1997
From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:57:35 2004
Subject: MBone
Message-ID: <199703160343.EAA06818@kneipfix.rz.tu-clausthal.de>
This is a bit off topic, sorry.
I don't have the opportunity to go to WWW6, but luckily I have MBone
connection and see there are four channels prepared. Unluckily I tested
transmissions from the US and found the Germany-USA link insufficient
to deliver understandable speech. So my question is, whether the sessions
are recorded and available for download. This would enable me to watch.
Second question: What's the broadcast schedule for the XML sessions?
Thanks in advance.
++im
--
Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
Mail : Ingo.Macherius@tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)
From bosak at atlantic-83.Eng.Sun.COM Sun Mar 16 06:58:22 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:35 2004
Subject: MBone
In-Reply-To: <199703160343.EAA06818@kneipfix.rz.tu-clausthal.de> (message from Ingo Macherius on Sun, 16 Mar 1997 04:43:58 +0100 (MET))
Message-ID: <199703160657.WAA16049@boethius.eng.sun.com>
| I don't have the opportunity to go to WWW6, but luckily I have MBone
| connection and see there are four channels prepared. Unluckily I
| tested transmissions from the US and found the Germany-USA link
| insufficient to deliver understandable speech. So my question is,
| whether the sessions are recorded and available for download. This
| would enable me to watch. Second question: What's the broadcast
| schedule for the XML sessions?
You will have to ask the conference organizers about this.
http://www6conf.slac.stanford.edu/
Jon
From Peter at ursus.demon.co.uk Sun Mar 16 10:07:00 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:35 2004
Subject: MBone
Message-ID: <4715@ursus.demon.co.uk>
In message <199703160343.EAA06818@kneipfix.rz.tu-clausthal.de> Ingo Macherius writes:
> This is a bit off topic, sorry.
Not to me, :-)
Don't worry,
I think that it could be useful in the future for xml-dev to
consider virtual working of various sorts. I have been extremely
impressed by the way
that the discussion on the list has gone, but there are clearly
areas where a more rapid feedback than e-mail would be useful.
Perhaps we are a year or so away, but we shall start to see other methods
like MBone being useful for bridging the Atlantic.
> I don't have the opportunity to go to WWW6, but luckily I have MBone
> connection and see there are four channels prepared. Unluckily I tested
> transmissions from the US and found the Germany-USA link insufficient
> to deliver understandable speech. So my question is, whether the sessions
> are recorded and available for download. This would enable me to watch.
> Second question: What's the broadcast schedule for the XML sessions?
> Thanks in advance.
If you get information, Ingo, it would be useful to post it here, along
with any other necessary information for connection. Any feedback from
the meeting would be useful.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From h.rzepa at ic.ac.uk Sun Mar 16 10:15:15 1997
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 16:57:35 2004
Subject: MBone
In-Reply-To: <4715@ursus.demon.co.uk>
Message-ID:
>> I don't have the opportunity to go to WWW6, but luckily I have MBone
>> connection and see there are four channels prepared. Unluckily I tested
>> transmissions from the US and found the Germany-USA link insufficient
>> to deliver understandable speech.
In my experience, MBone has never proved really useful (I have
attended WWW2 and an IETF meeting where it was used, but
it was only a token really).
Henry Rzepa. +44 171 594 5774 (Office) +44 594 5804 (Fax)
From elm at arbortext.com Mon Mar 17 19:52:16 1997
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <3.0.16.19970317144739.1dbfb74c@village.doctools.com>
At 03:40 PM 3/15/97 +0700, James Clark wrote:
>At 13:10 14/03/97 -0500, Eve L. Maler wrote:
...
>>This is interesting: Should an XML effort determine a PI that should be
>>usable in general by SGML documents?
>
>I wasn't proposing that *XML* define such a PI. All I was just suggesting
>was that people who have DSSSL engines implement it (preferably making the
>name of the PI configurable).
Oh, I see.
>>I tend to think that the "authority"
>>that invents/maintains the format of the PI should be identified, and "XML"
>>sort of fits the bill, similarly to >token in a PI functions as a sort of notation. It would be weird for an
>>XML spec to specify >
>>I've also been beating the drum on the WG list about how our PIs should
>>have "GIs" as well as "attribute specs," so I'd prefer to see >stylesheet att1="val1" att2="val2"... ?>. This way, ">so that it will be processed by an XML-aware processor, and the rest
>>identifies the semantics of the instruction.
>
>I disagree. XML requires that all PIs start with a name, and says that this
>name is normally the name of a declared notation. So I think PIs should
>look like
>
>
I'm not sure how your second sentence follows. Why not have XML as the
notation (that is, XML-handling processors should operate on this PI) and
still have a "GI" that indicates the subclass of XML PI? (But see below
also.)
>(Note that the currently-defined XML PI fits this pattern not the one you
>suggest.) The authority should come from the public identifier on the
>notation declaration for name. Since XML reserves all names beginning with
>XML-, I would think that an XML-defined PI should look like:
>
>
This is a good point. In that case, then the XML PI at the top should
start with ", then you can't easily distinguish among the
PIs by type.
Eve
From bosak at atlantic-83.Eng.Sun.COM Tue Mar 18 00:50:58 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:35 2004
Subject: XML hot from the oven
Message-ID: <199703180049.QAA28043@boethius.eng.sun.com>
I am pleased to announce that we are now serving XML as an
experimental alternative data format from our corporate document
server, docs.sun.com. Documents in the SGML repository at
docs.sun.com are autochunked and converted on the fly to XML. This
simply mirrors the server's primary function of converting SGML on the
fly to HTML; the main difference is that the job of converting to XML
is currently almost an identity transformation and is therefore much
easier.
docs.sun.com(sm) is itself experimental and unpublicized, so this is
an experiment running on top of an experiment, but we are proud to
claim the honor of having the world's first publicly visible XML
server. While the XML data stream is extremely raw in this first
implementation, the document repository is not; docs.sun.com currently
provides more than half of the total Solaris 2.5.1 manual set online,
and all of it can now be accessed as an XML data stream.
Kudos to the SunSoft AnswerBook team for making this service available
on top of everything else they are doing to meet our Solaris release
schedules.
HOW TO GET IT
The SGML-based AnswerBook2 (ab2) manuals on docs.sun.com are organized
into several large categories (alluser, sysadmin, etc.) with a number
of books in each category. Thus, the Solaris Advanced User's Guide is
referred to in URLs as /ab2/alluser/ADVOSUG. Two forms of XML access
are currently supported: TOCs and document chunks. TOCs are accessed
via the @xmlToc template, and chunks are accessed via the @xmlChunk
template. The @xmlToc template always shows a table of contents down
to the chapter level, no matter what level it is invoked at.
To see the XML server in action, telnet to docs.sun.com with the
command
    telnet docs.sun.com 80
When connected, you can issue one of several kinds of GET command to
cause an HTTP transfer. For example:
1. To get a chapter-level TOC of the entire contents of the server:
       get /ab2/@xmlToc http/1.0
2. To get a chapter-level TOC of the manuals in the alluser category:
       get /ab2/alluser/@xmlToc http/1.0
3. To get a chapter-level TOC of the Solaris Advanced User's Guide:
       get /ab2/alluser/ADVOSUG/@xmlToc http/1.0
4. To get a particular chapter from the manual (as listed in the TOC):
       get /ab2/alluser/ADVOSUG/@xmlChunk/113 http/1.0
Note that HTTP GET commands must always be terminated with TWO
carriage returns before anything happens. Hint: you will find the
output easier to handle if you do all this from within an emacs shell
session.
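For anyone who would rather script the telnet session, the exchange can be sketched in Python with a plain socket. This is a minimal sketch under assumptions: `build_request` and `fetch` are names invented here, the request line is written in the upper-case form that HTTP specifies rather than as typed above, and nothing guarantees that the server described in this message still answers.

```python
import socket

def build_request(path):
    """Return an HTTP/1.0 GET request, ending with the blank line that terminates it."""
    return ("GET %s HTTP/1.0\r\n\r\n" % path).encode("ascii")

def fetch(host, path, port=80, timeout=10.0):
    """Send the request and return the raw response bytes (headers and body)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # an HTTP/1.0 server typically closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

The "two carriage returns" correspond to the two CRLF pairs at the end of the request: one ends the request line, the other is the empty line that tells the server the request is complete. A call such as `fetch("docs.sun.com", "/ab2/@xmlToc")` would mirror example 1.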
Beyond its primary goal of giving us bragging rights, this service is
intended to provide a large-scale test bed for XML experimenters. At
the moment, all we can do is the simple identity transform from the
DocBook-tagged source, but in a few days we will have permissions set
up to go in and provide multiple alternative treatments in order to
explore different kinds of delivery strategies (for example, the
generation of SGML Open fragment wrappers vs. full server-side entity
resolution, or embedded CSS style attributes vs. associated dsssl-o
style sheets). We hope that this service will help to further the
evolution of XML by giving all you developers a rich set of
alternatives to play with.
Have fun!
Jon
----------------------------------------------------------------------
Jon Bosak, Online Information Technology Architect, Sun Microsystems
----------------------------------------------------------------------
2550 Garcia Ave., MPK17-101, Mountain View, California 94043
Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML
Here's a little game you can all join in with
It's very simple and I hope it's new
Make your own tags up if you want to
Any old tags that you think will do
----------------------------------------------------------------------
From lee at sq.com Tue Mar 18 05:16:25 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <9703180516.AA16683@sqrex.sq.com>
Jon Bosak wrote:
> One possible method suggested by James Clark (thank you, James) is to
> adopt the convention used by Jade in the absence of the -d option:
> replace the extension of the document entity's URL or file name with
> .dsl and fetch that. Thus, if a browser fetches
> > http://docs.sun.com/foo/bar.html
> > then it should also look for
> > http://docs.sun.com/foo/bar.dsl
> > and apply it to bar.html if found.
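The extension-swapping convention quoted above is easy to state in code. The following is a minimal Python sketch, not Jade's actual implementation; `stylesheet_url` is a name chosen for this example.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def stylesheet_url(doc_url, ext=".dsl"):
    """Swap the extension of the URL's last path segment for `ext`."""
    parts = urlsplit(doc_url)
    base, _old = posixpath.splitext(parts.path)
    return urlunsplit(parts._replace(path=base + ext))
```

For `http://docs.sun.com/foo/bar.html` this yields `http://docs.sun.com/foo/bar.dsl`. A path with no extension simply gets `.dsl` appended, whether or not any such resource exists on the server.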
Note that if you are generating the XML from a CGI script, a Java
server plugin (e.g. Solaris 2.6's upcoming server) or otherwise,
you probably need to make sure that clients don't try to look for
files in the same "directory" as your SGML.
E.g. http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dsl
does a database query into presumably DynaBase (right, Jon?).
In this case, you want a processing instruction (or some other markup)
to say that
* there is no catalog file
http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/catalog
* the dtd is not accessible at
http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dtd
(and _this_ is where it is...)
* the style sheet isn't there either
(and _this_ is where it is...)
We had to do this for SoftQuad Panorama for exactly this reason.
For example, John Price-Wilkin served up the Middle English Corpus
in SGML using PAT, but couldn't easily cope with Panorama looking
for CATALOG or catalog in the middle of a database query.
In general, if you find yourself doing probes to see if files exist
using http, you've probably made a design error somewhere, as this
isn't a good use of http.
So allow the processing instructions.
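To make the processing-instruction route concrete, here is a small Python sketch that pulls stylesheet PIs out of a document prolog and reads their attribute-style contents. The PI syntax discussed in this thread did not survive in the archive, so the sketch borrows the `xml-stylesheet` target that the W3C standardized later; the function name `stylesheet_pis`, the `href`/`type` pseudo-attributes as used here and the `type` value in the usage note are illustrative assumptions. The target name is a parameter, in the spirit of keeping the PI name configurable.

```python
import re
from xml.dom import minidom

# name="value" or name='value' pairs inside the PI data
PSEUDO_ATTR = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|\'([^\']*)\')')

def stylesheet_pis(text, target="xml-stylesheet"):
    """Return one dict of pseudo-attributes per matching PI in the prolog."""
    doc = minidom.parseString(text)
    found = []
    for node in doc.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            attrs = {}
            for m in PSEUDO_ATTR.finditer(node.data):
                name, dq, sq = m.groups()
                attrs[name] = dq if dq is not None else sq
            found.append(attrs)
    return found
```

Given a document that begins with `<?xml-stylesheet href="bar.dsl" type="text/x-dsssl"?>`, the function returns a single dict holding the `href` and `type` values, so the client fetches exactly the resource the server named instead of probing for it.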
Lee
From bosak at atlantic-83.Eng.Sun.COM Tue Mar 18 16:35:09 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <9703180516.AA16683@sqrex.sq.com> (lee@sq.com)
Message-ID: <199703181629.IAA28823@boethius.eng.sun.com>
[Liam Quin:]
| Note that if you are generating the XML from a CGI script, a Java
| server plugin (e.g. Solaris 2.6's upcoming server) or otherwise,
| you probably need to make sure that clients don't try to look for
| files in the same "directory" as your SGML.
|
| E.g. http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dsl
| does a database query into presumably DynaBase (right, Jon?).
No, DynaWeb. But your point is well taken.
| So allow the processing instructions.
When we start downloading a DSSSL stylesheet from the server, I think
that this is probably the method we'll try first. Of all the
alternatives, I like James Clark's last suggestion best for initial
experimentation:
Jon
From lee at sq.com Tue Mar 18 18:48:30 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <9703181848.AA07514@sqrex.sq.com>
> >E.g. http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dsl
> >does a database query into presumably DynaBase (right, Jon?).
>
> NO! This is DynaWeb!
Sorry for the error -- I meant DynaWeb. Honest.
> >In this case, you want a processing instruction (or some other markup)
> >to say that
> >* there is no catalog file
> > http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/catalog
>
> You *could* generate a catalog, which would point at the DTD and the
> stylesheets.
If DynaWeb had been less powerful, or you (or Jon in this case!) less
familiar with it, that may not have been an option -- with some other
SGML databases I've seen, it'd be quite hard. One way would be to have
a shell script front end that special-cases all files called "catalog"
and returns a hard-wired catalog file... but even that isn't always
easy in this world of automatically-generated CGI programs with special
hooks into the servers, so you can't simply unhook them a little.
So believe me (please!), there will be people, perhaps not using DynaWeb,
who can't or won't put a catalog file in there.
> >* the dtd is not accessible at
> > http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dtd
> > (and _this_ is where it is...)
> ...
> >* the style sheet isn't there either
> > (and _this_ is where it is...)
>
> [...] It would be quite possible to resolve all of the things you
> outline above inside the configuration files (easy even).
Now do it with Astoria, Documentum, Saros DM, Texel, etc., including
handling a server login to fetch the catalog file, a server login
to fetch the style sheet, a server login to fetch the DTD, and a bunch
of impatient users. Yes, you could say the web front ends could cache
recent login connections so they didn't log in again each time, but
generally they don't seem to do that.
Then deal with systems that can't deal with the DTD inside the database.
(if DTDs were in SGML format... but that's another issue)
> >In general, if you find yourself doing probes to see if files exist
> >using http, you've probably made a design error somewhere, as this
> >isn't a good use of http.
>
> Agreed!
Heh!
> >So allow the processing instructions.
>
> Or use catalogs.
Well, I'm not saying forbid catalogs, although I can't abide the thought of
mandating all that code for every XML-compliant application. I'm suggesting
providing an alternative.
Our experience with connecting Panorama with a wide range of databases has
been that we needed to do this. Maybe if all the databases had been
built by Gavin :-) we'd have been able to stick with Catalogs, and we'd
always have known where to look for catalog even with URLs like
http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=113&f=7
where d=113 is the document chunk ID, get-doc is the program, 40197 is a
PATH_INFO parameter used for versioning, and the URL for CATALOG is
http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=491&f=7
and no, I'm not making this up (except I've changed the field names from
those used in any one particular currently shipping commercial system).
Panorama's default algorithm would look for
http://www.xxx.zzz/bin/get-doc/catalog
which obviously won't work in this case.
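A minimal sketch of why the "same directory" guess fails, assuming the lookup amounts to ordinary relative-URL resolution. Python's urljoin stands in for Panorama's default algorithm here, which is an assumption; the helper name sibling_catalog_url is hypothetical, and the two URLs are the ones quoted in this thread.

```python
from urllib.parse import urljoin


def sibling_catalog_url(document_url: str) -> str:
    """Guess the catalog location by swapping the last path segment for 'catalog'."""
    return urljoin(document_url, "catalog")


# URL whose path looks like a directory holding a file.
static_doc = "http://docs.su.com/ab2/alluser/ADVOSUG/@xmlChunk/113.dsl"
# CGI-style URL: everything after get-doc/ is really a parameter list.
cgi_doc = "http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=113&f=7"

print(sibling_catalog_url(static_doc))
print(sibling_catalog_url(cgi_doc))
```

For the first URL the guess lands next to the document, which is at least plausible. For the CGI-style URL the guess discards the whole parameter list, so it cannot arrive at the d=491 URL where the catalog is actually served.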
So we need to say where to find the CATALOG file so we can find where to
find the DTD. Or, we put an explicit URL to the DTD. There's somewhere
to do that in SGML, but not for a style sheet or a navspec/table of contents
definition file, nor any other ancillary non-SGML files. So we use
processing instructions in those cases where it's necessary.
Does that make a better case?
If people end up saying no, it's clear that all the commercial applications
will do this anyway, but each in their own incompatible way.
I hereby volunteer us to be amongst the first :-)
Lee
--
Liam Quin, lee@sq.com | lq-text freely available Unix text retrieval
Senior Technical Consultant | FAQs: Metafont fonts, OPEN LOOK UI, OpenWindows
SoftQuad Inc. +1 416 544-9000 | xfonttool (Unix xfontsel in XView)
http://www.softquad.com/ | the barefoot programmer
From lee at sq.com Tue Mar 18 20:24:37 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <9703182024.AA10994@sqrex.sq.com>
> Hmm. How much is "all that code"? I got 1443 lines of code for a catalog
> parser in C++, including comments.
Remember our dirty perl hacker and the graduate student who is supposed
to be able to write an XML parser in a week? That was a big goal initially.
> >Our experience with connecting Panorama with a wide range of databases has
> >been that we needed to do this. Maybe if all the databases had been
> >built by Gavin :-) we'd have been able to stick with Catalogs, and we'd
> >always have known where to look for catalog even with URLs like
> > http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=113&f=7
> >where d=113 is the document chunk ID, get-doc is the program, 40197 is a
> >PATH_INFO parameter used for versioning, and the URL for CATALOG is
> > http://www.xxx.zzz/bin/get-doc/40197&user=z305&pass=df4ec5c9&d=491&f=7
> >and no, I'm not making this up (except I've changed the field names from
> >those used in any one particular currently shipping commercial system).
>
> Hmm. Looks very much like Astoria to me.
No, actually.
> >Panorama's default algorithm would look for
> > http://www.xxx.zzz/bin/get-doc/catalog
> >which obviously won't work in this case.
>
> Depends on how smart get-doc is.
Suppose it's written in C and hard-linked into the web server.
Suppose it was supplied by a commercial vendor, and changing or replacing
it invalidates the support contract for a $500,000 installation...
> >So we need to say where to find the CATALOG file so we can find where to
> >find the DTD. Or, we put an explicit URL to the DTD. There's somewhere
> >to do that in SGML, but not for a style sheet or a navspec/table of contents
> >definition file, nor any other ancillary non-SGML files. So we use
> >processing instructions in those cases where it's necessary.
>
> If you can do it using PI's, you can do it using catalogs.
I don't believe this dogma :-)
> The only real points in favor of PI's are:
>
> 1) It does simplify clients *a little* (no need for catalog parsing,
> though resolution is still required).
> 2) They're simpler to hack into a server.
>
> neither of which carries much technical weight.
I don't care. If it's the difference between
"our integration team can do this"
and
"our product development or serious programming team could do this"
it's the difference between success and failure.
So allow both, OK? Are you really so set against PIs here?
Is there a (non-religious) reason?
They could be significant comments if you prefer --
like httpd server side includes/execs... (ugh)
Lee
From paul at arbortext.com Tue Mar 18 21:58:59 1997
From: paul at arbortext.com (Paul Grosso)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <9703182148.AA03271@atiaus.arbortext.com>
> From: lee@sq.com
>
> Remember our dirty perl hacker and the graduate student who is supposed
> to be able to write an XML parser in a week? That was a big goal initially.
For what it's worth...
The desperate perl hacker was someone trying to write a perl script to
do some basic data massaging to some marked up XML. We never had as a
goal that someone could write an XML parser in perl.
As far as the grad student, I believe we were giving them two weeks to
write an XML parser.
Finally, let's not die on our own sword here. The main goal is to have
XML be widely accepted. A subgoal of that is to make it relatively
easy to write an XML parser, but it still has to be worthwhile to
write that parser in the first place, or we've lost the war. I'm
not saying that catalogs are absolutely required for XML to work,
but I do think we need to look at the big picture, not count lines
of code, to determine the right answer.
paul
From dseibert at sqwest.bc.ca Tue Mar 18 22:23:54 1997
From: dseibert at sqwest.bc.ca (David Seibert)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <01BC33A7.DE903900@sqruffy.west.sq.com>
"Design Principles for XML" actually says that "the holder of a CS
bachelor's degree should be able to construct basic processing (parsing,
if not validating) machinery in less than a week". Making that two weeks
would be pretty significant slippage.
More important: if you want XML to be widely accepted, you don't want
to enforce complications that aren't necessary for everyone. Catalogs are
useful, but they aren't so easy to implement, so a lot of people would
prefer PIs as a less complicated alternative. James's suggestion for a PI
form,
is concise, has all of the necessary information, and is close to the HTML
syntax to make the transition easier for HTML authors. I can't improve on
that.
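As a rough illustration of how little client code a PI-based link needs, the sketch below pulls HTML-like pseudo-attributes out of a stylesheet processing instruction. It does not claim to show the exact PI form discussed in this thread: the target name xml-stylesheet (the form W3C later standardized), the text/dsssl type and the example.org URL are assumptions chosen for illustration, and stylesheet_links is a hypothetical helper.

```python
import re
from xml.dom import Node, minidom

# name="value" or name='value' pairs inside the PI data.
PSEUDO_ATTR = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|\'([^\']*)\')')


def stylesheet_links(xml_text, target="xml-stylesheet"):
    """Return one dict of pseudo-attributes per matching PI in the prolog."""
    doc = minidom.parseString(xml_text)
    links = []
    for node in doc.childNodes:
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            attrs = {}
            for match in PSEUDO_ATTR.finditer(node.data):
                name, double_quoted, single_quoted = match.groups()
                attrs[name] = double_quoted if double_quoted is not None else single_quoted
            links.append(attrs)
    return links


sample = (
    '<?xml-stylesheet href="http://example.org/styles/manual.dsl" type="text/dsssl"?>\n'
    "<doc>Hello</doc>"
)
print(stylesheet_links(sample))
```

The client still has to resolve and fetch the href, but it needs no catalog parser to find out where the stylesheet is.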
David
From lee at sq.com Tue Mar 18 23:27:41 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:35 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <9703182327.AA20037@sqrex.sq.com>
> > Remember our dirty perl hacker and the graduate student who is supposed
> > to be able to write an XML parser in a week? That was a big goal initially.
>
> The desperate perl hacker was someone trying to write a perl script to
> do some basic data massaging to some marked up XML. We never had as a
> goal that someone could write an XML parser in perl.
I neither said that nor implied it.
> As far as the grad student, I believe we were giving them two weeks to
> write an XML parser.
I think it varied -- the main point was that it wasn't 3 months, I think,
and the language has to be straightforward, simple and self-contained
enough that the grad student _wants_ to do the parser.
> Finally, let's not die on our own sword here. The main goal is to have
> XML be widely accepted. A subgoal of that is to make it relatively
> easy to write an XML parser, but it still has to be worthwhile to
> write that parser in the first place, or we've lost the war.
Yes, that's true, I agree.
Lee
From nmikula at edu.uni-klu.ac.at Wed Mar 19 07:59:52 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:36 2004
Subject: Associating DSSSL style sheets with documents
References: <199703181629.IAA28823@boethius.eng.sun.com>
Message-ID: <33301646.76D2@edu.uni-klu.ac.at>
Jon Bosak wrote:
> When we start downloading a DSSSL stylesheet from the server, I think
> that this is probably the method we'll try first. Of all the
> alternatives, I like James Clark's last suggestion best for initial
> experimentation:
>
>
I think that's ok, but it also gives me a pain in
my stomach. Does it mean I have to fetch the stylesheet
each time for each document instance? The user agent to my DSSSL
engine supports caching (with a primitive caching heuristic,
I have to admit). Should I base the lookup on "href=" or
could (should) we include a (F)PI so that it reads like :
PS: In general, I am still not fond of PIs. But I have to admit
that it is more a religious than a practical point of view.
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From nmikula at edu.uni-klu.ac.at Wed Mar 19 08:00:36 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:36 2004
Subject: Associating DSSSL style sheets with documents
References: <01BC33A7.DE903900@sqruffy.west.sq.com>
Message-ID: <33300FC1.5C7C@edu.uni-klu.ac.at>
David Seibert wrote:
> More important: if you want XML to be widely accepted, you don't want
> to enforce complications that aren't necessary for everyone. Catalogs are
> useful, but they aren't so easy to implement,
Compared to other problems that I was (am) having, catalogs are
*straightforward* to implement. All in all it takes you about
three days to implement.
> so a lot of people would
> prefer PIs as a less complicated alternative. James's suggestion for a PI
> form,
>
> is concise, has all of the necessary information, and is close to the HTML
> syntax to make the transition easier for HTML authors. I can't improve on
> that.
Catalogs are very important concepts for other things as well. If somebody
doesn't want to use catalogs, he doesn't have to. Allowing for catalogs
doesn't really complicate the XML spec and doesn't make XML more difficult
to learn.
> As far as the grad student, I believe we were giving them two weeks to
> write an XML parser.
:-) Assuming that you know the tools and the programming language that
you are using, two or rather three weeks is a fair estimate for a
non-validating XML processor with no support for catalogs and public
identifiers. Yet another requirement is that we get a revision of the spec
with all the missing productions (mostly S) included and some of the
productions fixed and/or clarified.
> Finally, let's not die on our own sword here. The main goal is to have
> XML be widely accepted. A subgoal of that is to make it relatively
> easy to write an XML parser, but it still has to be worthwhile to
> write that parser in the first place, or we've lost the war. I'm
> not saying that catalogs are absolutely required for XML to work,
> but I do think we need to look at the big picture, not count lines
> of code, to determine the right answer.
* Strongly Agree *
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From dseibert at sqwest.bc.ca Wed Mar 19 17:24:34 1997
From: dseibert at sqwest.bc.ca (David Seibert)
Date: Mon Jun 7 16:57:36 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <01BC3447.1DC4F260@sqruffy.west.sq.com>
We agree. I am certainly in favor of allowing catalogs, as long as PIs
are also allowed; I just don't want to force people to use catalogs.
As far as parsing time, I agree on the estimate of 2-3 weeks for a
non-validating parser with no catalog or public identifier handling.
However, I think 1 week will not be unreasonable for someone who has
learned to use yacc and lex (or some equivalents), _after_ the grammar
is specified properly. I am spending about half of my time cleaning up
incorrect productions (fortunately I have Peter Sharpe here to clarify
what content is supposed to be allowed). When I have time to get it
into readable shape (I'm not using standard yacc and lex, so I should
normalize it), I'll post my corrected grammar to xml-dev.
Regards,
David
From dgd at cs.bu.edu Wed Mar 19 18:19:19 1997
From: dgd at cs.bu.edu (David Durand)
Date: Mon Jun 7 16:57:36 2004
Subject: CATALOGs and stylesheets
Message-ID:
There was a proposed CATALOG extension (even implemented, I think) for a
DOCUMENT(?) keyword that stated the starting document for which the catalog applies.
With delegation this presents an alternative mechanism that will not be
fooled by mystery URLs. Each document with "attachments" has a catalog that
gives its URL and gives its DTD and stylesheet(s). Delegation is used to
make catalog management bearable for files that share public identifiers,
so that common stuff resides in a common catalog.
Then the URL that you send is the CATALOG URL, not the document URL --
and you get a whole directory of the stuff you might need, with the
potential for any mapping you want from URIs to URLs.
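The per-document catalog with delegation described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not a full catalog parser: it recognises only PUBLIC, DOCUMENT and DELEGATE entries with double-quoted arguments, modelled loosely on SGML Open catalog syntax, and every example.org URL, the "-//Example//..." identifier and the names parse_catalog, resolve and load_catalog are hypothetical.

```python
import re

# One catalog entry: a keyword followed by one or two quoted strings.
ENTRY = re.compile(r'\b(PUBLIC|DELEGATE|DOCUMENT)\s+"([^"]*)"(?:\s+"([^"]*)")?')


def parse_catalog(text):
    """Turn catalog text into a dict of document URL, public ids and delegations."""
    catalog = {"document": None, "public": {}, "delegate": []}
    for keyword, first, second in ENTRY.findall(text):
        if keyword == "DOCUMENT":
            catalog["document"] = first
        elif keyword == "PUBLIC":
            catalog["public"][first] = second
        else:  # DELEGATE: public-id prefix -> URL of another catalog
            catalog["delegate"].append((first, second))
    return catalog


def resolve(public_id, catalog, load_catalog):
    """Map a public identifier to a URL, following delegation if needed."""
    if public_id in catalog["public"]:
        return catalog["public"][public_id]
    for prefix, catalog_url in catalog["delegate"]:
        if public_id.startswith(prefix):
            return resolve(public_id, load_catalog(catalog_url), load_catalog)
    return None


# In-memory stand-in for fetching catalogs over HTTP.
CATALOGS = {
    "http://example.org/docs/113/catalog": '''
        DOCUMENT "http://example.org/docs/113/chunk.xml"
        PUBLIC "-//NHM//FOO STYLE//EN" "http://example.org/styles/foo.dsl"
        DELEGATE "-//Example//DTD" "http://example.org/common/catalog"
    ''',
    "http://example.org/common/catalog": '''
        PUBLIC "-//Example//DTD Manual//EN" "http://example.org/dtds/manual.dtd"
    ''',
}


def load_catalog(url):
    return parse_catalog(CATALOGS[url])


doc_catalog = load_catalog("http://example.org/docs/113/catalog")
print(doc_catalog["document"])
print(resolve("-//NHM//FOO STYLE//EN", doc_catalog, load_catalog))
print(resolve("-//Example//DTD Manual//EN", doc_catalog, load_catalog))
```

The client is handed the catalog URL first, reads the document URL out of it, and finds shared material such as the DTD through the delegated common catalog. A real implementation would also need comment handling, the remaining entry types and protection against delegation loops.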
I'm not a CATALOG zealot, but it's an approach that bears consideration.
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From nmikula at edu.uni-klu.ac.at Wed Mar 19 19:03:22 1997
From: nmikula at edu.uni-klu.ac.at (Norbert Mikula)
Date: Mon Jun 7 16:57:36 2004
Subject: CATALOGs and stylesheets
In-Reply-To:
Message-ID:
On Wed, 19 Mar 1997, David Durand wrote:
> There was a proposed CATALOG extension (even implemented, I think) for a
> DOCUMENT(?) keyword that stated the starting document for which the catalog applies.
NXP supports Catalogs (including the DOCUMENT keyword).
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From dseibert at sqwest.bc.ca Wed Mar 19 19:07:29 1997
From: dseibert at sqwest.bc.ca (David Seibert)
Date: Mon Jun 7 16:57:36 2004
Subject: CATALOGs and stylesheets
Message-ID: <01BC3455.B16BD570@sqruffy.west.sq.com>
That sounds great, and mixes well with the PI approach. Authors who want a
simple approach can just enclose the stylesheet URL in a PI inside the XML
document. More sophisticated authors can do the same, and then label the
entire document with a single catalog URL, and the separate chunks with
different URLs. (I haven't read the CATALOG extension, so I hope that I am
interpreting David's remarks correctly).
The sophisticated server could give the full catalog entry to sophisticated
clients, who would negotiate what to send; they would then presumably parse
the PI (they need to do this to deal with simple servers) and realize that they
already had the stylesheet. Unsophisticated clients could just get the XML
text by default if they requested the base document, and then request the
remaining chunks that they wanted. Thus, the presence of the catalog could
be made transparent to unsophisticated clients.
This separation of the PI and catalog mechanisms (keeping one internal to
the XML document and the other external) allows simple clients and servers to
peacefully coexist with sophisticated ones, with graceful degradation of
functionality. It's probably more appropriate as well, since clients
sophisticated enough to deal with the catalog should realize that the
document is really the whole collection of files, not just the XML file. Is there
a compelling argument to make catalogs visible to users?
Regards,
David Seibert
From bosak at atlantic-83.Eng.Sun.COM Wed Mar 19 20:40:18 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:36 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <33301646.76D2@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at)
Message-ID: <199703192038.MAA29744@boethius.eng.sun.com>
[Norbert Mikula:]
| >
|
| I think that's ok, but it also gives me a pain in
| my stomach. Does it mean I have to fetch the stylesheet
| each time for each document instance?
I was assuming (naively?) that the target of the href would be cached
just like the target of any other URL.
Jon
From bosak at atlantic-83.Eng.Sun.COM Wed Mar 19 21:09:35 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:36 2004
Subject: XML hot from the oven
Message-ID: <199703192108.NAA29825@boethius.eng.sun.com>
I pointed to the GET method for accessing the XML on docs.sun.com
because I was assuming that an experimenter would next want to
implement some kind of client to handle the data stream. If all you
want to do is view the output, then you can just feed the equivalent
URLs, e.g.
http://docs.sun.com/ab2/@xmlToc
http://docs.sun.com/ab2/alluser/ADVOSUG/@xmlChunk
to any ordinary Web browser and download the results to a file.
Jon
----------------------------------------------------------------------
Jon Bosak, Online Information Technology Architect, Sun Microsystems
----------------------------------------------------------------------
2550 Garcia Ave., MPK17-101, Mountain View, California 94043
Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML
Here's a little game you can all join in with
It's very simple and I hope it's new
Make your own tags up if you want to
Any old tags that you think will do
----------------------------------------------------------------------
From nmikula at edu.uni-klu.ac.at Thu Mar 20 07:32:32 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:36 2004
Subject: Associating DSSSL style sheets with documents
References: <199703192038.MAA29744@boethius.eng.sun.com>
Message-ID: <33315F40.2AD6@edu.uni-klu.ac.at>
Jon Bosak wrote:
> | >
> |
> | I think that's ok, but it also gives me a pain in
> | my stomach. Does it mean I have to fetch the stylesheet
> | each time for each document instance?
>
> I was assuming (naively?) that the target of the href would be cached
> just like the target of any other URL.
I think my suggestion with the (formal) public identifier is
more general. Your suggestion would work of course, but
if we have two URLs, for instance, http://www.jon.com/foo.dsl and
http://www.norbert.com/foo.dsl, they could be the same
stylesheet but they don't *have* to be.
Also extracting the stylesheet name as such wouldn't be the
best solution (foo.dsl) as there also might be ambiguities.
Your foo.dsl is not necessarily my foo.dsl.
However, if I cache "-//NHM//FOO STYLE//EN", then especially with
formal public idents, I (normally) wouldn't have these problems.
Right ?
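The difference between the two cache keys can be sketched as below, assuming a user agent that is given both an href and, optionally, a formal public identifier for each stylesheet. The StylesheetCache class, its fetch callback and the fake fetcher are hypothetical; only the two URLs and the public identifier come from this message.

```python
class StylesheetCache:
    """Cache stylesheets by public identifier when one is given, else by URL."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: url -> stylesheet text
        self._by_key = {}

    def get(self, href, public_id=None):
        key = ("public", public_id) if public_id else ("url", href)
        if key not in self._by_key:
            self._by_key[key] = self._fetch(href)
        return self._by_key[key]


fetched = []


def fake_fetch(url):
    # Stand-in for an HTTP GET; records every network round trip.
    fetched.append(url)
    return "stylesheet from " + url


cache = StylesheetCache(fake_fetch)

# Same public identifier at two locations: only the first URL is fetched.
cache.get("http://www.jon.com/foo.dsl", "-//NHM//FOO STYLE//EN")
cache.get("http://www.norbert.com/foo.dsl", "-//NHM//FOO STYLE//EN")
print(len(fetched))

# No public identifier: the two URLs are treated as different stylesheets.
cache.get("http://www.jon.com/foo.dsl")
cache.get("http://www.norbert.com/foo.dsl")
print(len(fetched))
```

Keying on the URL alone is the safe default, since two foo.dsl files need not be the same stylesheet. Keying on the public identifier saves fetches only if publishers really do keep identifiers unique.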
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From bosak at atlantic-83.Eng.Sun.COM Thu Mar 20 16:57:10 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:36 2004
Subject: Associating DSSSL style sheets with documents
In-Reply-To: <33315F40.2AD6@edu.uni-klu.ac.at> (nmikula@edu.uni-klu.ac.at)
Message-ID: <199703201655.IAA12043@boethius.eng.sun.com>
[Norbert Mikula:]
| > | >
| > |
| > | I think that's ok, but it also gives me a pain in
| > | my stomach. Does it mean I have to fetch the stylesheet
| > | each time for each document instance?
| >
| > I was assuming (naively?) that the target of the href would be cached
| > just like the target of any other URL.
|
| I think my suggestion with the (formal) public identifier is
| more general. Your suggestion would work of course, but
| if we have two URLs, for instance, http://www.jon.com/foo.dsl and
| http://www.norbert.com/foo.dsl, they could be the same
| stylesheet but they don't *have* to be.
Remember, all I was asking for in the first place was input into how
some of us could start doing this on an experimental basis. It isn't
up to this group to develop a standard solution; that's the job of the
W3C SGML working group. I think that the form above will work for an
initial experiment, and unless someone sees a basic problem with it,
that's what I'm going to try.
Jon
From ht at cogsci.ed.ac.uk Fri Mar 21 12:01:00 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:57:36 2004
Subject: Two more points for cleanup in existing draft
Message-ID: <3711.199703211200@grogan.cogsci.ed.ac.uk>
1a) Shouldn't the two occurrences of '<' in production 16 (the
definition of QuotedCData) be replaced with '&', and if not, why not?
1b) Shouldn't production 15 (the definition of Literal) prohibit '&'
and '%' as well as the relevant quote character, for consistency with [16]?
2) 4.3, the discussion of entity treatment, is somewhat
unsatisfactory. '[P]arsed character data' is misleading, since by the
syntax PCData cannot contain references! If it means 'content and
QuotedCData' (which are the places entity references are allowed), it
should say so. Also, parameter entity processing is not discussed at all.
4.3.6 also needs careful attention, since as it stands it doesn't give
enough weight to the consequences of 2.1, and might lead the naive to
suppose that ". . .three companies: L&M; B&W; Imperial Tobacco"
is invalid, presuming M and W are not themselves defined as entities.
Indeed taken literally 4.3.6 might lead one to suppose that ANY use of
& is illegal, since PCData may not contain &, and 4.3.6 says
"processing this replacement data (which may contain both text and
markup) . . ." This needs to be clarified, in my view.
Here's a candidate redraft of the relevant bits:
--------------
4.3 XML allows character or general entity references in two places,
namely in Element content ([39]) or Quoted character data ([16]). The
names of external binary entities may also appear as/in the value of
an ENTITY or ENTITIES attribute. On encountering one of these
references, an XML processor shall:
. . .
2. For both character and entity references, the processor must not
pass the reference itself to the application.
3. For character references, the processor must pass the indicated
ISO 10646 bit pattern to the application in place of the reference.
. . .
6. For an internal (text) entity, the processor should process the
defined content of the reference on the same basis (i.e. as content or
QuotedCData) that licensed the reference in the first place, with due
regard to section 2.1 above, and pass the result to the application in
place of the reference, EXCEPT that the content of references
processed as QuotedCData MAY include single or double quotes ad lib.,
or may consist of a single '&' character. Similarly, the content of
references processed as 'content' MAY consist of either a single '<'
character or a single '&' character.
. . .
If the processor includes an external text entity under clauses (7) or
(8) above, the results shall be as for internal (text) entities as
defined in (6).
. . .
XML allows parameter entity references in three places, namely in
literals ([15]), the internal declaration subset ([33]) or the key
of a conditional section ([58]). Processing in this case is parallel
to that for internal (text) entities as defined in clause (6) above,
with the obvious extension to allow content consisting of a single '%'
character.
---------------
Note the use of the label 'content' for production [39] is extremely
infelicitous.
The bit about parameter entity references is important, as it makes
clear that the following is valid XML (as it is SGML):
'>
%yy;
]>
a &g; b
[nsgml says:
(FOO
-a f b
)FOO
C
]
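A minimal sketch of clauses (2), (3) and (6) of the redraft, in Java:
character references are replaced by the indicated character, internal
entity references are replaced by their re-processed content, and the
reference itself is never passed on. The class and method names are
hypothetical, the nesting limit of 16 is an arbitrary guard, hexadecimal
character references are assumed to be allowed, and copying an undefined
name such as M in "L&M;" through unchanged is only one possible reading,
not something the draft settles.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: expands character and internal entity references. */
class EntityExpander {
    private static final int MAX_DEPTH = 16; // arbitrary guard against recursive entities

    static String expand(String text, Map<String, String> entities) {
        return expand(text, entities, 0);
    }

    private static String expand(String text, Map<String, String> entities, int depth) {
        if (depth > MAX_DEPTH) {
            throw new IllegalStateException("entity nesting too deep");
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int semi = (c == '&') ? text.indexOf(';', i) : -1;
            String ref = (semi > i + 1) ? text.substring(i + 1, semi) : null;
            if (ref != null && ref.matches("#[0-9]+")) {
                // decimal character reference: pass the character, not the reference
                out.appendCodePoint(Integer.parseInt(ref.substring(1)));
                i = semi + 1;
            } else if (ref != null && ref.matches("#x[0-9A-Fa-f]+")) {
                // hexadecimal character reference (assumed to be allowed)
                out.appendCodePoint(Integer.parseInt(ref.substring(2), 16));
                i = semi + 1;
            } else if (ref != null && entities.containsKey(ref)) {
                // internal entity: process the replacement on the same basis
                out.append(expand(entities.get(ref), entities, depth + 1));
                i = semi + 1;
            } else {
                // lone '&' or undefined name: copied through (an assumption)
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> entities = new HashMap<String, String>();
        entities.put("g", "f");
        System.out.println(expand("a &g; b", entities));
    }
}
```

With g defined as "f", the input "a &g; b" comes out as "a f b", which
agrees with the data line in the nsgmls output above.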
Hope this helps.
ht
From nmikula at edu.uni-klu.ac.at Fri Mar 21 14:44:08 1997
From: nmikula at edu.uni-klu.ac.at (Norbert Mikula)
Date: Mon Jun 7 16:57:36 2004
Subject: Two more points for cleanup in existing draft
In-Reply-To: <3711.199703211200@grogan.cogsci.ed.ac.uk>
Message-ID:
> The bit about parameter entity references is important, as it makes
> clear that the following is valid XML (as it is SGML):
>
>
>
> '>
> %yy;
> ]>
> a &g; b
>
> [nsgml says:
> (FOO
> -a f b
> )FOO
> C
> ]
I ran it with NXP. I had to make a few changes :
1.) --->
(The ERB hasn't decided yet on this subject, or has it ?)
2.) I had to change the position of yy and zz
(I wasn't thinking about this problem of referring to
an entity that was not yet declared. Now I need to check
carefully when NXP resolves (is supposed to resolve) entity references)
3.) -->
(Note the semicolon after zz !)
After these changes I got the same results.
FYI: I tested with latest release. It has not been published yet.
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From Peter at ursus.demon.co.uk Fri Mar 21 23:06:03 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:36 2004
Subject: Uncertainties in implementing WD-xml.html
Message-ID: <4909@ursus.demon.co.uk>
In message <3332D971.68F5@utila.ifi.uni-klu.ac.at> "Norbert H. MIKULA" writes:
> As far as I can remember the ERB has initially decided
> to change the syntax for comments to
>
> <--* ..... *-->
>
> (posted to the XML-WG : Wed, 15 Jan)
>
> But the torture.xml file of cmsmcq uses
>
> also during the discussion about the appropriate
> regexp people used both alternatives.
>
> What is the current state of things ?
I share Norbert's concern about uncertainties in the XML draft and feel that
a number of us are 'stalled' at present due to one or more uncertainties in
the spec. (It may be that these are simple misconceptions, but they need
tidying up.) We agree that the mythical grad student can hack a parser in the
mythical two weeks, but only if they have a clear spec to write to. [My own
position is that I want to extend JUMBO to read any WF XML file and intend to
do this on top of another parser, and I'd like to do this before WWW6 -
otherwise it can't be said to be an 'XML browser/editor'.]
My understanding is that the productions (1-77) are consistent and can be
used as the basis of a yacc-like approach (as NXP does, using JACC). So the
first question is [see Norbert's query]:
(a) are we agreed that (1-77) in WD-xml-961114 are the current version and
that none are under revision at present? (Until Norbert's question I had
assumed that [21] (Comments) was correct).
(b) some parsing operations (e.g. entity replacement) are not described in
the BNF and are sufficiently complex or insufficiently documented to give
serious problems in implementation. It would be valuable for these to be
listed and the operations clearly defined (e.g. are comments processed before
entity replacement? are nested entities allowed? etc.)
(c) some ancillary constructs (e.g. CATALOG) are widely held to be part of
XML (or likely to be part of XML). They are probably not too difficult to
implement if certain processes (e.g. resolution of FPIs) are not exhaustively
defined.
IMO it is more important to resolve this asap, than other aspects of developing
a parser. The worst possible thing to happen at this stage is that developers
have sufficient uncertainty in the spec that there are different interpretations.
Against normal practice I have crossposted this to xml-dev. If the ERB feel
this is mainly a matter of clarification, then a reply to xml-dev would be
adequate, but if (as I fear) some aspects of entity replacement are not
universally agreed, then I think they need to be resolved here.
P
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Fri Mar 21 23:53:47 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:36 2004
Subject: XML hot from the oven
Message-ID: <4912@ursus.demon.co.uk>
In message <199703192108.NAA29825@boethius.eng.sun.com> bosak@atlantic-83.Eng.Sun.COM (Jon Bosak) writes:
> I pointed to the GET method for accessing the XML on docs.sun.com
> because I was assuming that an experimenter would next want to
> implement some kind of client to handle the data stream. If all you
> want to do is view the output, then you can just feed the equivalent
> URLs, e.g.
>
> http://docs.sun.com/ab2/@xmlToc
> http://docs.sun.com/ab2/alluser/ADVOSUG/@xmlChunk
>
> to any ordinary Web browser and download the results to a file.
Or you could use an XML tool as a helper application for a browser. This
could be done in a .mailcap file or by configuring the browser. For JUMBO
the .mailcap file looks like
text/xml; java pmr.chemime.ChemTree %s chemical/x-cml
and this should be able to deal with Jon's Shakespeare. (Unfortunately
JUMBO doesn't deal with all XML constructs yet).
P.
Note, of course, that the default view of any XML browser may not be
very informative. PLAY comes out very well in JUMBO, but the Solaris
docs would need subclasses written for several of the elements.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From bosak at atlantic-83.Eng.Sun.COM Sat Mar 22 01:50:56 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:36 2004
Subject: docs.sun: changes to XML output
Message-ID: <199703220149.RAA14332@boethius.eng.sun.com>
We've been making changes to the TOC output from docs.sun.com. A bug
in container closing has been fixed (thanks, Norbert), and we've
adopted a convention for properly structuring TOC output so that the
TOC always forms a single tree. Thus, for example,
http://docs.sun.com/ab2/alluser/ADVOSUG/@xmlToc will give you
...
...
...
http://docs.sun.com/ab2/alluser/@xmlToc will give you
...
...
...
and http://docs.sun.com/ab2/@xmlToc will give you
...
...
...
...
...
...
...
...
...
...
...
...
That's the idea, anyway.
By the way, if you start wondering what Sun thinks this server is for,
try http://docs.sun.com all by itself.
Jon
From Peter at ursus.demon.co.uk Sat Mar 22 10:30:45 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:36 2004
Subject: Lark
Message-ID: <4927@ursus.demon.co.uk>
Tim, and xml-dev'ers
I am using the Jan 3 version of Lark. Is there a later version?
If so, the rest of this posting may be ignored at present.
[If the errors are due to my incompetence, please be gentle :-)]
I have some problems at the start of the document:
If I include the magic incantation:
then doDoctype(Entity e, String rootID, String externalSubsetID)
sets rootId to VERSION="1.0". If I comment out the
statement, then it performs as I would expect.
If I run Lark on a file with no SYSTEM or PUBLIC in the DOCTYPE
it throws an error.
----------------------------------------------------------------
Please don't take anything here as a criticism of Lark... (or NXP, or any other
parser that might appear shortly).
I think it's very important that by WWW6 NXP and Lark are able
to read a wide range of examples without errors. The primary task is
that we make sure that we all agree on how to read well-formed
files. If someone writes a DTDless 'XML' file and brings it to WWW6 then
either:
- it should parse without errors on all parsers
- all parsers should inform of at least one error (I assume that
parser developers are *allowed* to stop at the first error,
however inconvenient this might be.)
Ideally we should be able to read torture files uniformly, though I suspect
that certain bizarre constructs can still be created which throw most parsers.
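The criterion above (either every parser accepts a file, or every parser
reports at least one error) can be sketched as a small check in Java. This
is only an illustration: the class name is hypothetical, and the verdicts
are assumed to have been collected by running each parser separately and
recording whether it finished without errors.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: do all parsers give the same verdict on one file? */
class ParserAgreement {
    /** True when every parser accepted the file or every parser rejected it. */
    static boolean consistent(Map<String, Boolean> accepted) {
        Boolean first = null;
        for (Boolean verdict : accepted.values()) {
            if (first == null) {
                first = verdict;
            } else if (!first.equals(verdict)) {
                return false; // one parser accepted what another rejected
            }
        }
        return true; // unanimous (or no verdicts at all)
    }

    public static void main(String[] args) {
        Map<String, Boolean> verdicts = new LinkedHashMap<String, Boolean>();
        verdicts.put("NXP", Boolean.TRUE);   // made-up verdicts for illustration
        verdicts.put("Lark", Boolean.FALSE);
        System.out.println(consistent(verdicts) ? "parsers agree" : "parsers disagree");
    }
}
```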
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From jjc at jclark.com Sat Mar 22 13:09:24 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:37 2004
Subject: Associating DSSSL style sheets with documents
Message-ID: <2.2.32.19970322125754.0076a558@jclark.com>
At 08:29 18/03/97 -0800, Jon Bosak wrote:
>| So allow the processing instructions.
>
>When we start downloading a DSSSL stylesheet from the server, I think
>that this is probably the method we'll try first. Of all the
>alternatives, I like James Clark's last suggestion best for initial
>experimentation:
>
>
I've just implemented this in Jade. For the benefit of others implementing
DSSSL or XML here are the details:
- I recognize the PI anywhere in the prolog (so you can put it in an external DTD).
- When there are multiple such PIs, I give the first precedence.
- I allow any of text/dsssl, text/x-dsssl, application/dsssl and
application/x-dsssl for the type. The type is case insensitive.
- I recognize
I also plan to implement something to allow catalogs to be used as an
alternative to PIs, but I haven't decided what yet.
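The selection rules listed above (first matching PI wins, four accepted
type names, case-insensitive comparison) can be sketched in Java as
follows. This is not Jade's code. It assumes the PIs have already been
parsed into {type, href} pairs in document order, which is a made-up
representation, and it assumes that a PI with any other type is simply
skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: picks the stylesheet named by the first DSSSL PI. */
class StylesheetPicker {
    static final List<String> DSSSL_TYPES = Arrays.asList(
        "text/dsssl", "text/x-dsssl", "application/dsssl", "application/x-dsssl");

    /** Each entry is {type, href}; returns the first DSSSL href, or null if none. */
    static String pick(List<String[]> pis) {
        for (String[] pi : pis) {
            String type = pi[0].toLowerCase(Locale.ROOT); // the type is case insensitive
            if (DSSSL_TYPES.contains(type)) {
                return pi[1]; // the first such PI takes precedence
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String[]> pis = new ArrayList<String[]>();
        pis.add(new String[] {"text/css", "a.css"});          // not DSSSL: skipped
        pis.add(new String[] {"Text/DSSSL", "b.dsl"});        // first match
        pis.add(new String[] {"application/dsssl", "c.dsl"}); // ignored
        System.out.println(pick(pis));
    }
}
```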
James
From Peter at ursus.demon.co.uk Sat Mar 22 19:46:38 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
Message-ID: <4939@ursus.demon.co.uk>
JUMBO is a prototype browser/editor/search/transformation tool for
XML documents. I have now managed to bolt in both Lark and NXP
instead of my parser (which was crude and did not support some of the
XML constructs). The bolting-in is still rather crude and concentrates
my mind on the need for a simple API at this level. Here are some comments
which may be useful.
NXP.
----
NXP has an interface Esis, with functions such as open_tag, close_tag,
process_instruction, etc. [I think they would be more properly called
start_element??]. JUMBO uses this to build up a Vector representing the
ESIS event stream, something like:
"_START_TAG" "CML" AttributeList "_START_TAG" "MOL" ... "_END_TAG" "MOL"...
JUMBO then builds a tree out of this, adding attributes, etc.
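As a rough illustration of that second step, here is a Java sketch that
turns a flat event list into a tree. It is not JUMBO's code: the class
names are hypothetical, and the event stream is simplified to alternating
marker/name strings ("_START_TAG" "CML" ... "_END_TAG" "CML") with the
AttributeList entries left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical builder: flat marker/name pairs in, element tree out. */
class EventTreeBuilder {
    /** A bare-bones element node: a name plus child elements. */
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<Node>();
        Node(String name) { this.name = name; }

        /** Renders the subtree as nested parentheses, e.g. CML(MOL,MOL). */
        String render() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(children.get(i).render());
            }
            return sb.append(')').toString();
        }
    }

    static Node build(List<String> events) {
        Node root = null;
        ArrayDeque<Node> open = new ArrayDeque<Node>(); // currently open elements
        for (int i = 0; i + 1 < events.size(); i += 2) {
            String marker = events.get(i);
            String name = events.get(i + 1);
            if ("_START_TAG".equals(marker)) {
                Node n = new Node(name);
                if (open.isEmpty()) {
                    if (root != null) throw new IllegalStateException("second root: " + name);
                    root = n;
                } else {
                    open.peek().children.add(n);
                }
                open.push(n);
            } else if ("_END_TAG".equals(marker)) {
                if (open.isEmpty() || !open.peek().name.equals(name)) {
                    throw new IllegalStateException("unbalanced end tag: " + name);
                }
                open.pop();
            } else {
                throw new IllegalArgumentException("unknown marker: " + marker);
            }
        }
        if (!open.isEmpty()) throw new IllegalStateException("unclosed: " + open.peek().name);
        return root;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
            "_START_TAG", "CML", "_START_TAG", "MOL", "_END_TAG", "MOL", "_END_TAG", "CML");
        System.out.println(build(events).render());
    }
}
```

A stack of open elements is enough here because start and end events
arrive properly nested; anything unbalanced is reported rather than
silently repaired.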
NXP has a class XML which is built by JACC. This contains inter alia
an Esis_Stdout object (implements Esis). There are several objects in XML
which are private and therefore not easily accessed - I think they should
have accessors, but at present I have subclassed it to PMRXML, which has
the requisite accessors.
My test program then creates a PMRXML object, and extracts the event stream
which is then passed to JUMBO's existing tree object:
NXP.PMRXML xml = new PMRXML(NXP.Streams.load_File(file, true));
pmr.chemime.ChemTree chemTree = new ChemTree(xml.getStreamVector());
pmr.sgml.GeneralTOC toc = chemTree.createGeneralTOC(3);
Comments: I have still to work out what whitespace NXP creates - there seems
to be a lot of content which is simply white. Maybe we have to address
COLLAPSE and KEEP at this stage? Also it isn't easy to extract certain
info - for example I had to hack XML.java to get the doctype - this isn't a good
idea and we need an accessor. I am also still not clear how NXP does (or should)
behave with:
and
(the default on the latter is to try to validate, I think, even if validate
is set to false. I'd prefer to be able to turn off validation, but I may have
missed something).
In general I'd like to be able to treat NXP as a black box, and subclass
my Esis object. That could mean passing it as an argument to XML, e.g.:
public class PMREsis implements Esis {
    public void open_tag(String name) {
        ...
    }
}
PMREsis esis = new PMREsis();
NXP.XML xml = new NXP.XML(esis, NXP.Streams.load_File(file, true));
pmr.sgml.SGMLTree tree = new pmr.sgml.SGMLTree(xml);
and so on.
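A self-contained Java sketch of this black-box arrangement follows.
Everything in it is a stand-in: EsisHandler is a hypothetical cut-down
version of the Esis interface (only the two method names quoted earlier),
and ToyParser is not NXP, just a few lines that fire events from a token
string so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback interface, reduced to two of the Esis method names. */
interface EsisHandler {
    void open_tag(String name);
    void close_tag(String name);
}

/** Stand-in for the parser: it knows the handler only through the interface. */
class ToyParser {
    private final EsisHandler handler;

    ToyParser(EsisHandler handler) {
        this.handler = handler;
    }

    /** "Parses" whitespace-separated tokens: NAME opens, /NAME closes. */
    void parse(String tokens) {
        String trimmed = tokens.trim();
        if (trimmed.isEmpty()) return;
        for (String t : trimmed.split("\\s+")) {
            if (t.startsWith("/")) {
                handler.close_tag(t.substring(1));
            } else {
                handler.open_tag(t);
            }
        }
    }
}

/** Application-side handler that records an indented outline of the elements. */
class OutlineHandler implements EsisHandler {
    final List<String> lines = new ArrayList<String>();
    private int depth = 0;

    public void open_tag(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append("  ");
        lines.add(sb.append(name).toString());
        depth++;
    }

    public void close_tag(String name) {
        depth--;
    }
}

class HandlerDemo {
    public static void main(String[] args) {
        OutlineHandler outline = new OutlineHandler();
        new ToyParser(outline).parse("CML MOL /MOL /CML");
        for (String line : outline.lines) System.out.println(line);
    }
}
```

The point of the design is that the application supplies its own handler
and never needs to reach into the parser's private state.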
NXP is a validating parser, but my DTDs are still struggling with Parameter
Entities so I have no experience here.
Lark
----
Lark creates a tree (called Lark) and provides a handler for
the user to pick up a variety of events (e.g. doDoctype(), doPI()). The
tree contains Elements ('Nodes') which have Attributes and a type (String).
Rather than subclassing these elements, I process Lark by iterating through
the Elements and creating a JUMBO SGMLTree (this can be delayed if required).
The tree seems complete, but I am not sure I have got all the doFOO routines
working correctly. I have also had problems with PIs (if the ?> delimiter
is used) - these may be mine.
Lark does not validate. However it is easy to interface and is fast.
General
-------
I do not use PIs myself though I shall start to do so. If they are
kept in the document tree, is there a convention where they live? (The last
opened element? What if they occur in PCDATA?).
I intend to make JUMBO available with both Lark and NXP but it's a bit creaky
at present and the interface is a bit slow. I have been told that the larger
the number of classes, the slower the program - any comments? Also I don't
know whether I should be deliberately garbage-collecting at this stage.
Any general thoughts would be welcome. I intend to bolt a crude search tool
into JUMBO along the TEI lines. I shall also see whether I can extract the
bits of NXP that do the validating, because then we have a crude validating
editor.
Any feedback from the current JUMBos would be appreciated. (I already know
it's slow, and the graphics creak in several places :-)
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From tbray at textuality.com Sat Mar 22 20:24:47 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:37 2004
Subject: Lark
Message-ID: <3.0.32.19970322122220.009b8890@pop.intergate.bc.ca>
At 10:11 AM 3/22/97 GMT, Peter Murray-Rust wrote:
>Tim, and xml-dev'ers
>I am using the Jan 3 version of Lark. Is there a later version?
>If so, the rest of this posting may be ignored at present.
I will be posting another version of Lark this weekend. It handles
CMSMQ's Torture doc, and has dozens of bugs fixed. Let's have another
look Monday. -Tim
From tbray at textuality.com Sat Mar 22 22:16:05 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:37 2004
Subject: Uncertainties in implementing WD-xml.html
Message-ID: <3.0.32.19970322141420.009c3430@pop.intergate.bc.ca>
>I share Norbert's concern about uncertainties in the XML draft
For the record: us too. Our feeling is that, as Norbert suggests,
once WWW6 hits, XML is de facto frozen because there will be more than
just our little family doing implementations. As a result, Michael
S-McQ and I, and the ERB, are plowing through all the issues
like mad; Michael and I spent a half-day together Friday plowing through
all the little syntax errors and style problems that people sent in;
Murata Makoto is the best proof-reader of all, by the way. The right
thing to do is pretty clear in almost all cases (except bloody horrible
parameter entities) and right or wrong, we *must* have a solid spec
by March 31... the decisions are almost certainly going to disappoint
some of you, but at least we'll have the virtue of the thing being
solid.
- Tim
From tbray at textuality.com Mon Mar 24 03:36:28 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:37 2004
Subject: Lark, V0.88
Message-ID: <3.0.32.19970323193500.009c5de0@pop.intergate.bc.ca>
I have just promoted Lark V0.88 to:
http://www.textuality.com/Lark/
I have received a long-names zip, so please ignore the comments asking
for one... some weird bug in my win95 is keeping it from working.
What's new:
- dozens of bug fixes (now passes CMSMQ's Torture.xml, among others, and passes
lots of Jon's docs.sun.com stuff, except the ones that are not
well-formed)
- full default attribute processing, as a result of which, there's another
15k of code
- does new &#xBABE; unicode character refs
- does new comments
- it's twice as big.
Still to do:
- parameter entities (yeccch)
- make it into a Java package
- make it into an applet
- spruce up the unicode
- do entities in attribute values
Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592
From nmikula at edu.uni-klu.ac.at Mon Mar 24 08:21:05 1997
From: nmikula at edu.uni-klu.ac.at (Norbert H. Mikula)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
References: <4939@ursus.demon.co.uk>
Message-ID: <3336AFF1.1A23@edu.uni-klu.ac.at>
Peter Murray-Rust wrote:
> NXP has an interface Esis, with functions such as open_tag, close_tag,
> process_instruction, etc. [I think they would be more properly called
> start_element??].
You are absolutely right !
> JUMBO uses this to build up a Vector representing the
> ESIS event stream, something like:
> "_START_TAG" "CML" AttributeList "_START_TAG" "MOL" ... "_END_TAG" "MOL"...
> JUMBO then builds a tree out of this, adding attributes, etc.
>
> NXP has a class XML which is built by JACC. This contains inter alia
> an Esis_Stdout object (implements Esis). There are several objects in XML
> which are private and therefore not easily accessed -
Would it be possible to send me a list of those objects ?
> I think they should
> have accessors, but at present I have subclassed it to PMRXML, which has
> the requisite accessors.
>
> My test program then creates a PMRXML object, and extracts the event stream
> which is then passed to JUMBO's existing tree object:
> NXP.PMRXML xml = new PMRXML(NXP.Streams.load_File(file, true));
> pmr.chemime.ChemTree chemTree = new ChemTree(xml.getStreamVector());
> pmr.sgml.GeneralTOC toc = chemTree.createGeneralTOC(3);
>
> Comments: I have still to work out what whitespace NXP creates - there seems
> to be a lot of content which is simply white. Maybe we have to address
> COLLAPSE and KEEP at this stage?
As soon as I know how the standard defines the treatment of whitespace
in all those scenarios, for instance w/ DTD, w/o DTD, in element content
etc., I will implement it that way. (I admit that the whitespace is really
annoying, but I didn't want to waste my time with experiments.)
> Also it isn't easy to extract certain
> info - for example I had to hack XML.java to get the doctype - this isn't a good
> idea and we need an accessor.
People didn't seem to be too interested in my idea of an interface for
passing along a complete grove. At least I didn't get too much
feedback.
> I am also still not clear how NXP does (or should)
> behave with:
>
> and
> (the default on the latter is to try to validate, I think, even if validate
> is set to false. I'd prefer to be able to turn off validation, but I may have
> missed something).
I will check it. Thanks for pointing it out to me !
> In general I'd like to be able to treat NXP as a black box, and subclass
> my Esis object. That could mean passing it as an argument to XML, e.g.:
>
> public class PMREsis implements Esis {
> public void open_tag(String name) {
> ...
> }
> }
>
> PMREsis esis = new PMREsis();
> NXP.XML xml = new NXP.XML(esis, NXP.Streams.load_File(file, true));
> pmr.sgml.SGMLTree tree = new pmr.sgml.SGMLTree(xml);
That's the basic idea that I had in mind. We really must continue with
working on our unified interface for XML/Java based applications.
--
Best regards,
Norbert H. Mikula
=====================================================
= SGML, DSSSL, Intra- & Internet, AI, Java
=====================================================
= mailto:nmikula@edu.uni-klu.ac.at
= http://www.edu.uni-klu.ac.at/~nmikula
=====================================================
From Peter at ursus.demon.co.uk Mon Mar 24 09:51:08 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
Message-ID: <4975@ursus.demon.co.uk>
In message <3336AFF1.1A23@edu.uni-klu.ac.at> "Norbert H. Mikula" writes:
> Peter Murray-Rust wrote:
> > NXP has an interface Esis, with functions such as open_tag, close_tag,
> > process_instruction, etc. [I think they would be more properly called
> > start_element??].
>
> You are absolutely right !
I learnt the importance of precise terminology when Erik Naggum was a
regular contributor to c.t.s.:-) :-) He used to point out gently but firmly any
lapse in terminology. The problem with SGML is that its terminology is
sufficiently different from other disciplines that people make guesses and
also don't realise the distinctions matter. (I have been very guilty in
this respect). However there are a number of areas where the distinction
is subtle - I still don't know if there is a difference between 'GI' and
'Element type', for example.
There is no doubt that adherence to the agreed terminology is a key aspect
of the API.
>
> > JUMBO uses this to build up a Vector representing the
> > ESIS event stream, something like:
> > "_START_TAG" "CML" AttributeList "_START_TAG" "MOL" ... "_END_TAG" "MOL"...
> > JUMBO then builds a tree out of this, adding attributes, etc.
> >
> > NXP has a class XML which is built by JACC. This contains inter alia
> > an Esis_Stdout object (implements Esis). There are several objects in XML
> > which are private and therefore not easily accessed -
>
> Would it be possible to send me a list of those objects ?
/** start of NXP/PMR list */
// from NXP with PMR comments
package NXP;
//...
public class XML implements XMLConstants {
    // PMR - I guess most of these would be valuable.
    // note that unless they are 'protected' they can't be accessed
    // by a subclass from another package
    // '//?' means that I don't know what they are for yet (I haven't spent
    // time looking :-)
    // '//+' means I need them
    // '//+?' means I think I might need them :-)
    //+?
    XMLCatalogMain catalog = null;
    //?
    boolean start = true;
    //?
    int state_counter = 0;
    //+
    static protected boolean validate = false;
    //+
    static protected boolean talkative = false;
    //?
    final static int NO_SWITCH = -1;
    final static protected int ALL = 0;
    final static protected int INTERNAL = 1;
    final static protected int NONE = 2;
    static protected int rmd = ALL;
    //+
    final static protected Esis_Stdout esis = new Esis_Stdout();
    //+?
    final static protected Hashtable element_hash = new Hashtable(30);
    //+?
    final static protected Hashtable open_element_hash = new Hashtable();
    //+?
    final static protected Hashtable notation_hash = new Hashtable(5);
    //+?
    final static protected Hashtable id_hash = new Hashtable(100);
    //+?
    final static protected Hashtable idref_hash = new Hashtable(100);
    //?
    static protected Element open_el = null;
    //?
    static protected Vector att_val = new Vector();
    //?
    final static protected Hashtable found_attributes = new Hashtable();
    //?
    final static protected Hashtable gen_entity_hash = new Hashtable(10);
    //?
    final static protected Hashtable par_entity_hash = new Hashtable(10);
    //?
    final static protected Stack lexer_stack = new Stack();
    //?
    final static protected Stack openel_stack = new Stack();
    //?
    static protected String stop_external = null;
    //?
    final static int GENERAL = 0;
    final static int PARAMETER = 1;
    final static Element NULL_ELEMENT = new Element();
    //+
    static String base_url;
    //+
    static String base_path;
    //+
    static boolean base = true;
    //+
    final static int URL_INPS = 0;
    //+
    final static int FILE_INPS = 1;
    //+
    static int input_stream;
    //?
    final static Object DUMMY = new Object();
    //+ (I had to add this to the XML code :-(
    protected String pmrDoctype;
    final void popTokenManager()
    {
        XMLTokenManager tok_man = (XMLTokenManager) lexer_stack.pop();
        ReInit(tok_man);
    }
    //+ (Note that this is NOT accessible to a subclass, and as it is final
    // cannot be overridden)
    final void setCatalog(XMLCatalogMain catalog)
    {
        this.catalog = catalog;
    }
    ....
// This was my own class PMRXML, which I added to NXP.
package NXP;
import java.io.InputStream;
import java.util.Vector;
import NXP.Catalog.XMLCatalogMain;
public class PMRXML extends XML {
    public PMRXML(InputStream is) {
        super(is);
    }
    public void setTalkative(boolean t) {
        talkative = t;
    }
    public void setValidate(boolean t) {
        validate = t;
    }
    public static int FILE_INPS() {
        return XML.FILE_INPS;
    }
    public static int URL_INPS() {
        return XML.URL_INPS;
    }
    public void setBaseUrl(String u) {
        base_url = u;
    }
    public String getBaseUrl() {
        return base_url;
    }
    public void setBasePath(String p) {
        base_path = p;
    }
    public String getBasePath() {
        return base_path;
    }
    public void setBase(boolean b) {
        base = b;
    }
    public void setInputStream(int is) {
        input_stream = is;
    }
    // the 'junk' was to avoid the same signature as setCatalog above
    // which is 'final'
    public void setCatalog(XMLCatalogMain c, String junk) {
        this.catalog = c;
    }
    public Vector getStreamVector() {
        return esis.vector;
    }
    public String getDoctype() {
        // this was just to get it to run.
        if (pmrDoctype == null) pmrDoctype = "CML";
        return pmrDoctype;
    }
}
/** end of NXP/PMR */
>
[...]
> >
> > Comments: I have still to work out what whitespace NXP creates - there seems
> > to be a lot of content which is simply white. Maybe we have to address
> > COLLAPSE and KEEP at this stage?
>
> As soon as I know how the standard defines the treatment of whitespace
> in all those scenarios, for instance w/ DTD, w/o DTD, in element content
> etc., I will implement it that way. (I admit that the whitespace is really
> annoying, but I didn't want to waste my time with experiments.)
Agreed. I have (pragmatically) deleted all elements from NXP which consist
only of whitespace. (This is because my DTDs are biassed to this, since the
chance of getting a molecular scientist to know and love the SGML
whitespace/RE/RS rules is outwith the 2nd law of thermodynamics.)
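A minimal Java sketch of that pragmatic filter, assuming the character
data arrives as a list of strings (a made-up representation, not NXP's
actual event Vector) and treating whitespace as space, tab, carriage
return and line feed only:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical filter: drops chunks of content that are whitespace only. */
class WhitespaceFilter {
    static boolean isWhitespaceOnly(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true; // note: the empty string also counts as whitespace-only
    }

    /** Returns a new list without the whitespace-only entries; order is kept. */
    static List<String> dropWhitespaceOnly(List<String> chunks) {
        List<String> kept = new ArrayList<String>();
        for (String chunk : chunks) {
            if (!isWhitespaceOnly(chunk)) {
                kept.add(chunk);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList("H", "\n  ", "O", " \t");
        System.out.println(dropWhitespaceOnly(chunks));
    }
}
```

This throws whitespace away unconditionally, which suits data-oriented
DTDs but would lose significant spacing in mixed content.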
>
> > Also it isn't easy to extract certain
> > info - for example I had to hack XML.java to get the doctype - this isn't a good
> > idea and we need an accessor.
>
> People didn't seem to be too interested in my idea of an interface for
> passing along a complete grove. At least I didn't get too much
> feedback.
(a) some people (e.g. me) didn't know what a complete grove was :-)
(b) I think we were worried about overkill before we have got the plane off
the ground.
(c) I am not sure I would recognise a doctype within a complete grove :-)
most of the names seemed to have come out of a FORTRAN program (i.e. 6
consonants)
>
> > I am also still not clear how NXP does (or should)
> > behave with:
> >
> > and
> > (the default on the latter is to try to validate, I think, even if validate
> > is set to false. I'd prefer to be able to turn off validation, but I may have
> > missed something).
>
> I will check it. Thanks for pointing it out to me!
Great.
>
> > In general I'd like to be able to treat NXP as a black box, and subclass
> > my Esis object. That could mean passing it as an argument to XML, e.g.:
> >
> > public class PMREsis implements Esis {
> > public void open_tag(String name) {
> > ...
> > }
> > }
> >
> > PMREsis esis = new PMREsis();
> > NXP.XML xml = new NXP.XML(esis, NXP.Streams.load_File(file, true));
> > pmr.sgml.SGMLTree tree = new pmr.sgml.SGMLTree(xml);
>
> That's the basic idea that I had in mind. We really must continue with
> working on our unified interface for XML/Java based applications.
Splendid.
NXP has behaved fine on my document instances (but they aren't torturing it!)
It also seems to be fast - at least *much* faster than my own stuff.
Partly that is due to building a tree which I think hammers the memory, so
anything that helps at parse time would be useful.
The only thing I can recall that the WG might consider is character entities:
NXP announces that &gt; cannot be resolved. My own feeling is that parsers
should be at liberty to insert these as a default option (perhaps a commandline
switch '-e' to assume that '&gt;' is included).
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
From Ingo.Macherius at tu-clausthal.de Mon Mar 24 11:02:05 1997
From: Ingo.Macherius at tu-clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:57:37 2004
Subject: Writing about XML
Message-ID: <199703241101.MAA21321@florix.rz.tu-clausthal.de>
I am preparing a newspaper article on XML for a well known German computer
magazine. The editor mentioned a W3C press release on the topic, which I
can't find anywhere. Can someone point me to it/send it ? Have I missed
any pointers that can't be found on www.w3.org, www.sil.org or
www.textuality.com ?
Thanks in advance,
++im
--
Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
Mail : Ingo.Macherius@tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)
From sgmlsh at CAM.ORG Mon Mar 24 13:50:05 1997
From: sgmlsh at CAM.ORG (Sam Hunting)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
In-Reply-To: <3336AFF1.1A23@edu.uni-klu.ac.at>
Message-ID:
> People didn't seem to be too interested in my idea of an interface for
> passing along a complete grove. At least I didn't get too much
> feedback.
I'm interested -- isn't it true that a grove is the best way to prove (as
opposed to asserting, or wishing) that an XML instance really is an SGML
subset? Or to show where it is "impure"? (Certainly superior to parsing
the instance with one or another parser and looking at the errors.)
This would be important if a supplier is under a contractual obligation to
provide SGML to a buyer, and wishes to provide XML.
From ebaatz at barbaresco.East.Sun.COM Mon Mar 24 14:59:56 1997
From: ebaatz at barbaresco.East.Sun.COM (Eric Baatz - Sun Microsystems Labs BOS)
Date: Mon Jun 7 16:57:37 2004
Subject: Restriction on PI information
Message-ID:
My application of XML is to mark up text that is to be spoken by
speech synthesizers. To my naive mind (I'm very new to SGML
and XML), a PI seems to be the right construct for passing
native information to a speech synthesizer, that is, instructions
in their proprietary, already existing, command set. As I don't
have any control over the syntax of the commands I want to
pass through, I want a PI to allow the widest latitude in
the information it can handle. The syntax in the draft doesn't
seem to allow that.
What is the rationale for the data that a PI allows?
What mechanisms can be used to make that data as arbitrary
as possible without changing the draft?
My take on the PI syntax is that the data needs to avoid
looking like the end of a PI. Two different ways of ending
a PI (somewhat like the use of double or single quotes for
quoted data) would allow a way of getting unpalatable data
through (my program would have to generate the appropriate
one depending on what my data looked like). Allowing a CDATA
section would also seem to allow quoting of otherwise
unpalatable data.
Clearly, any changes from the draft would complicate the parsing.
Eric Baatz
Sun Microsystems Laboratories
2 Elizabeth Drive, MS UCHL03-207 (508) 442-0257
Chelmsford, MA 01824 fax: (508) 250-5067
USA Internet: eric.baatz@east.sun.com
From Peter at ursus.demon.co.uk Mon Mar 24 16:48:16 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:37 2004
Subject: Lark
Message-ID: <4986@ursus.demon.co.uk>
Tim,
Thanks very much for the latest Lark. I have run it on a
medium-sized file (20Kb and a few hundred nodes) and it performs fine.
Some very minor comments for distribution:
(a) It would be really useful to have it as a package - all you have to
do is add
package lark;
at the head of each file. This means that the compiled classes can be located
in standard libraries, etc. At present I have
/myclasslib/pmr/sgml/*.class
/myclasslib/NXP/*.class
/myclasslib/NXP/Catalog/*.class
etc.
and it would be valuable to have
/myclasslib/lark/*.class
Secondly it means that it's easier to distribute classes in a robust fashion.
If there is a clear API then developers can subclass rather than hack the
code - this is what I'd like to aim towards myself, so I'm happy to treat
lark and NXP as black boxen. So, at the least, this could be done for
Lark and Namer.
The problems with packages come when:
- there is some internal that people want to access. This results from
an insufficiently developed API.
- there is some complex dependency between classes. If you have
A importing B
B importing A
then something is probably wrong. (It's also difficult
to compile unless you do them simultaneously.)
I have about 10 packages in JUMBO, which took some sorting out. I believe
that they have to be arranged as a DAG - I'm sure there are years of theory
about this. Wherever I had trouble forcing them into a DAG it revealed itself
as a design fault :-)
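As an illustration of the DAG requirement, here is a minimal sketch that checks whether a set of package imports is free of cycles, using Kahn's topological-sort algorithm. The class name `PackageDag` and the map-of-lists representation are assumptions made for the example; nothing like this ships with JUMBO, Lark or NXP.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PackageDag {
    /**
     * Returns true if the "imports" relation contains no cycle.
     * Keys are package names; each value lists the packages that key imports.
     */
    static boolean isAcyclic(Map<String, List<String>> imports) {
        // Count, for every package, how many other packages import it.
        Map<String, Integer> importedBy = new HashMap<>();
        for (Map.Entry<String, List<String>> e : imports.entrySet()) {
            importedBy.putIfAbsent(e.getKey(), 0);
            for (String dep : e.getValue()) {
                importedBy.merge(dep, 1, Integer::sum);
            }
        }

        // Start with the packages that nobody imports.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : importedBy.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }

        // Repeatedly remove such a package and release the ones it imports.
        int removed = 0;
        while (!ready.isEmpty()) {
            String pkg = ready.remove();
            removed++;
            for (String dep : imports.getOrDefault(pkg, Collections.<String>emptyList())) {
                if (importedBy.merge(dep, -1, Integer::sum) == 0) ready.add(dep);
            }
        }

        // Anything left over sits on a cycle (A imports B, B imports A, ...).
        return removed == importedBy.size();
    }
}
```

Run over the real import lists of a project, a `false` result points at exactly the kind of mutual dependency that tends to reveal a design fault.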
(b) It still doesn't like the valid construction (prod. [32])
it requires ExternalID. [Unfortunately if I create a file like
and give it to NXP, NXP insists on *validating it* :-)]
-------------------
The unpacked files have ^M at record ends (this isn't a problem for me) and
some are missing an EOL at EOF. Again not a problem.
Also it might be helpful if the files were packaged under a directory
such as V088/ (giving V088/Lark.class) so that when unpacked there was no
confusion between versions.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From lee at sq.com Mon Mar 24 18:25:46 1997
From: lee at sq.com (lee@sq.com)
Date: Mon Jun 7 16:57:37 2004
Subject: JUMBO
Message-ID: <9703241825.AA23913@sqrex.sq.com>
> > Peter Murray-Rust wrote:
> I still don't know if there is a difference between 'GI' and
> 'Element type', for example.
In XML they are the same, as far as I can tell.
The detailed reasoning follows, but you can ignore it if you like....
Lee
*
An element type can be a generic identifier, a name group,
a ranked element or a ranked group. [117; p. 406 of the SGML Handbook]
We don't have RANK in XML, so
An element type can be a generic identifier or a name group.
For example,
in SGML defines the content for both boy and girl.
This is (I think) not allowed in XML, so in XML there is no practical
difference between a GI and an element type.
See also the definition of a GI:
The idea seems to be that a generic identifier specification is used
to give in an instance the type of an element, once the parser has
determined that an element is beginning to happen. The terminology
seems so obfuscatory to me that I see no benefit to the distinction for
SGML itself, let alone for XML, but maybe that is because I lack a legal
background :-)
If you have difficulty with some of the SGML terminology, also bear in
mind that (1) people who have been working with SGML for years also
have difficulty with it, (2) some of the WG8 people also seem to have
difficulty with it, and (3) I do not believe that there is 100% total
agreement on what it means even among the original SGML committee, at
least not at a technical nuts-and-bolts level. The only consolation
is that HyTime terminology is far, far worse :-) :-)
Lee
From dseibert at sqwest.bc.ca Mon Mar 24 18:54:00 1997
From: dseibert at sqwest.bc.ca (David Seibert)
Date: Mon Jun 7 16:57:37 2004
Subject: Restriction on PI information
Message-ID: <01BC3841.8F32EBC0@sqruffy.west.sq.com>
The simplest alternative is to encode your data (with any encoding
that won't produce the character "?"), insert it in a PI, and decode
it at the other end. This is probably also the most reliable way to
solve this problem. If there were two ways to terminate a PI,
what would your application do with data that contained both
terminators?
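One possible encoding for this purpose is Base64, whose alphabet (A-Z, a-z, 0-9, '+', '/' and '=') contains neither '?' nor '>'. The sketch below is only an illustration under that assumption: the class `PiPayload` and the PI target name are hypothetical, the target is assumed to contain no spaces, and `java.util.Base64` is a modern (Java 8 and later) library class.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class PiPayload {
    /** Wraps arbitrary command text in a PI; the Base64 body can never contain "?>". */
    static String toPi(String target, String rawCommands) {
        String encoded = Base64.getEncoder()
                .encodeToString(rawCommands.getBytes(StandardCharsets.UTF_8));
        return "<?" + target + " " + encoded + "?>";
    }

    /** Recovers the original command text from a PI produced by toPi. */
    static String fromPi(String pi) {
        int start = pi.indexOf(' ') + 1;      // body begins after the target name
        int end = pi.lastIndexOf("?>");       // body ends at the PI terminator
        byte[] raw = Base64.getDecoder().decode(pi.substring(start, end));
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

A producer would call something like `PiPayload.toPi("speech", nativeCommands)` and the consuming application would call `PiPayload.fromPi(...)` before handing the text to the synthesizer. The cost is that the PI is no longer human-readable in the document.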
Regards,
David
----------
From: Eric Baatz - Sun Microsystems Labs BOS
Sent: Monday, March 24, 1997 6:55 AM
To: xml-dev@ic.ac.uk
Cc: ebaatz@barbaresco.East.Sun.COM
Subject: Restriction on PI information
My application of XML is to mark up text that is to be spoken by
speech synthesizers. To my naive mind (I'm very new to SGML
and XML), a PI seems to be the right construct for passing
native information to a speech synthesizer, that is, instructions
in their proprietary, already existing, command set. As I don't
have any control over the syntax of the commands I want to
pass through, I want a PI to allow the widest latitude in
the information it can handle. The syntax in the draft doesn't
seem to allow that.
What is the rationale for the data that a PI allows?
What mechanisms can be used to make that data as arbitrary
as possible without changing the draft?
My take on the PI syntax is that the data needs to avoid
looking like the end of a PI. Two different ways of ending
a PI (somewhat like the use of double or single quotes for
quoted data) would allow a way of getting unpalatable data
through (my program would have to generate the appropriate
one depending on what my data looked like). Allowing a CDATA
section would also seem to allow quoting of otherwise
unpalatable data.
Clearly, any changes from the draft would complicate the parsing.
Eric Baatz
Sun Microsystems Laboratories
2 Elizabeth Drive, MS UCHL03-207 (508) 442-0257
Chelmsford, MA 01824 fax: (508) 250-5067
USA Internet: eric.baatz@east.sun.com
From cgirard at ags.com Mon Mar 24 19:16:21 1997
From: cgirard at ags.com (Girard, Craig)
Date: Mon Jun 7 16:57:37 2004
Subject: Beginner
Message-ID:
Does anyone know a good site to find information on XML for a beginner?
Preferably something with tutorials.
Craig Girard
Electronic Product Technician
Automated Graphic Systems
800-678-8760 x512
www.ags.com
From russc at watfac.org Mon Mar 24 19:33:19 1997
From: russc at watfac.org (Russ Chamberlain)
Date: Mon Jun 7 16:57:37 2004
Subject: GI vs. Element Type (Was: RE: JUMBO)
Message-ID: <01BC3860.3A99B8E0@watfac16.watfac.org>
Hello XMLers,
> lee@sq.com wrote:
>> > Peter Murray-Rust wrote:
>> I still don't know if there is a difference between 'GI' and
>> 'Element type', for example.
>
>In XML they are the same, as far as I can tell.
>
>The detailed reasoning follows, but you can ignore it if you like....
>
>Lee
>
>*
>
>An element type can be a generic identifier, a name group,
>a ranked element or a ranked group. [117; p. 406 of the SGML Handbook]
>
>We don't have RANK in XML, so
>An element type can be a generic identifier or a name group.
>
>For example,
>
>in SGML defines the content for both boy and girl.
>
>This is (I think) not allowed in XML, so in XML there is no practical
>difference between a GI and an element type.
Not quite true. You can achieve the identical effect with the following:
Here's the verbatim definitions from my ISO 8879 spec:
4.114 element type: A class of elements having similar
characteristics; for example, paragraph, chapter, abstract, footnote,
or bibliography.
4.145 generic identifier: A name that identifies the element type
of an element.
4.146 GI: generic identifier.
So, GI <==> generic identifier <==> element type. Or am I missing something (not so) obvious here?
So, are boy and girl of the same element type? The definitions imply (I think) that an element type is identified by a unique GI, and vice versa, so it looks to me that there should be no distinction between the two. Thus, boy and girl are of different element types (and have different GIs). Please correct me if I misunderstand.
>See also the definition of a GI:
(See above)
Perhaps there was some previous distinction between the two that is now lost in time? Is this an example of legacy terminology?
>The idea seems to be that a generic identifier specification is used
>to give in an instance the type of an element, once the parser has
>determined that an element is beginning to happen. The terminology
>seems so obfuscatory to me that I see no benefit to the distinction for
>SGML itself, let alone for XML, but maybe that is because I lack a legal
>background :-)
I (me and myself) do hereby instantiate my total concurrence with your most perspicacious and eloquent statement regarding obfuscation in SGML. (Agreed ;-)
Since when do lawyers write programs? Never! ;-)
They have lackeys (us) to do that for them, and if the lackeys can't read the spec, what kind of spec is it? XML, from my perspective, is a minor revolt by the lackeys (and their good friends) to get the spec pared down to something "reasonable". I just hope that the discussions about what is "reasonable" don't lead XML into interminable wrangling. Tim Bray's earlier point about the importance of a (perhaps) imperfect, yet SOLID, specification is well taken.
>[. . .Good point about SGML terminology deleted. . .]
- Russ
From Peter at ursus.demon.co.uk Mon Mar 24 20:49:35 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Beginner
Message-ID: <5006@ursus.demon.co.uk>
In message "Girard, Craig" writes:
> Does anyone know a good site to find information on XML for a beginner?
> Preferably something with tutorials.
>
This is an important question and it's something that a number of xml-dev'ers
have addressed in some way. Remember that we all learn in different ways, so
what I write may not fit your needs.
Firstly it's important for you to assess your needs in the light of what you
already know, for example are you:
- a programmer? (reading FAQ/xml-dev is as good as any, I suspect)
- an informatician, with some acquaintance of SGML? Then you need
to know what's different (and simpler). The FAQ covers
most of it.
- a newcomer to the field of structured information? I can offer some
simple tutorials and examples under:
http://www.venus.co.uk/omf/cml
Although there is a molecular bias, there are several that
make general sense. The XML is a simple subset (i.e. no
parameter entities, CDATA, marked sections, PIs, etc.)
There is also a tutorial on structured documents in general.
You may also find some of the SGML material useful, so long as you
simply take the principles. There is very little that can't be done in XML,
so things like 'A gentle introduction to SGML' (see Robin Cover's page:
http://www.sil.org/sgml for this and other introductory material) might be
useful. The Gentle Intro isn't XML, but not far off.
If you remember that tags must be balanced and attribute values must be
quoted, that's half of what you need to know for simple XML.
I would list the following ways of learning :-) For most of these, look
at the FAQ for links.
- reading the formal specs/BNF. (Yes, this seems unlikely to most of
us, but it's the way that a few people prefer.)
- reading a book. (There aren't any yet :-(
- looking at other people's examples. There are a few referred to from
the FAQ.
- running parsers (on examples). I find this very useful
- trying to develop your own application. You will need the parsers.
The uses of SGML (and therefore XML) are as varied as the uses of C. So I
suspect we shall get books like:
'Learn XML in 7 days and run a killer website'
'Financial applications in XML'
'XML for scientists and engineers'
'Building XML applications'
Finally, do feel free to post to this group. We have all been through this
process and *I* created a fair amount of bandwidth on comp.text.sgml when I
was learning:-). The community is extremely helpful. It also brings *us*
benefits, because we realise what things people are likely to find difficult
and how to present our programs and examples.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Mon Mar 24 20:49:40 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: GI vs. Element Type (Was: RE: JUMBO)
Message-ID: <5007@ursus.demon.co.uk>
These are extremely valuable contributions. I also had another mail which
confirmed that they were subtly different (the GI is the _name_ of an element
type, c.f. Lewis Carroll).
It would be extremely useful for us to collect the required terminology for
XML. If someone does it, I'll put it in XML using ISO 12620 terminology
(I have already written the DTD and rendering in JUMBO/CML, see
http://www.venus.co.uk/vhg/
for examples, and I simply need the content.)
Much of the definitions are already in electronic form from - I asked earlier
:-), but the important thing is to know which ones are required for XML. It
could be a much smaller subset.
Good terminology helps the creation of programs and documents, and makes it much
easier for newcomers. For example, there is a constant confusion between
tags, GIs and elements. A pictorial diagram would be very useful here.
I think it's very useful if the components of an API (e.g. Lark uses
Element, Entity, etc.) are generally agreed to follow the terminology - I am
sure that Tim has been careful here.
In message <01BC3860.3A99B8E0@watfac16.watfac.org> Russ Chamberlain writes:
> Hello XMLers,
>
> > lee@sq.com wrote:
> >> > Peter Murray-Rust wrote:
> >> I still don't know if there is a difference between 'GI' and
> >> 'Element type', for example.
> >
> >In XML they are the same, as far as I can tell.
[...]
> >
[...]
> Here's the verbatim definitions from my ISO 8879 spec:
>
> 4.114 element type: A class of elements having similar
> characteristics; for example, paragraph, chapter, abstract, footnote,
> or bibliography.
H'm. So there can be a hierarchy of element types in an SGML document.
>
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Mon Mar 24 21:16:00 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: TEI Pointers
Message-ID: <5008@ursus.demon.co.uk>
In a TEI pointer, is a string of the form:
FOO (1 DIV2) (3 DIV4) (5 DIV6)
identical to:
FOO (1 DIV2) FOO (3 DIV4) FOO (5 DIV6)
?
In the pointer:
FOO (1 BAR BAZ #IMPLIED)
do we simply interpret the absence of a BAZ attribute in
a BAR element as a match? Without a DTD there is no
information as to whether BAZ *could* exist as an attribute.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From tbray at textuality.com Mon Mar 24 22:31:17 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:57:38 2004
Subject: Another way to raid Jon's oven
Message-ID: <3.0.32.19970324142932.009c47c0@pop.intergate.bc.ca>
Here's another way to get at Jon's Sun data; create a little XML
file like so:
----------------------
]>
&SunURL;
-----------------------
Then run it through Lark, after doing a
lark.processExternalEntities(true);
Assuming you've got an Internet connection, Lark will go get it,
cheerfully ignoring extensions and mime types and so on; figuring out how
to make Lark copy it to output is left as an exercise for the user.
- Tim
From Peter at ursus.demon.co.uk Tue Mar 25 09:08:47 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Simple API
Message-ID: <5040@ursus.demon.co.uk>
[announced on comp.text.sgml]
Henry Thompson has posted an impressive picture of what a grove looks like:
http://www.cogsci.ed.ac.uk/~ht/grove.html
It describes the grove for a simple document (2 element types, 2 elements)
and it's sufficiently complex that only *part* of it is shown.
[I make it clear that I'm impressed by this, but that personally it would take
too much effort to implement for the benefit I would get. Many other readers
of xml-dev will probably find it's exactly what they want].
-----
It highlights for me that the spectrum of possible approaches to the API is
too large to pick an approach that suits everyone. The grove has obviously
enormous power if you take the time to learn it but it is not trivial.
Henry's diagram is much more reader-friendly than 10179, but confirms that
this isn't just a problem of terminology - it's an extra level of complexity.
My own suggestion is that we should produce a ReallySimple API independently
of the grove approach. I'm sure this won't cause a schism - we need something
to test out the language, build simple trees for trying out TEI pointers, etc.
IMO most of the things that are bugging us at the moment are not conceptual but
- how do we implement this bit of the spec?
- how do we read in both Files and URLs (a Java problem)
- how do we cater for applets and applications
- what structure do we hand over at the end? Can it be subclassed?
- how do we get at the DTD? (from a validating parser).
- how do we treat parameter entities :-)
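On the Files-versus-URLs point in the list above, one common approach is to try the system identifier as a URL first and fall back to a local path if it has no recognised scheme. The following is a minimal sketch of that idea; `SourceOpener` is a hypothetical name, and the file fallback uses `java.nio.file`, which is a later addition to Java and would not be permitted inside a sandboxed applet.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

final class SourceOpener {
    /**
     * Opens a system identifier as a byte stream.
     * Strings with a known scheme (http:, ftp:, file:, ...) are opened as URLs;
     * anything else is treated as a path on the local file system.
     */
    static InputStream open(String systemId) throws IOException {
        try {
            return new URL(systemId).openStream();
        } catch (MalformedURLException notAUrl) {
            // No scheme, or an unknown one: assume a plain file name.
            return Files.newInputStream(Paths.get(systemId));
        }
    }
}
```

A parser built this way can accept `doc.xml` and `http://.../doc.xml` through the same entry point, which keeps the application-versus-applet difference confined to one small class.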
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From h.rzepa at ic.ac.uk Tue Mar 25 10:29:07 1997
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 16:57:38 2004
Subject: XML list now searchable
Message-ID:
The XML archives;
http://www.lists.ic.ac.uk/hypermail/xml-dev/
have now been indexed using WAIS and are searchable.
Dr Henry Rzepa, Dept. Chemistry, Imperial College, LONDON SW7 2AY;
rzepa@ic.ac.uk; Tel (44) 171 594 5774; Fax: (44) 171 594 5804.
URL: http://www.ch.ic.ac.uk/rzepa/
From ht at cogsci.ed.ac.uk Tue Mar 25 15:57:25 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:57:38 2004
Subject: Simple API
In-Reply-To: Peter@ursus.demon.co.uk's message of Tue, 25 Mar 1997 09:52:48 GMT
References: <5040@ursus.demon.co.uk>
Message-ID: <529.199703251557@grogan.cogsci.ed.ac.uk>
Peter Murray-Rust wrote complimenting my picture of a grove fragment
(thanks!) but suggesting that it demonstrated that the grove concept
was too complex to serve as the basis for a minimal XML API.
I suspect the complexity is more apparent (i.e. in the graphics) than
real, stemming from my pedagogically directed efforts to exemplify
nearly everything in a very constrained space. A pretty simple set
of structures and access functions would encapsulate almost all of the
core property set modules. We are currently moving our existing LT
NSL tools (see http://www.ltg.ed.ac.uk/software/nsl/) to support XML,
using the existing API, which was developed 'pre-grove', but covers
most of the necessary information. Watch this space . . .
ht
From Peter at ursus.demon.co.uk Tue Mar 25 17:15:35 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Simple API
Message-ID: <5054@ursus.demon.co.uk>
In message <529.199703251557@grogan.cogsci.ed.ac.uk> "Henry S. Thompson" writes:
> Peter Murray-Rust wrote complimenting my picture of a grove fragment
> (thanks!) but suggesting that it demonstrated that the grove concept
> was too complex to serve as the basis for a minimal XML API.
It actually reminds me of mangroves :-)
I am a very geometrical thinker and so I appreciated the picture.
I would applaud any other efforts to represent things in diagrammatic
form - HenryT did a very useful diagram of 'pointers' for the WG. The
diagram lets me feel my way towards the solution (whereas some people are
capable of abstract thought). Diagrams like this are useful for the
heavy demand we shall get for educational material.
Eliot Kimber has just written quite a lot on groves on c.t.s, which might
be helpful.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Tue Mar 25 17:36:29 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: XML list now searchable
Message-ID: <5058@ursus.demon.co.uk>
In message "Rzepa, Henry" writes:
> The XML archives;
>
> http://www.lists.ic.ac.uk/hypermail/xml-dev/
>
> have now been indexed using WAIS and are searchable.
>
Many thanks Henry,
xml-dev is of great value to the XML community. With new members
continuing to subscribe, this will further preserve it as a
permanent resource.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From jjc at jclark.com Tue Mar 25 19:30:04 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:57:38 2004
Subject: Simple API
Message-ID: <2.2.32.19970325115530.00bd0a0c@jclark.com>
At 09:52 25/03/97 GMT, Peter Murray-Rust wrote:
>My own suggestion is that we should produce a ReallySimple API independently
>of the grove approach.
It's perfectly possible to have a "ReallySimple API" that is based on
groves, for example:
public interface Builder {
    SgmlDocument build(String url);
}

public interface Node {
    public abstract Node getParent();
    public abstract NodeList getChildren();
}

public interface NodeList {
    public abstract Node getItem(int i);
    public abstract int getCount();
}

public interface NamedNodeList {
    public abstract Node getItem(String name);
    public abstract NodeList toNodeList();
}

public interface SgmlDocument extends Node {
    public abstract NodeList getProlog();
    public abstract NodeList getEpilog();
    public abstract Element getDocumentElement();
    public abstract NamedNodeList getElements();
    public abstract NamedNodeList getEntities();
}

public interface Element extends Node {
    public abstract String getId();
    public abstract String getGi();
    public abstract NodeList getContent();
    public abstract NamedNodeList getAttributes();
    public abstract boolean getMustOmitEndTag();
}

public interface DataChar extends Node {
    public abstract char getChar();
}

public interface Pi extends Node {
    public abstract String getSystemData();
}

public interface ExternalData extends Node {
    public abstract Entity getEntity();
}

public interface AttributeAssignment extends Node {
    public abstract NodeList getValue();
    public abstract boolean getImplied();
    public abstract String getName();
}

public interface AttributeValueToken extends Node {
    public abstract String getToken();
    public abstract Element getReferent();
    public abstract Entity getEntity();
    public abstract Notation getNotation();
}

public interface Entity extends Node {
    public abstract String getName();
    public abstract ExternalId getExternalId();
    public abstract String getText();
    public abstract Notation getNotation();
}

public interface ExternalId extends Node {
    public abstract String getSystemId();
    public abstract String getPublicId();
}

public interface Notation extends Node {
    public abstract String getName();
    public abstract ExternalId getExternalId();
}
Is that really so complicated?
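To give a feel for how little client code such an API needs, here is a minimal sketch that walks a tree using only `getParent()`, `getChildren()`, `getItem()` and `getCount()`. The two interfaces are re-declared with the same method signatures so that the example stands alone; `SimpleTreeNode` and `TreeWalk` are hypothetical helpers invented for the illustration, not part of any parser.

```java
import java.util.ArrayList;
import java.util.List;

interface Node {
    Node getParent();
    NodeList getChildren();
}

interface NodeList {
    Node getItem(int i);
    int getCount();
}

/** Hypothetical in-memory node, present only so the walk has something to run on. */
final class SimpleTreeNode implements Node {
    private final String label;
    private SimpleTreeNode parent;
    private final List<SimpleTreeNode> kids = new ArrayList<>();

    SimpleTreeNode(String label) { this.label = label; }

    /** Appends a child, records its parent, and returns this node for chaining. */
    SimpleTreeNode add(SimpleTreeNode child) {
        child.parent = this;
        kids.add(child);
        return this;
    }

    String getLabel() { return label; }

    public Node getParent() { return parent; }

    public NodeList getChildren() {
        return new NodeList() {
            public Node getItem(int i) { return kids.get(i); }
            public int getCount() { return kids.size(); }
        };
    }
}

final class TreeWalk {
    /** Counts a node plus all of its descendants, depth first. */
    static int countNodes(Node node) {
        int total = 1;
        NodeList children = node.getChildren();
        for (int i = 0; i < children.getCount(); i++) {
            total += countNodes(children.getItem(i));
        }
        return total;
    }

    /** Number of ancestors of a node, found by following getParent() to the root. */
    static int depth(Node node) {
        int d = 0;
        for (Node p = node.getParent(); p != null; p = p.getParent()) {
            d++;
        }
        return d;
    }
}
```

Because the walk depends only on the two small interfaces, the same code would work unchanged over any implementation that supplies them, whatever tree it builds underneath.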
James
From Peter at ursus.demon.co.uk Tue Mar 25 20:19:22 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Simple API
Message-ID: <5067@ursus.demon.co.uk>
In message <2.2.32.19970325115530.00bd0a0c@jclark.com> James Clark writes:
James,
Thanks very much for taking the time to list this out. As you
imply, most of the concepts map directly onto 'common' SGML terminology.
This is a valuable starting point for people who are developing simple
systems.
[... API deleted...]
>
> Is that really so complicated?
Not when it's presented like this :-). The important thing for all of us
is to make sure that the terminology between different approaches is as
compatible as possible.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Tue Mar 25 23:03:17 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: TEI Pointers
Message-ID: <5068@ursus.demon.co.uk>
I have implemented a first pass at TEI pointers in JUMBO and would be
grateful for any checked examples of the results of applying these.
I am not quite sure where the WG discussions currently stand, and have so
far managed:
ROOT
[HERE]
ID based on attribute *name*, not type
CHILD
DESCENDANT
ANCESTOR
PREVIOUS
NEXT
PRECEDING
FOLLOWING
All these return Elements. I have left placeholders for SPACE and FOREIGN
since it is not clear what they return. I can't remember what the
groundswell of opinion was for PATTERN and in any case its syntax is not
in the draft. Does it use a regex? If so, what?
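For anyone wanting checked examples, the following is a minimal sketch of three of the keywords (CHILD, ANCESTOR, DESCENDANT) over a toy element tree. It is not JUMBO's code: the Elem and TeiSteps classes are hypothetical, and the sketch assumes the simple "n-th element with a given GI, counting from 1" form only, ignoring negative counts, ALL and attribute tests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal element tree, for illustration only.
class Elem {
    final String gi;
    Elem parent;
    final List<Elem> children = new ArrayList<>();

    Elem(String gi) { this.gi = gi; }

    // Appends a child and returns this element so calls can be chained.
    Elem add(Elem child) {
        child.parent = this;
        children.add(child);
        return this;
    }
}

class TeiSteps {
    // CHILD (n GI): the n-th child element with the given GI, or null.
    static Elem child(Elem from, int n, String gi) {
        int seen = 0;
        for (Elem c : from.children) {
            if (c.gi.equals(gi) && ++seen == n) {
                return c;
            }
        }
        return null;
    }

    // ANCESTOR (n GI): the n-th matching ancestor, counting upward
    // from the parent, or null.
    static Elem ancestor(Elem from, int n, String gi) {
        int seen = 0;
        for (Elem a = from.parent; a != null; a = a.parent) {
            if (a.gi.equals(gi) && ++seen == n) {
                return a;
            }
        }
        return null;
    }

    // DESCENDANT (n GI): the n-th matching descendant in document
    // (pre-order) order, or null.
    static Elem descendant(Elem from, int n, String gi) {
        List<Elem> hits = new ArrayList<>();
        collect(from, gi, hits);
        return (n >= 1 && n <= hits.size()) ? hits.get(n - 1) : null;
    }

    // Gathers matching descendants of e in pre-order; e itself is excluded.
    private static void collect(Elem e, String gi, List<Elem> hits) {
        for (Elem c : e.children) {
            if (c.gi.equals(gi)) {
                hits.add(c);
            }
            collect(c, gi, hits);
        }
    }
}
```

Each step returns a single element or null, so steps can be chained by feeding the result of one call in as the starting point of the next.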
(BTW, is there a typo in the draft? A1.1.1.6 (CHILD), after the box for
'element', line 3 refers to 'fourth and fifth' and I would think this was
'fifth and sixth')
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From bosak at atlantic-83.Eng.Sun.COM Sat Mar 29 01:23:59 1997
From: bosak at atlantic-83.Eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:57:38 2004
Subject: Dev Day demos
Message-ID: <199703290122.RAA01967@boethius.eng.sun.com>
Here (in no particular order) is the list of demos I have lined up for
the implementor's session in the XML track on Developer's Day at the
World Wide Web conference (April 11, 1997) in Santa Clara. Please let
me know immediately if anything has occurred to prevent your
appearance.
I will be sending further details by direct mail this weekend.
    ArborText              XML editor
    Grif                   XML editor
    Inso                   XML converter, Web server, and local browser
    Open Molecule Fndtn.   XML processor/renderer
    Sun Microsystems       XML Web site
    ICL                    XML server
    Fujitsu Laboratories   XML/DSSSL browser
    Tim Bray               XML parser
    Norbert Mikula         XML parser, DSSSL engine
    RivCom                 XML Netscape plug-in
    Univ. of Edinburgh     XML tools, DSSSL syntax checker
    Kevin Grimes           XML processor
----------------------------------------------------------------------
Jon Bosak, Online Information Technology Architect, Sun Microsystems
----------------------------------------------------------------------
2550 Garcia Ave., MPK17-101, Mountain View, California 94043
Davenport Group::SGML Open::NCITS V1::ISO/IEC JTC1/SC18/WG8::W3C XML
If a man look sharply and attentively, he shall see Fortune; for
though she be blind, yet she is not invisible. -- Francis Bacon
----------------------------------------------------------------------
From ser at javalab.uoregon.edu Sun Mar 30 07:43:59 1997
From: ser at javalab.uoregon.edu (Sean Russell)
Date: Mon Jun 7 16:57:38 2004
Subject: Entity replacement
Message-ID: <199703300546.VAA18376@javalab.uoregon.edu>
I was told this was a hotly debated topic, and I was wondering what the current status was. This is regarding section 4.3 of the XML working draft, 14-Nov-96.
As regards #2, #3, and #6, which claim that internal entities should be processed and replaced by their values by the parser before the data is returned to the application, will this requirement be changed?
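To make the question concrete, here is a minimal sketch of what "replaced by their values by the parser" could look like for internal general entities. It is an illustration only, not the draft's algorithm: EntityExpander is a hypothetical class, the name pattern is a simplification of the draft's grammar, and the replacement is a single pass, so references inside replacement text are not expanded again.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityExpander {
    // Matches references such as &name; using a simplified name pattern.
    private static final Pattern REF =
        Pattern.compile("&([A-Za-z][A-Za-z0-9._-]*);");

    // Replaces each declared reference with its value in one pass.
    // References with no declaration are passed through unchanged.
    static String expand(String text, Map<String, String> entities) {
        Matcher m = REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = entities.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement stops '$' and '\' in values being
            // treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this approach the application only ever sees the expanded text, which is exactly the behaviour the question is about.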
--- SER
From Peter at ursus.demon.co.uk Sun Mar 30 13:12:19 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: Entity replacement
Message-ID: <5246@ursus.demon.co.uk>
In message <199703300546.VAA18376@javalab.uoregon.edu> Sean Russell writes:
> I was told this was a hotly debated topic, and I was wondering what the
> current status was.
Probably by me :-), so I'll try to answer.
> This is regarding section 4.3 of the XML working draft, 14-Nov-96.
>
> As regards #2, #3, and #6, which claim that internal entities should be
> processed and replaced by their values by the parser before the data is
> returned to the application, will this requirement be changed?
The problem as I see it is not that anything requires change, but rather
clarification. The main problem seems to be with parameter entities.
The sort of problem that *I* don't know the answer to is whether a
parameter entity in a comment is expanded or whether a comment in a
parameter entity is expanded and in which order. Another is that PEs can
be nested something like:
">
but I doubt if I have got this right (I've deleted the WG discussion).
I also know from experience that *authoring* PEs can be quite tricky
(I used this at one stage to mimic directory names in resolving entities
and you have to get the quotations just right). It's therefore even more
important to get the parsing right :-)
The Editorial Review Board has promised enlightenment in the nearish future.
If you are really keen on this, have a play with a full SGML parser such as
nsgmls and see how PEs are treated there.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From Peter at ursus.demon.co.uk Sun Mar 30 18:39:31 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:57:38 2004
Subject: XML-LINK
Message-ID: <5250@ursus.demon.co.uk>
I am trying to think about how to (a) write and (b) process documents that
use the XML-LINK syntax. (For those who aren't up-to-date with the WG's
discussions, LINK is being discussed at this moment. Some parts of it
seem to have coalesced (i.e. not much visible discussion) and I'd like some
clarification on some points in the draft - I didn't manage to understand all
the discussion.) I am not seeking to re-open things which have been decided,
but rather to know how it should be used. I realise that some of the
discussion postdates the doc I am working from: (WD-xml-link-970305) at:
http://www.textuality.com/sgml-erb/WD-xml-link.html.
I'd be very grateful for comments on the following - I'll refer to sections.
These are my understandings - please correct them :-)
2.1 The preferred mechanism is to have multiple ATTLIST declarations for elements which
carry the XML-LINK attribute. This is currently illegal in SGML, although
this is likely to change and XML is rather hoping this will be soon. So a
DTD might have an A element (similar to HTML):
and later
To get round the illegality it would be allowed (though messy) to combine
these into a single ATTLIST.
***If SGML is not revised, the *parser* would have to process multiple
attlists***. The documents before parsing would not be valid SGML.
For a well-formed document the link attributes may have to be inserted
in the document. No changes are necessary for the parser.
NOTE: If XML-LINK is added from the DTD, all XML-LINK values are identical
("#FIXED"). If they are added within the document, they *could* have different
values and the parser would not complain, but this seems to break the spirit
of the draft. If ATTLIST A XML-LINK is provided in the DTD, then any
attributes in the document must be #FIXED (?), and so are redundant. If they
do not agree it's an error (even in a WF document?).
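One possible reading of the #FIXED behaviour can be sketched as follows. This is an assumption-laden illustration, not something the draft spells out: FixedAttributes is a hypothetical helper, attributes are modelled as plain name/value maps, and a disagreement between the document and the DTD is treated as an error in every case.

```java
import java.util.HashMap;
import java.util.Map;

class FixedAttributes {
    // Returns the attributes given in the document with the DTD's
    // #FIXED values filled in. Throws if the document supplies a
    // different value for a fixed attribute.
    static Map<String, String> apply(Map<String, String> fixed,
                                     Map<String, String> given) {
        Map<String, String> result = new HashMap<>(given);
        for (Map.Entry<String, String> f : fixed.entrySet()) {
            String docValue = given.get(f.getKey());
            if (docValue != null && !docValue.equals(f.getValue())) {
                throw new IllegalArgumentException(
                    "Attribute " + f.getKey() + " is #FIXED to \""
                    + f.getValue() + "\" but the document says \""
                    + docValue + "\"");
            }
            result.put(f.getKey(), f.getValue());
        }
        return result;
    }
}
```

Under this reading a value repeated in the document is harmless but redundant, and only a conflicting value is rejected.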
Table 3.2 I have difficulty in understanding this, especially the very similar
terms LINK, XML-LINK, XLINK and XML-XLINK. My understanding is this:
The table does not (although it appears to) define an XML-XLINK element. My
understanding is that 'XML-XLINK' is a generic variable replaceable by 'FOO'
or whatever for as many elements as the DTD author requires. So in the above
example, 'XML-LINK' would be replaced by 'A'. (I assume the same for
XML-LOCATOR, XML-LINK, XLG and XLD). (The five tables in 3.2, 3.3, 6.1
correspond to the five allowed values of XLINK, which must be #FIXED for each).
There is a different number of attributes for each of the five types (given
in the tables). If an attribute occurs in more than one of the five it always
has the same form apart from XML-LINK.
Elements with the attribute XML-LINK="XLINK" have a content model which
can only include #PCDATA or 'XML-LOCATOR'. Since 'XML-LOCATOR' is determined
by its XML-LINK attribute value and not by its GI a normal SGML parser cannot
detect this. [A similar argument holds for elements with XML-LINK="XLG"].
***The *?parser?* will have to determine whether elements with attribute
XML-LINK="XLINK" contain only elements with attribute XML-LINK="LOCATOR"
(or #PCDATA). This presumably has to hold for well-formed documents without
an internal DTD, but with explicit attribute values. Or does the parser
simply look for well formedness and leave this slightly hairy problem to
the application/link_processor?***
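Wherever the check ends up living, it is small. The sketch below shows one way a parser or link processor might do it, under stated assumptions: LinkElem and XLinkCheck are hypothetical names, #PCDATA is not modelled at all (text is simply ignored), and only the XLINK/LOCATOR containment rule is tested.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical element with attributes and child elements only.
class LinkElem {
    final String gi;
    final Map<String, String> attrs = new HashMap<>();
    final List<LinkElem> children = new ArrayList<>();

    // linkType may be null for elements without an XML-LINK attribute.
    LinkElem(String gi, String linkType) {
        this.gi = gi;
        if (linkType != null) {
            attrs.put("XML-LINK", linkType);
        }
    }

    // Appends a child and returns this element so calls can be chained.
    LinkElem add(LinkElem child) {
        children.add(child);
        return this;
    }
}

class XLinkCheck {
    // True if every element with XML-LINK="XLINK" in the subtree has
    // only child elements with XML-LINK="LOCATOR".
    static boolean containmentHolds(LinkElem e) {
        boolean isExtended = "XLINK".equals(e.attrs.get("XML-LINK"));
        for (LinkElem c : e.children) {
            if (isExtended && !"LOCATOR".equals(c.attrs.get("XML-LINK"))) {
                return false;
            }
            if (!containmentHolds(c)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the test looks only at attribute values and never at GIs, it works the same way whether the values came from #FIXED defaults in a DTD or were written out explicitly in a well-formed document.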
If no ENTITY is defined for FOO, and appears
in the document, what happens? Is the parser or application required to
detect this as a reserved attribute ***and fill in all the others in the draft
for that XML-LINK type?***
So, assuming this is on the right lines, there are three uses of XML-LINK:
(a)LINK by itself
(b)XLINK/LOCATOR working together
(c)XLG/XLD working together.
I presume the syntax looks something like:
(a) (Assume element A as above):
This is the Home of the
Elephant house and the
and similarly for %LOCATOR-attribs
in the document:
In the we can find
and
and some monkeys
tomorrow
If I am anywhere near right, this means that:
The text (#PCDATA) will be displayed along with an image of Nellie and a button
(application-defined) to JUMBO and the monkeys. When JUMBO-button is pressed,
then an additinal window (the Jumbo browser?) is launched. When the monkey
buton is pressed the current window disappears to be replaced by gibbering.
Presumably the application decides whether the Jumbo browser is killed at
this stage.
[I am not clear what the SHOW/ACTUATE, etc. do for the XML-XLINK container.
Presumably the contents could be hidden until a button was pressed? In the
example, the whole contents of ZOO would be INCLUDEd in the Paragraph?]
(c) is a list of documents and presumably straightforward?
It would be useful to have comments and other examples for XML-LINK as it may
impact on XML-LANG. For example, a DTD should not be designed with attributes
such as ROLE, TITLE, SHOW if it is likely to be used for XML-LINK at a later
stage - perhaps this should appear in the draft?
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From dseibert at sqwest.bc.ca Mon Mar 31 19:52:48 1997
From: dseibert at sqwest.bc.ca (David Seibert)
Date: Mon Jun 7 16:57:38 2004
Subject: Entity replacement
Message-ID: <01BC3DB9.21A3AC70@sqruffy.west.sq.com>
1) PEs aren't supposed to be recognized inside comments (nothing is except the
terminal '*-->'), so they aren't supposed to be expanded. This is also true for
entities in cdata.
2) Expansion of comments inside parameter entities shouldn't matter, since
comments can go anywhere that PEs can. The exact treatment could probably
depend on the application handling the document.
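Point 1 can be sketched as a small scanner that expands %name; references everywhere except inside comments. This is a simplified illustration, not the draft's rules: PeExpander is a hypothetical class, quoted literals and nested references are ignored, and undeclared references are copied through untouched.

```java
import java.util.Map;

class PeExpander {
    // Expands %name; references outside <!-- ... --> comments.
    // Comment text is copied verbatim, so nothing in it is recognised.
    static String expand(String dtdText, Map<String, String> pes) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < dtdText.length()) {
            if (dtdText.startsWith("<!--", i)) {
                // Copy the whole comment; an unterminated one runs to the end.
                int end = dtdText.indexOf("-->", i + 4);
                int stop = (end < 0) ? dtdText.length() : end + 3;
                out.append(dtdText, i, stop);
                i = stop;
            } else if (dtdText.charAt(i) == '%') {
                int semi = dtdText.indexOf(';', i);
                String name = (semi > i) ? dtdText.substring(i + 1, semi) : null;
                if (name != null && pes.containsKey(name)) {
                    out.append(pes.get(name));
                    i = semi + 1;
                } else {
                    // Not a declared reference: keep the '%' literally.
                    out.append('%');
                    i++;
                }
            } else {
                out.append(dtdText.charAt(i));
                i++;
            }
        }
        return out.toString();
    }
}
```

Given a declaration of content as (#PCDATA), the input "<!-- %content; --> <!ELEMENT P %content;>" keeps its comment unchanged and only the reference in the element declaration is replaced.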
David