From cbullard at hiwaay.net  Sun Feb  1 00:34:02 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:00:03 2004
Subject: XSL/XML/XLL and VRML (was: Re: Conditional actions in XSL?)
References: <4955E202FE46D11195C500609712EB6B05C193@FLPS-NTSERVER1>
Message-ID: <34D3C277.5002@hiwaay.net>

Tony Stewart wrote:
> 
> Len Bullard wrote:
> 
> >>It can do what DTDs do well:  provide a precise description of the
> presentation style of the interface as a set of routed behaviors.
> 
> I would have thought that a good DTD doesn't do this at all. The DTD
> should define the information content, leaving both style and (IMO)
> behavior to be specified in a stylesheet that is tailored to this
> specific usage of the information. 

> Thus, it is the style sheet that describes
> the presentation style, not the DTD. Otherwise, how are you going to
> reuse the information in other formats? You're not going to want to
> change the DTD. And you may not have permission to do so in any case.
> 
> Since this is all pretty basic religious thinking, perhaps I
> misunderstood you.

One could say that it is a religious conviction in some cases and 
be quite right, and in others, it is an engineering constraint and 
be right.  It is the *SGML Way*.  In that sense, yes, it is a religion, 
and for some years, I practiced it.  "But what is the good, Phaedrus?"

Look at what you are saying:

1.  Stylesheet properties are not "information"

2. Stylesheets express behaviors.  So in fact, a stylesheet 
   language is a programming language, Turing complete if you will.

3.  For some kinds and instances of information, there are lifecycle 
    requirements for reuse.

4.  For some kinds and instances of information (DTDs in your example), 
    there are policies for the behaviors that can be applied to the 
    kinds and instances of information.

1.  I don't think you intend the first.  But it is often a hidden premise in 
the debates about separating style from content (which is how you are
using "information").  That distinction proves to be thin.  Perhaps 
by stylesheet information, you mean typographic properties.

2.  Stylesheets that express behaviors are simply programming languages 
with structures (data types) for typographic properties.  In this 
view, Java/AWT et al. is a stylesheet language.  After that, choosing one 
comes down to practical engineering requirements of platforms, libraries, 
interoperation with other engines, etc.  Anyway, in this view, VRML 
is a stylesheet language.  Perhaps the best way for it to include 
text support is to include it natively.  This idea has come up and 
there is a text node in VRML which browsers like WorldView can display 
very well.  (NOTE:  The issue of reformulating VRML as XML is one 
of framework efficiency, not descriptive power or lifecycle.)

3.  This is true of course.  But unless requirements are very carefully 
examined, no size fits all.

4.  True and it varies widely.  One of the features of DTDs that make them 
very attractive for policy is the ease with which they can 
be adjusted liberally at the site of use.  This one slips by most of the 
SGML theorists who do not work in production sites where multiple versions 
of DTDs are used at different points of a process or procedure.  In other 
words, they are an instrument of policy, not a policy.  Information is not 
static where a high rate of change prevails.  A DTD is more like a control 
knot in a NURB than a point in a B-spline.

My point is that for many information engineering problems, the approach 
Pierre took with Prototype has been taken by others and successfully.  
The arbiter of success is not the religion of the SGML Way, but the 
ability to meet the requirements of the task.  Bytes aren't holy.

As XSL/XML/XLL reach ever greater levels of design complexity in the 
base standards, a question emerging in other design groups (one heard 
before during the HyTime/DSSSL era) is:  Are these really complicated 
solutions looking for problems, not new and vital technologies?  Is 
their sudden rush of popularity based on soundness of applicability, 
or is it the product of software companies juggling public perceptions?  

If simpler, more readily available, and more easily understood 
technologies exist to solve a problem within an acceptable timeframe, 
the experienced engineer and the practical customer adopt 
them.  If not, they try the next best thing.  Is XML a *religion* 
or just the next best thing?

Len Bullard

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From dima at paragraph.com  Sun Feb  1 00:47:54 1998
From: dima at paragraph.com (Dmitri Kondratiev)
Date: Mon Jun  7 17:00:03 2004
Subject: SGML Architecture questions
Message-ID: <2.2.32.19980201004632.00719004@dream.paragraph.com>

I may be wrong, but from my understanding of SGML architectures, only the
bridging mechanism provides for type extension. Everything else in an
architecture seems to be remapping of element and attribute names. A bridging
element serves as a target for mapping substructure to it. Still, the bridging
element is not defined in the DTD, and as a result its content and attributes
can't be validated by the parser. Is that correct?

Taking bridging example from "A Tutorial Introduction to SGML Architectures"
by W. Eliot Kimber, with architectural DTD :

<!-- Person Name and addresses architecture ("personarch")-->
<!ELEMENT person 
  (name,
   address?)
>
<!ELEMENT name 
   (#PCDATA | archbridge)*
>
<!ELEMENT address
   (#PCDATA | archbridge)*
>
<!ELEMENT archbridge
   (#PCDATA | archbridge)*
>

And mapping from elements in the document to elements in the architecture :

<?XML version="1.0" ?>
<?IS10744:arch name="personarch"
  bridge-form="archbridge"
?>
<!DOCTYPE customer.record [
 <!ATTLIST customer.record personarch NAME #FIXED "person" >
 <!ATTLIST  cust.name      personarch NAME #FIXED  "name"   >
 <!ATTLIST   last          personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   first         personarch NAME #FIXED   "archbridge" >
 <!ATTLIST  cust.address   personarch NAME #FIXED  "address" >
 <!ATTLIST   street        personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   city          personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   state         personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   zip           personarch NAME #FIXED   "archbridge" >
]>
<customer.record>
 <cust.name><last>Kimber</last><first>William</first></cust.name>
 <cust.address>
 <street>1234 Maple St.</street>
 <city>Austin</city><state>TX</state><zip>78757</zip>
 </cust.address>
</customer.record>

There is no DTD for <archbridge> element content so:

<last>Kimber</last><first>William</first>

could be :

<last>Kimber</last><middle>Eliot</middle><first>William</first>

So my question is:
how can validity constraints be enforced for the substructure of a bridging element?

Thanks,
Dima


-----------------
Dmitri Kondratiev
dima@paragraph.com
102401.2457@compuserve.com
http://www.geocities.com/SiliconValley/Lakes/3767/
tel: 07-095-464-9241




From cbullard at hiwaay.net  Sun Feb  1 01:01:24 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:00:03 2004
Subject: First experiences with XSL
References: <2.2.32.19980130155416.0085e27c@pop>
Message-ID: <34D3C820.2671@hiwaay.net>

Sharon Adler wrote:
> 
> Michael,
> 
> As I write this, the XSL WG is 2/3 through its first official meeting.  The
> Microsoft code does not represent the "Final" XSL but the strawman of some of
> the facilities of XSL.  The lack of diagnostics/limited functionality of a
> partial prototype implementation is not any indication of the functionality
> or capability of a style language, nor any final implementation. Of course
> you can accomplish what you wanted in Java.  Any hacker can do anything they
> want in code, but what about the rest of the world's humans.

Can anyone show that XSL (if indeed a Turing complete language) is any easier 
than Java?  XSL is a programming language and there are far more mortals 
(programmers in some cases) who understand and can easily use Java than 
XSL/DSSSL.  Why?  Object-oriented programming is the rule, 
not the exception, in programming communities.  JavaScript has a tremendous 
advantage in that stepping up to Java from JavaScript incurs no 
shocks of syntax.  It is an easy transition.

Since at least C, it has been the support libraries 
that made the difference in ease or utility because, syntax 
and side-effect issues aside, the same features are found in most programming 
languages.  So, one might retreat to the defense of "But it is a standard" 
and there one would have a point.  Unless and until Sun releases Java 
as a true standard (a PAS won't cut it), implementors of systems 
based on it create systems based on proprietary technology.

> Please don't use the XSL prototype if it is not suitable for you to play
> around with, but give us a chance to create a workable standard.

But of course.

len bullard



From jamsden at us.ibm.com  Sun Feb  1 01:07:30 1998
From: jamsden at us.ibm.com (Jim Amsden)
Date: Mon Jun  7 17:00:03 2004
Subject: XSL/XML/XLL and VRML (was: Re: Conditional actions in XS
Message-ID: <5040100014394115000002L052*@MHS>

Tony Stewart wrote:

>>I would have thought that a good DTD doesn't do this at all. The DTD
>>should define the information content, leaving both style and (IMO)
>>behavior to be specified in a stylesheet that is tailored to this
>>specific usage of the information.

More religion:
Information content should be subordinate to behavior, not the other way
around. The DTD defines the information structure required to support
(unfortunately) implied behavior which establishes the meaning of that data in
the context in which it was defined. Attributes establish characteristics which
maintain state supporting variant behavior. Contents and links represent
associations supporting additional state, and enabling collaborations with
other elements required to support behavior, including behavior of the document
as a whole. Of course, none of this has anything to do with rendering unless
that's the subject of the DTD. Note that if a language is rich enough, it
doesn't have to change just because the subject area changes. This might be the
basis of the appeal of XSL and XML-Data which both use XML (more or less) to
describe their subject areas.



From dcarlson at ontogenics.com  Sun Feb  1 02:16:05 1998
From: dcarlson at ontogenics.com (Dave Carlson)
Date: Mon Jun  7 17:00:03 2004
Subject: problems with emacs xml-mode
Message-ID: <2.2.32.19980201021007.00e40c30@pop.dimensional.com>

At 05:57 PM 1/31/98 -0500, David Megginson wrote:

> > 2.  The DTD is parsed, but all element names are folded into all lower case.
> > Does the current version of xml-mode support mixed-case element names?  If
> > so, what am I doing wrong?
>
>Are you certain that you're using the latest version of the patches
>(from Fall 1997) and that you're actually in XML rather than SGML
>mode?  Does it read 'XML' or 'SGML' in the mode bar at the bottom?

I'm using the xml-mode that I downloaded from your site in December 1997.
And, yes, it does read 'XML' in the mode bar.  I'll try some additional
testing to see if I can narrow down the problem.  Is there some other test I
can run to be sure I've got the entire xml-mode installed properly?  I had
to do some manual hacking to install on WinNT, maybe I messed up somewhere.

I've never gotten it to work correctly, but sometimes I get the top-level
element names in mixed case, and the content model all folded to lower case.
So, I can add mixed case elements at the top level, but there are no "valid"
sub-elements because the content model has all tags in lower case.  In
another test, everything was lower case.

> > 4.  Font highlighting has some problems.  I've configured my _emacs file
> > according to earlier posts in this list, but the text highlighting only
> > appears after I've used the context menu to insert a new tag.  Then, the
> > text is only highlighted from that point *backward* in the document.  When I
> > first load a document, no text is highlighted.
>
>Again, this is not directly related to the XML patches.  PSGML will
>highlight only the parts of the document that it has already parsed.
>In Unix, at least, it will eventually parse ahead and highlight the
>whole thing.
>
Yes, it will eventually highlight the entire document, once I've made an
addition to the end of the document.

Thanks for your help, and your contribution!

Dave




From eliot at isogen.com  Sun Feb  1 13:57:12 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun  7 17:00:03 2004
Subject: SGML Architecture questions
Message-ID: <3.0.32.19980201074848.00c84c30@swbell.net>

At 03:46 AM 2/1/98 +0300, Dmitri Kondratiev wrote:
>I may be wrong, but from my understanding of SGML architectures, only the
>bridging mechanism provides for type extension. Everything else in an
>architecture seems to be remapping of element and attribute names. A bridging
>element serves as a target for mapping substructure to it. Still, the bridging
>element is not defined in the DTD, and as a result its content and attributes
>can't be validated by the parser. Is that correct?

The bridging element *is* defined in the DTD, so its use can be validated
by the parser, but your real question is:

>how can validity constraints be enforced for the substructure of a bridging element?

You do it locally in the document's own DTD, or you do it by deriving the
bridging element from another architecture. 


>There is no DTD for <archbridge> element content so:

Yes there is: (#PCDATA | archbridge)* 

However, your point is that you might want to impose constraints on the
local (to this document) content of elements that map to archbridge.  You
could define, locally, the content for the name element to match your
constraints:

<?XML version="1.0" ?>
<?IS10744:arch name="personarch"
  bridge-form="archbridge"
?>
<!DOCTYPE customer.record [
 <!ATTLIST customer.record personarch NAME #FIXED "person" >
 <!ELEMENT  cust.name (last, first) ><!-- NOTE: local content model -->
 <!ATTLIST  cust.name      personarch NAME #FIXED  "name"   >
 <!ATTLIST   last          personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   first         personarch NAME #FIXED   "archbridge" >
 <!ATTLIST  cust.address   personarch NAME #FIXED  "address" >
 <!ATTLIST   street        personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   city          personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   state         personarch NAME #FIXED   "archbridge" >
 <!ATTLIST   zip           personarch NAME #FIXED   "archbridge" >
]>
<customer.record>
 <cust.name><last>Kimber</last><first>William</first></cust.name>
 <cust.address>
 <street>1234 Maple St.</street>
 <city>Austin</city><state>TX</state><zip>78757</zip>
 </cust.address>
</customer.record>
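
The effect of the local content model can be checked with any validating parser. What follows is a minimal sketch only, not part of the original exchange: it uses the javax.xml.parsers API of a present-day JDK (which postdates this thread), reduces the record to the name elements, and leaves out the architecture attributes and the processing instructions. The class name LocalContentModelDemo and the reduced DTD are illustrative assumptions. It checks the document against its own DTD, not against the personarch architecture.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; the DTD is a reduced form of the declarations above.
final class LocalContentModelDemo {
    static final String DTD =
        "<!DOCTYPE customer.record [\n"
      + " <!ELEMENT customer.record (cust.name)>\n"
      + " <!ELEMENT cust.name (last, first)>\n"   // the local content model
      + " <!ELEMENT last (#PCDATA)>\n"
      + " <!ELEMENT first (#PCDATA)>\n"
      + "]>\n";

    /** Returns the validity errors reported for the given document body. */
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the internal DTD subset
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (Exception e) {
            errors.add("fatal: " + e.getMessage());
        }
        return errors;
    }
}
```

Calling validate on a record whose cust.name holds only last and first should return an empty list, while a record with an added middle element should return at least one error, since middle is neither declared nor allowed by (last, first).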

You can also do it by deriving the bridging element from another architecture:

(This modifies the above declarations:)

<?IS10744:arch name="namearch"
   public-ID="Architecture for the rules for names of people and things"
?>
 <!ELEMENT  cust.name (last, first) >
 <!ATTLIST  cust.name      personarch NAME #FIXED  "name"   
                           namearch   NAME #FIXED  "person-name">
 <!ATTLIST   last          personarch NAME #FIXED  "archbridge" 
                           namearch   NAME #FIXED  "lastname">
 <!ATTLIST   first         personarch NAME #FIXED  "archbridge"
                           namearch   NAME #FIXED  "firstname">

This says that the cust.name element plays the role "name" within the
personarch architecture and the role "person-name" within the namearch
architecture. I can validate that cust.name satisfies the rules for "name"
as defined by the personarch and that its content satisfies the rules for
"person-name" in the namearch.

Notice how the cust.name element "bridges" from the personarch architecture
to the namearch architecture, or from the architecture to the local
(document-specific) rules.

Cheers,

Eliot
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 95202.  214.953.0004
www.isogen.com
</Address>



From cbullard at hiwaay.net  Sun Feb  1 17:16:07 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:00:03 2004
Subject: First experiences with XSL
References: <Pine.BSI.3.91.980131232552.6849A-100000@eccnet.eccnet.com>
Message-ID: <34D4AD72.49CE@hiwaay.net>

Betty Harvey wrote:
> 
> On Sat, 31 Jan 1998, len bullard wrote:
> 
> >
> > Can anyone show that XSL (if indeed a Turing complete language) is any easier
> > than Java?  XSL is a programming language and there are far more mortals
> > (programmers in some cases) who understand and can easily use Java than
> > XSL/DSSSL.  Why?  Object-oriented programming is the rule,
> > not the exception, in programming communities.  JavaScript has a tremendous
> > advantage in that stepping up to Java from JavaScript incurs no
> > shocks of syntax.  It is an easy transition.
> >
> 
> Len:
> 
>         My experience is that XSL is easier.  I was able to
> take the XSL tutorial and create a simple example of an
> XSL stylesheet.
> 
>         If you have Microsoft Explorer 4.0 or higher you can test my first
> example at: http://www.eccnet.com/xmledi.
> 
>         My initial thoughts are that it doesn't do everything I
> want it to do - but I am going to hold judgement until the XSL
> standard becomes more stable.  Initially - I am impressed and
> looking forward to what XSL will offer us - thank goodness
> someone is not only thinking about style and behavior but
> moving towards a standard implementation effort - what
> FOSI tried to do 8 years ago.
> 
> Betty

That is good to hear.  Yet, the XSL/XLL discussion to me 
has the feel of attending a summer stock presentation of 
Hamlet:  famous lines all carefully memorized, spoken 
thousands of times before, and Hamlet still dies in the 
last scene.  Don't take it as an "I don't like XSL" but as 
a cautionary "we know our parts so well we can sleepwalk 
through them."  So yes, compelling examples are needed.  
The FOSI perished in complexity, HyTime has almost met 
the same fate, and DSSSL never got out of the gate before 
events and technology overtook it. 

We have to meet the criticism that XML technology is a 
solution looking for a problem.  We need something better 
than the same defenses we presented for SGML/HyTime/DSSSL 
to the same criticism.

<crystalBall>
I sense a deflation in the enamouring of the Web.  Joe Q 
Public has discovered the anemia of the infrastructure.  
Still, experimental team efforts such as VRMLDream, which 
will demonstrate a puppeteering technology for virtual 
theatre, have promise.  For these applications, it is 1945 
and each TV network is a world unto itself.  These groups 
see the Internet as a broadcasting medium.   Maybe Clinton 
will survive his current problems and deliver on that 
"1000X the bandwidth" promise.  There is little doubt that 
replacing the Internet infrastructure is needed ASAP.

Business interest is stable, yet the groups who control 
the corporate standards are from printing backgrounds 
and marketing.  They see the Internet as a publishing 
medium.  They tend to be underwhelmingly technically 
talented, averse to technology whose practitioners they
do not control, and able to restrict the application at 
the heart of the matter:  funding.  While the true 
practitioner seeks to expand capability, the purse stringers seek to 
restrict it, and successfully.  It is necessary to look 
at the whole of the framework and how that can best meet 
business needs, in content development, maintenance, 
production, and distribution.  The architectures must 
be sold accordingly.  (one rung up the CALS spiral).

Beware jargon; beware complex examples, 
beware precise description that fails to engender  
imaginative application.  The hook is the imagination. 
Sink the hook to reel in the fish.  Overall efficiency  
is becoming the primary issue given the size 
and bugginess of the framework.  Building  
evermore compelling and sustainable content is still 
the goal.  Just remember that many many groups do not 
believe that putting long lifecycle information assets on 
the WWW is a good thing to do.  Find out why. </crystalBall>

best,

len



From peter at ursus.demon.co.uk  Sun Feb  1 17:18:12 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:03 2004
Subject: JUMBO9801a1 release
Message-ID: <3.0.1.16.19980201170326.1a4f992a@pop3.demon.co.uk>


An updated version of the alpha JUMBO distribution (hopefully with the
earlier bugs removed) is available at:
	http://www.vsms.nottingham.ac.uk/vsms/java/jumbo/jan9801/jumbo9801a1.zip

This should supersede the earlier version.

The JUMBO in this distribution now runs as an APPLET as well as the
application described previously and you are welcome to experiment. Since
applets require classes to be 'under' the codebase, I have not tested the
SAX-compliant parsers; experiments and feedback are welcome. Note that some
of the text fields are no longer included in the distribution and should be
downloaded from the appropriate sites.

As before I welcome reports of gross errors (e.g. it doesn't run). 

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From ak117 at freenet.carleton.ca  Sun Feb  1 20:43:19 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
Message-ID: <199802012028.PAA00747@unready.microstar.com>

As promised, I will now begin to summarise the requested changes to
SAX before we put out a stable 1.0 version: over the next few days, I
will send out one message summarising the requested changes to each
interface or class.  For more information on SAX, see

  http://www.microstar.com/XML/SAX/


There have been only two changes proposed to the Parser interface,
both of which would be backwards-compatible with existing
implementations:

1) Allow SAX to work with an input stream as well as a URI.

2) Simplify handler chaining by adding get* methods for existing
   handlers.


Here are the change requests in detail, with my initial response at
the end of each one:


1) Allow SAX to work with an input stream as well as a URI.

   - Paul Pazandak <pazandak@OBJS.com>
   - Peter Murray-Rust <peter@ursus.demon.co.uk>
   - Don Park <donpark@quake.net>

   Currently, the Parser interface provides only the following method
   to initiate a parse:

     void parse (String publicId, String systemId)
       throws java.lang.Exception;

   Following this suggestion, there would be a new method

     void parse (String publicId, String systemId, InputStream input)
       throws java.lang.Exception;

   (It is still necessary to provide a system identifier for resolving
   relative URIs within the stream).  Note that the stream would be a
   byte stream, not a character stream -- characters might require
   more than one octet, depending on the encoding in use.  

   I can see the convenience of this method, and I plan to add
   something like this to AElfred when I have a chance.  For SAX,
   however -- which is meant to end up as a language- and
   system-independent API -- I am reluctant to hardcode assumptions
   about storage (and I don't know enough about IDL to know if there
   is a general representation for streams).  Paul Pazandak has also
   suggested allowing strings and buffers -- in this case, they would
   already be decoded into characters.

   Personally, I'm undecided, and would be interested in hearing the
   theoretical arguments for and against this suggestion.
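
The remark about byte streams versus character streams can be made concrete with a short sketch. It is an illustration only and not part of SAX: the class and method names are hypothetical, and UTF-8 is assumed as the encoding, whereas a real parser would have to detect the encoding from the stream itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of SAX.
final class OctetVersusCharacterDemo {
    /** Counts the characters decoded from a byte stream; UTF-8 is assumed. */
    static int countCharacters(InputStream input) {
        try (Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8)) {
            int count = 0;
            while (reader.read() != -1) {
                count++; // one decoded character, possibly built from several octets
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the document "<a>é</a>" encoded as UTF-8, the stream holds 9 octets but decodes to 8 characters, because "é" occupies two octets. A parse method that accepts an InputStream therefore has to do its own decoding, while one that accepts strings or buffers receives characters that are already decoded.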


2) Simplify handler chaining by adding get* methods for existing
   handlers.

   - Don Park <donpark@quake.net>

   Currently the Parser interface provides only setters for the
   various handlers:

     public void setEntityHandler (EntityHandler handler);
     public void setDocumentHandler (DocumentHandler handler);
     public void setErrorHandler (ErrorHandler handler);

   Following this suggestion, there would also be accessors:

     public EntityHandler getEntityHandler ();
     public DocumentHandler getDocumentHandler ();
     public ErrorHandler getErrorHandler ();

   An application could then retrieve the existing handler and
   implement a new one which invokes the old one under certain
   circumstances.

   This seems like a generally good idea (as well as a simple and
   backwards-compatible change), and I am willing to implement it.
   The only complication is that we'll have to define the default
   state -- is the parser always required to return a default handler
   if the user has not explicitly set one, or should it return null?
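
Handler chaining of the kind described in the second request might look roughly like the sketch below. The types are cut-down, hypothetical stand-ins for the SAX interfaces (a single warning method, a report method that simulates the parser raising a warning), and the sketch assumes the "return null" answer to the open question about the default state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, reduced stand-in for an error handler interface.
interface ErrorHandlerSketch {
    void warning(String message);
}

// Hypothetical parser stub with both a setter and the proposed getter.
final class ParserSketch {
    private ErrorHandlerSketch errorHandler; // null until the application sets one

    void setErrorHandler(ErrorHandlerSketch handler) { this.errorHandler = handler; }

    ErrorHandlerSketch getErrorHandler() { return errorHandler; }

    /** Stands in for the parser reporting a warning during a parse. */
    void report(String message) {
        if (errorHandler != null) {
            errorHandler.warning(message);
        }
    }
}

final class ChainingDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        ParserSketch parser = new ParserSketch();
        parser.setErrorHandler(message -> log.add("old: " + message));

        // Retrieve the existing handler and install one that wraps it.
        ErrorHandlerSketch previous = parser.getErrorHandler();
        parser.setErrorHandler(message -> {
            if (message.startsWith("deprecated")) {
                log.add("new: " + message);   // handled by the new handler
            } else if (previous != null) {
                previous.warning(message);     // delegated to the old handler
            }
        });

        parser.report("deprecated attribute");
        parser.report("unknown entity");
        return log;
    }
}
```

Here run() yields the entries "new: deprecated attribute" and "old: unknown entity", in that order. The null check on previous is what the "return null" choice costs the application; a parser that always returns a default handler would make that check unnecessary.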


I look forward to your comments and suggestions.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From tyler at infinet.com  Sun Feb  1 21:42:32 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
References: <199802012028.PAA00747@unready.microstar.com>
Message-ID: <34D4EE00.A1FCECF5@infinet.com>

> Here are the change requests in detail, with my initial response at
> the end of each one:
>
> 1) Allow SAX to work with an input stream as well as a URI.
>
>    - Paul Pazandak <pazandak@OBJS.com>
>    - Peter Murray-Rust <peter@ursus.demon.co.uk>
>    - Don Park <donpark@quake.net>
>
>    Currently, the Parser interface provides only the following method
>    to initiate a parse:
>
>      void parse (String publicId, String systemId)
>        throws java.lang.Exception;
>
>    Following this suggestion, there would be a new method
>
>      void parse (String publicId, String systemId, InputStream input)
>        throws java.lang.Exception;
>
>    (It is still necessary to provide a system identifier for resolving
>    relative URIs within the stream).  Note that the stream would be a
>    byte stream, not a character stream -- characters might require
>    more than one octet, depending on the encoding in use.

Well, what if the XML data is streamed from a database where a URL does not
matter so much?  If you look at what Oracle, Sybase, and Microsoft among others
are planning on doing with XML, then supporting this with SAX in the most
ubiquitous way will be very much necessary.  I think that if you want SAX to
have any CORBA support or other language support down the line, it would be best
to avoid any polymorphism (overloaded methods) in the API, because in CORBA, for
example, you cannot overload operations in IDL (methods in Java).
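
One way to follow this advice, sketched here under assumptions, is to give each entry point its own name instead of overloading parse. The names parseUri and parseStream and both types below are hypothetical and do not come from any SAX draft; the stub only records which operation was called.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: one distinct operation name per entry point,
// so a mapping to IDL would not need overloaded operations.
interface IdlFriendlyParserSketch {
    void parseUri(String publicId, String systemId) throws Exception;

    void parseStream(String publicId, String systemId, InputStream input) throws Exception;
}

// Stub that records the calls it receives; it parses nothing.
final class RecordingParserSketch implements IdlFriendlyParserSketch {
    final List<String> calls = new ArrayList<>();

    @Override
    public void parseUri(String publicId, String systemId) {
        calls.add("parseUri " + systemId);
    }

    @Override
    public void parseStream(String publicId, String systemId, InputStream input) {
        calls.add("parseStream " + systemId);
    }
}
```

The cost of this design is a slightly less idiomatic Java API; the benefit is that each operation maps one-to-one onto a name in languages or interface definitions that do not allow overloading.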

>    I can see the convenience of this method, and I plan to add
>    something like this to AElfred when I have a chance.  For SAX,
>    however -- which is meant to end up as a language- and
>    system-independent API -- I am reluctant to hardcode assumptions
>    about storage (and I don't know enough about IDL to know if there
>    is a general representation for streams).  Paul Pazandak has also
>    suggested allowing strings and buffers -- in this case, they would
>    already be decoded into characters.

Another idea (as far as implementation goes) is to have the parser simply be an
extension of java.io.FilterInputStream which takes one or more Handler
interfaces as arguments (to delegate to), so that you can handle very large
streams of data.  In addition to overriding the necessary
java.io.FilterInputStream methods, you can also have methods like readDocument(),
readElement(), etc.  This would give people a lot more control over reading in
XML.  This approach of course is similar to how URL Content in the java.net
package handles content.  But where I see this approach being most useful is in
transactions where you might only want to read in a limited amount of data
and process only that, or in the case where XML content is always of
a fixed length (like in databases where you get null padding for string fields
which do not take up the assigned length).  With the current SAX implementation,
you have no real control at the IO level, where it would help to skip content if
the application feels it is necessary.
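
A very rough sketch of the pull-style idea follows. It is an assumption-laden illustration, not a parser: the class TagReadingStream, the callback TagHandlerSketch, and the method readTag are hypothetical names, the input is assumed to be ASCII, and the code only finds "<...>" runs without checking well-formedness. What it does show is the control Tyler describes: the application decides how far to read.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback; not part of SAX.
interface TagHandlerSketch {
    void tag(String rawTag);
}

// Illustrative only: locates "<...>" runs, does not parse XML properly.
final class TagReadingStream extends FilterInputStream {
    private final TagHandlerSketch handler;

    TagReadingStream(InputStream in, TagHandlerSketch handler) {
        super(in);
        this.handler = handler;
    }

    /** Reads through the end of the next tag and reports it; false at end of stream. */
    boolean readTag() throws IOException {
        int b;
        while ((b = read()) != -1 && b != '<') {
            // skip character data between tags
        }
        if (b == -1) {
            return false;
        }
        StringBuilder tag = new StringBuilder("<");
        while ((b = read()) != -1) {
            tag.append((char) b); // ASCII-only assumption
            if (b == '>') {
                handler.tag(tag.toString());
                return true;
            }
        }
        return false; // stream ended inside a tag
    }

    /** Reads at most limit tags and leaves the rest of the stream unread. */
    static List<String> firstTags(String asciiXml, int limit) {
        List<String> tags = new ArrayList<>();
        InputStream source =
            new ByteArrayInputStream(asciiXml.getBytes(StandardCharsets.US_ASCII));
        try (TagReadingStream stream = new TagReadingStream(source, tags::add)) {
            while (tags.size() < limit && stream.readTag()) {
                // keep pulling until the limit is reached or the stream ends
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tags;
    }
}
```

With the input "<a>text</a><b/>" and a limit of 2, firstTags returns "<a>" and "</a>" and never reads the trailing "<b/>".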

>    Personally, I'm undecided, and would be interested in hearing the
>    theoretical arguments for and against this suggestion.
>
> 2) Simplify handler chaining by adding get* methods for existing
>    handlers.
>
>    - Don Park <donpark@quake.net>
>
>    Currently the Parser interface provides only setters for the
>    various handlers:
>
>      public void setEntityHandler (EntityHandler handler);
>      public void setDocumentHandler (DocumentHandler handler);
>      public void setErrorHandler (ErrorHandler handler);
>
>    Following this suggestions, there would also be accessors:
>
>      public EntityHandler getEntityHandler ();
>      public DocumentHandler getDocumentHandler ();
>      public ErrorHandler getErrorHandler ();
>
>    An application could then retrieve the existing handler and
>    implement a new one which invokes the old one under certain
>    circumstances.

I am not sure exactly what the use of these get methods is, because all the
handlers are useful for is delegation anyway.  The only reason the get methods
would be useful is for casting the returned object to some other form.  Why anyone
would need to do this is beyond me, as recasting this object back to something
would be a sloppy implementation in the first place.

>    This seems like a generally good idea (as will as a simple and
>    backwards-compatible change), and I am willing to implement it.
>    The only complication is that we'll have to define the default
>    state -- is the parser always required to return a default handler
>    if the user has not explicitly set one, or should it return null?

The default handler could just be something which spits stuff out to stdout or
some other OutputStream in a manner similar to how Aelfred's EventDemo does.

Tyler


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From tyler at infinet.com  Sun Feb  1 22:36:19 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
References: <199802012028.PAA00747@unready.microstar.com> <34D4EE00.A1FCECF5@infinet.com>
Message-ID: <34D4FA9E.DFB80BAA@infinet.com>

Tyler Baker wrote:

> >    I can see the convenience of this method, and I plan to add
> >    something like this to AElfred when I have a chance.  For SAX,
> >    however -- which is meant to end up as a language- and
> >    system-independent API -- I am reluctant to hardcode assumptions
> >    about storage (and I don't know enough about IDL to know if there
> >    is a general representation for streams).  Paul Pazandak has also
> >    suggested allowing strings and buffers -- in this case, they would
> >    already be decoded into characters.
>
> Another idea (as far as implementation goes) is to have the parser simply be an
> extension of java.io.FilterInputStream which takes an one or more Handler
> interfaces as arguments (to delegate to), so that you can handle very large
> streams of data.  In addition to overriding the necessary
> java.io.FilterInputStream methods, you can also have methods like readDocument(),
> readElement(), etc.  This would give people a lot more control over reading in
> XML.  This approach of course is similiar to how URL Content in the java.net
> package handles content.  But where I see this approach being most useful is in
> transactions where you might only want to read in a limited amount of data
> anyways and process only that or else in the case where XML content is always at
> a fixed length (like in databases where you get null padding for string fields
> which do not take up the assigned length).  With the current SAX implementation,
> you have no real control at the IO level where it would help to skip content if
> the application feels it is necessary.

One last thing I wanted to add: if the Parser were an extension of
java.io.FilterInputStream or java.io.InputStream, you could simply take a
compressed XML file and unpack it all in one line of code.

For example, you could create it all like this:

XMLInputStream xis = new XMLInputStream(new CompressedInputStream(in), handler);

where in is any input stream (file, URL, etc.) and handler is one or more
handlers.
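
As a rough illustration of the stream decoration itself, the sketch below uses
java.util.zip.GZIPInputStream as the decompressing layer.  That choice, and the
class and method names CompressedXmlSketch, gzip and readAll, are assumptions
for the example only; XMLInputStream is left out because it does not exist yet.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedXmlSketch {
    // Compresses text in memory so the example needs no external file.
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The decoration is the nested constructor call: whatever reads from `in`
    // sees decompressed, decoded characters and never the raw bytes.
    static String readAll(byte[] compressed) {
        try (Reader in = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A parser written as a stream filter would simply sit as one more layer on the
outside of that nesting.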

This, I feel, is much more flexible, since SAX currently will only accept content
which comes from a resolved URL, and since, if you are going to have an InputStream
argument, you will need control over how it is handled.  In addition, you
might want to be able to register the handler right before actually handling the
content.  For example, if you get a systemID or publicID of some type (this would
currently occur with a doctype event in SAX), you would then want to register a
particular document handler for that type (which could be done nicely with a dynamic
class loading mechanism).  In this case, you might have a static method in the
XMLInputStream class which acts as a registry for handlers of various document types;
it could be something no more complex than a hashtable of class names which are
indexed by systemID or publicID.  You could have this registry just be for documents,
or it could be more complex, with a federated namespace of handlers for
elements.

Personally I would much rather write code that looks like this:

// Done when I initialize the program
java.util.Properties handlers = new java.util.Properties();
try {
  handlers.load(new FileInputStream("foo.txt"));
} catch (IOException e) {
  e.printStackTrace();
}
XMLInputStream.registerHandlers(handlers);

// Then later do this
URL fooURL = new URL("http://www.foo.com/bar.xml");
XMLInputStream xis = new XMLInputStream(fooURL.openStream());

Or if you don't use any registry for document handlers, you could simply do something
like this

DocumentHandler bdh = new BarDocumentHandler();
// Assumes bar.xml is a document type "bdh" can handle
URL fooURL = new URL("http://www.foo.com/bar.xml");
XMLInputStream xis = new XMLInputStream(fooURL.openStream(), bdh);

Once you have the "xis" reference, then just call methods like "readDocument(Document
document)" which would read the document data into a Document object (Document would
be an interface).

Document document = new MSWord90Document();
try {
  xis.readDocument(document);
} catch (IOException e) {
  e.printStackTrace();
}

Personally I prefer the registry idea, so that the application would know ahead of
time what to do for any XML file (handle it or else do some default handling).
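
A minimal sketch of what such a registry could look like, assuming a plain map
of class names keyed by public or system identifier and dynamic loading via
Class.forName.  DocHandler, BarDocHandler, DefaultDocHandler and
HandlerRegistry are hypothetical names for illustration, not SAX interfaces.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a SAX-style document handler.
interface DocHandler {
    String describe();
}

class BarDocHandler implements DocHandler {
    public String describe() {
        return "bar";
    }
}

// Used when no handler is registered for a document type.
class DefaultDocHandler implements DocHandler {
    public String describe() {
        return "default";
    }
}

class HandlerRegistry {
    // Class names indexed by publicID or systemID.
    private final Map<String, String> classNamesById = new HashMap<String, String>();

    void register(String publicOrSystemId, String className) {
        classNamesById.put(publicOrSystemId, className);
    }

    // Loads the registered handler class dynamically, or falls back to the default.
    DocHandler lookup(String publicOrSystemId) {
        String className = classNamesById.get(publicOrSystemId);
        if (className == null) {
            return new DefaultDocHandler();
        }
        try {
            return (DocHandler) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load handler " + className, e);
        }
    }
}
```

The if/else chain at the application level then collapses into one lookup per
document type, and new types can be added through configuration alone.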

Just some ideas before v1.0 of SAX is set in stone...

Tyler




From mrc at allette.com.au  Sun Feb  1 22:43:42 1998
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:00:03 2004
Subject: First experiences with XSL
References: <2.2.32.19980130155416.0085e27c@pop> <34D3C820.2671@hiwaay.net>
Message-ID: <34D4FA79.4BF7FA1F@allette.com.au>

len bullard wrote:

> Can anyone show that XSL (if indeed, a Turing complete language) is any easier
> than Java?  XSL is a programming language and there are far more mortals
> (programmers in some cases) who understand and can easily use Java than
> XSL/DSSSL.

I live in hope of the day when I finally see a file come out of a word processor
as XML, preceded by a DTD and an XSL style sheet. Rather than just regard XSL as
a programming language, I would like to see it used as a common application
formatting syntax, as was tried with RTF. Assuming the users are going to do
pretty much whatever they want to as far as tagging is concerned (either for
legacy data or ongoing), conversion from one DTD to another will always be far
easier than conversion from an unstructured document to a structured one. This is
particularly true when you consider in current conversions how much structure is
implied from formatting characteristics (although this would presumably be
substantially diminished with more structured documents). From the perspective of
conversion of data (perhaps from a somewhat sloppy creation model to a more
concise storage model), a parseable, reasonably regular stylesheet would seem to
have advantages over Java.

Also, it may ultimately be desirable to produce an XSL document from some source,
interface or language that suits your individual needs better, thus XSL again
behaves as an interchange format. I think this fits well with the spirit of
XML/SGML.

> So, one might retreat to the defense of "But it is a standard" and there one
> would have a point.

There are other reasons, but the one you give above is also difficult to go past
:-)


--
Regards

Marcus Carr                  email:  mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia)  email:  info@allette.com.au
Level 10, 91 York Street     www:    http://www.allette.com.au
Sydney 2000 NSW Australia    phone:  +61 2 9262 4777
                             fax:    +61 2 9262 4774
_______________________________________________________________




From donpark at quake.net  Sun Feb  1 22:55:04 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:03 2004
Subject: Parser Interface -- Summary of Change Requests
Message-ID: <003b01bd2f63$c5b6c800$2ee044c6@donpark>

David,

>1) Allow SAX to work with an input stream as well as a URI.
...
>     void parse (String publicId, String systemId, InputStream input)
>       throws java.lang.Exception;
...

My suggestion would be to add the following two methods to the EntityHandler
interface:

    public InputStream getEntityByteStream (String systemID)
      throws Exception;

    public InputStream getEntityCharStream (String systemID)
      throws Exception;

The parser implementation should invoke getEntityCharStream first to see if
there is decoded data available.  If not, it should invoke
getEntityByteStream to get the raw data.

If both methods return null, then the default URL-based code is used.
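
The lookup order described above might be sketched as follows.  EntitySource,
FixedSource and EntityResolutionSketch are hypothetical names, and the
character case uses java.io.Reader here, which is an assumption of the sketch
rather than part of the proposal.

```java
import java.io.InputStream;
import java.io.Reader;

// Hypothetical interface modelled on the two proposed methods.
interface EntitySource {
    Reader getEntityCharStream(String systemID) throws Exception;
    InputStream getEntityByteStream(String systemID) throws Exception;
}

// Test double that hands back whatever it was constructed with (either may be null).
class FixedSource implements EntitySource {
    private final Reader chars;
    private final InputStream bytes;

    FixedSource(Reader chars, InputStream bytes) {
        this.chars = chars;
        this.bytes = bytes;
    }

    public Reader getEntityCharStream(String systemID) {
        return chars;
    }

    public InputStream getEntityByteStream(String systemID) {
        return bytes;
    }
}

class EntityResolutionSketch {
    // Reports which input the parser would use for the given entity.
    static String choose(EntitySource source, String systemID) {
        try {
            if (source.getEntityCharStream(systemID) != null) {
                return "chars"; // already decoded, no encoding detection needed
            }
            if (source.getEntityByteStream(systemID) != null) {
                return "bytes"; // raw data, the parser must decode it
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return "url"; // neither supplied: fall back to fetching the system ID
    }
}
```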

>2) Simplify handler chaining by adding get* methods for existing
>   handlers.
...
>   This seems like a generally good idea (as will as a simple and
>   backwards-compatible change), and I am willing to implement it.
>   The only complication is that we'll have to define the default
>   state -- is the parser always required to return a default handler
>   if the user has not explicitly set one, or should it return null?


It would be up to the SAX implementation.  It might provide a default
implementation depending on configuration.  For example, FooSaxDriver might
have a setInputType() method which would install a default EntityHandler for
fetching XML documents from a database.

BTW, You left out my other suggestion which was

>>>>>>>>>>>>>>>>>>>>>>>>
In addition, I would like to have the following two methods added to the Parser
API for driver-specific operations:

    public Object getDriverProperty(String name);
    public Object setDriverProperty(String name, Object value);

Property names should be prefixed with some unique value to avoid confusing
other drivers.  Note that the above methods can be invoked without knowing which
driver is actually being used.  For example:

    parser.setDriverProperty("SuperDriver.lowercaseElements", Boolean.TRUE);
    parser.setDriverProperty("HungryDriver.cacheSize", new Integer(100000));
<<<<<<<<<<<<<<<<<<<<<<<<

The above two methods allow driver-specific code without actually having to
import anything.
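
On the driver side, a minimal sketch could look like this.  It assumes that
setDriverProperty returns the previous value and that a driver silently
ignores names which do not carry its own prefix; neither is specified above,
and PropertyParserSketch is a made-up class name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical driver that only understands "SuperDriver."-prefixed properties.
class PropertyParserSketch {
    private static final String PREFIX = "SuperDriver.";
    private final Map<String, Object> properties = new HashMap<String, Object>();

    public Object getDriverProperty(String name) {
        return properties.get(name);
    }

    // Assumed semantics: returns the previous value (or null), and ignores
    // properties addressed to other drivers instead of failing.
    public Object setDriverProperty(String name, Object value) {
        if (!name.startsWith(PREFIX)) {
            return null;
        }
        return properties.put(name, value);
    }
}
```

Because the names are plain strings, the application can set properties for
several drivers in a row and only the driver actually in use reacts.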

Regards,

Don Park




From donpark at quake.net  Sun Feb  1 23:11:06 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
Message-ID: <006601bd2f66$08ce9d50$2ee044c6@donpark>

>Not sure exactly what the use of these get methods is for cause all the handlers
>are useful is delegation anyways.  The only reason the get methods would be
>useful is for casting the returned object to some other form.  Why anyone would
>need to do this is beyond me as recasting this object back to something would be
>sloppy implementation in the first place.


The get methods make chaining delegations possible, and also allow the
drivers to provide more functional default handlers without worrying about
having them blasted out of the water just because the application wants to
override the handler.  It is beyond me why anyone would cast the
returned object to some other form, whether such practice is sloppy or not.
Please enlighten me.
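
A minimal sketch of the chaining in question, with no casting involved.
WarningHandler, ParserSketch and ChainedHandler are cut-down hypothetical
types, not the real SAX interfaces, and the silent non-null default handler is
an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, cut-down handler interface; not the real SAX ErrorHandler.
interface WarningHandler {
    void warning(String message);
}

// Hypothetical parser exposing both the setter and the proposed getter.
class ParserSketch {
    // Assumed default: a silent no-op handler rather than null.
    private WarningHandler handler = new WarningHandler() {
        public void warning(String message) { }
    };

    public void setWarningHandler(WarningHandler handler) {
        this.handler = handler;
    }

    public WarningHandler getWarningHandler() {
        return handler;
    }

    // Stands in for the parser reporting an event during a parse.
    void fireWarning(String message) {
        handler.warning(message);
    }
}

// Records each warning, then passes it on to the handler it replaced.
class ChainedHandler implements WarningHandler {
    final List<String> seen = new ArrayList<String>();
    private final String tag;
    private final WarningHandler next;

    ChainedHandler(String tag, WarningHandler next) {
        this.tag = tag;
        this.next = next;
    }

    public void warning(String message) {
        seen.add(tag + ":" + message);
        if (next != null) {
            next.warning(message); // delegate to the previously installed handler
        }
    }
}
```

The application installs itself with
new ChainedHandler("app", parser.getWarningHandler()) followed by
setWarningHandler, and whatever the driver had installed keeps receiving events.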

>The default handler could just be something which spits stuff out to stdout or
>some other OutputStream in a manner similiar to how Aelfred's EventDemo does.

I don't think customers will appreciate having stdout or whatever filling
the screen or disk with SAX event messages.  Internet Explorer with Java logging
enabled would cause a hiccup.

Don Park
http://www.quake.net/~donpark/index.html




From ak117 at freenet.carleton.ca  Mon Feb  2 20:55:07 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
In-Reply-To: <34D4EE00.A1FCECF5@infinet.com>
References: <199802012028.PAA00747@unready.microstar.com>
	<34D4EE00.A1FCECF5@infinet.com>
Message-ID: <199802022050.PAA01517@unready.microstar.com>

Tyler Baker writes:

 [on reading XML from a stream rather than a URI]

 > Well, what if the XML data is streamed from a database where a URL
 > does not matter so much.  If you look at what Oracle, Sybase, and
 > Microsoft among others are planning on doing with XML, then
 > supporting this with SAX in the most ubiquitous way will be very
 > much necessary.  I think that if you want to make SAX have any
 > CORBA support or other language support down the line, it would be
 > best to negate any polymorphism in the API cause in CORBA for
 > example, you cannot redefine operations in IDL (methods in Java).

This is a good point, but there are complications.  Do these vendors
plan to use character streams or byte streams?

 > Another idea (as far as implementation goes) is to have the parser
 > simply be an extension of java.io.FilterInputStream which takes an
 > one or more Handler interfaces as arguments (to delegate to), so
 > that you can handle very large streams of data.

This sounds like an interesting idea for a parser implementation, but
since SAX is meant to work with many parsers in many languages, it is
probably too constraining as a general common interface.

 [on get* methods for handlers]

 > Not sure exactly what the use of these get methods is for cause all
 > the handlers are useful is delegation anyways.  The only reason the
 > get methods would be useful is for casting the returned object to
 > some other form.  Why anyone would need to do this is beyond me as
 > recasting this object back to something would be sloppy
 > implementation in the first place.

Delegation itself might be enough justification, though -- we'll have
to wait and see what others suggest.

 > The default handler could just be something which spits stuff out
 > to stdout or some other OutputStream in a manner similiar to how
 > Aelfred's EventDemo does.

It would probably be best for the default handler to produce no output
at all, so that other handlers delegating to it would not end up
creating bloated log files.


All the best, and thanks for the feedback,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Mon Feb  2 21:04:04 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:03 2004
Subject: Parser Interface -- Summary of Change Requests
In-Reply-To: <003b01bd2f63$c5b6c800$2ee044c6@donpark>
References: <003b01bd2f63$c5b6c800$2ee044c6@donpark>
Message-ID: <199802022059.PAA01592@unready.microstar.com>

Don Park writes:

 >     public InputStream
 > getEntityByteStream (String systemID)
 >     throws Exception;
 > 
 >     public InputStream
 > getEntityCharStream (String systemID)
 >     throws Exception;
 >
 > The parser implementation should invoke getEntityCharStream first to see if
 > the there is decoded data available.  If not, it should invoke
 > getEntityByteStream to get the raw data.
 > 
 > If both methods return null, then default URL based code is used.

I like the general idea, though there are implementation problems.
Many languages (including Java 1.0.2) have no concept of a character
stream at all, and in Java 1.1, you would have to use

  public Reader getEntityCharStream (String systemID)
    throws Exception;

 > >   This seems like a generally good idea (as will as a simple and
 > >   backwards-compatible change), and I am willing to implement it.
 > >   The only complication is that we'll have to define the default
 > >   state -- is the parser always required to return a default handler
 > >   if the user has not explicitly set one, or should it return null?
 > 
 > It would be up to the SAX implementation.  It might provide default
 > implementation depending on configuration.  For example, FooSaxDriver might
 > have setInputType() method which would install a default EntityHandler for
 > fetching XML document from a database.

This might make life a little trickier for programmers using SAX --
what do others think?


 > BTW, You left out my other suggestion which was
 > 
 > >>>>>>>>>>>>>>>>>>>>>>>>
 > In addition, I would like to have following two methods added to the Parser
 > API for driver-specific operations:
 > 
 >     public Object getDriverProperty(String name);
 >     public Object setDriverProperty(String name, Object value);
 > 
 > Property names should be prefixed with some unique values to avoid confusing
 > other drivers.  Note that above methods can be invoked without knowing which
 > driver is actually being used.  For example:
 > 
 >     parser.setDriverProperty("SuperDriver.lowercaseElements", Boolean.TRUE);
 >     parser.setDriverProperty("HungryDriver.cacheSize", new Integer(100000));
 > <<<<<<<<<<<<<<<<<<<<<<<<
 > 
 > Above two methods allow driver-specific code without actually having to
 > import anything.

Sorry about the omission.  I'd be interested in hearing other
reactions to this suggestion -- I'm worried that it would result in
SAX implementations that are non-conformant XML processors (as in your
first example), or that are incompatible with each other.  Remember
that SAX defines only a minimum level of compatibility among XML
processors.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From tyler at infinet.com  Mon Feb  2 21:52:23 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
References: <199802012028.PAA00747@unready.microstar.com>
		<34D4EE00.A1FCECF5@infinet.com> <199802022050.PAA01517@unready.microstar.com>
Message-ID: <34D63FDE.2D234CFC@infinet.com>

David Megginson wrote:

> Tyler Baker writes:
>
>  [on reading XML from a stream rather than a URI]
>
>  > Well, what if the XML data is streamed from a database where a URL
>  > does not matter so much.  If you look at what Oracle, Sybase, and
>  > Microsoft among others are planning on doing with XML, then
>  > supporting this with SAX in the most ubiquitous way will be very
>  > much necessary.  I think that if you want to make SAX have any
>  > CORBA support or other language support down the line, it would be
>  > best to negate any polymorphism in the API cause in CORBA for
>  > example, you cannot redefine operations in IDL (methods in Java).
>
> This is a good point, but there are complications.  Do these vendors
> plan to use character streams or byte streams?

In CORBA IDL there is a string and a wstring type.  The wstring type maps to
Unicode in the IDL -> Java mapping.  You could define everything as wstring if
you wish as far as IDL is concerned.

>  > Another idea (as far as implementation goes) is to have the parser
>  > simply be an extension of java.io.FilterInputStream which takes an
>  > one or more Handler interfaces as arguments (to delegate to), so
>  > that you can handle very large streams of data.
>
> This sounds like an interesting idea for a parser implementation, but
> since SAX is meant to work with many parsers in many languages, it is
> probably too constraining as a general common interface.

Yah, I only meant it as far as the implementation goes, but on another note, I
think that the Handler interfaces are by far and away the most important ones.
Really, it would be nice if Aelfred had an XMLInputStream which could be derived
out of Parser, either by having the parser be an implementation of XMLInputStream
itself, or else by assigning a parser stub to XMLInputStream which could be
retrieved by calling Parser.getXMLInputStream().  Parser.parse() would just parse
everything with no control over IO, but with XMLInputStream you could have control
at the IO level.

Furthermore, having a handler registry of SAX Handler interfaces (or just
pointers to where the class implementations live) would be invaluable to the
particular application I am working on now.  I suggested having a static
registerHandler method in XMLInputStream, but you could add this to Parser
instead.  This way you could simply pass in XML data and the parser would look up
the appropriate handler implementation for that doctype and load it dynamically.
Otherwise, this needs to be done manually and can really bloat your code at the
application level, since you will essentially have to write a large number of
if/else statements and register the appropriate handlers manually.  If this were
implemented in Aelfred or any other parser, it would remove a huge
burden from the application developers utilizing XML, IMHO.

>  [on get* methods for handlers]
>
>  > Not sure exactly what the use of these get methods is for cause all
>  > the handlers are useful is delegation anyways.  The only reason the
>  > get methods would be useful is for casting the returned object to
>  > some other form.  Why anyone would need to do this is beyond me as
>  > recasting this object back to something would be sloppy
>  > implementation in the first place.
>
> Delegation itself might be enough justification, though -- we'll have
> to wait and see what others suggest.

I think it would be better to have an addDocumentHandler() instead of
setDocumentHandler() if you wish to do delegation.  This is an
Observer/Observable pattern that would work quite nicely.  You could have
multiple objects register interest in the parsing of the XML data and have the
events delivered to them appropriately.  You might even make all of this beans
compliant if you really want to.
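
A minimal sketch of the add-style registration, assuming a cut-down listener
with a single event.  ElementListener and MulticastParserSketch are
hypothetical names, and fireStartElement stands in for whatever the parser
does when it actually encounters a start tag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-event version of a document handler.
interface ElementListener {
    void startElement(String name);
}

class MulticastParserSketch {
    private final List<ElementListener> listeners = new ArrayList<ElementListener>();

    // add* instead of set*: registering a listener never displaces another one.
    public void addDocumentHandler(ElementListener listener) {
        listeners.add(listener);
    }

    public void removeDocumentHandler(ElementListener listener) {
        listeners.remove(listener);
    }

    // Delivers the event to every registered listener, in registration order.
    void fireStartElement(String name) {
        for (ElementListener listener : listeners) {
            listener.startElement(name);
        }
    }
}
```

The add/remove pair is also the naming pattern that bean event sources use,
which is where the beans compliance would come in.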

>  > The default handler could just be something which spits stuff out
>  > to stdout or some other OutputStream in a manner similiar to how
>  > Aelfred's EventDemo does.
>
> It would probably be best for the default handler to produce no output
> at all, so that other handlers delegating to it would not end up
> creating bloated log files.

Yah, I kinda overlooked this.  I just thought it would be nice for debugging.  My
stupid (-:

Tyler




From papresco at technologist.com  Mon Feb  2 22:23:17 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:03 2004
Subject: First experiences with XSL
In-Reply-To: <01bd2d90$7dc5d6a0$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <Pine.SUN.3.91.980202172056.12903J-100000@itrc.uwaterloo.ca>

On Fri, 30 Jan 1998, Michael Kay wrote:

> I've downloaded MSXSL and used it to generate HTML for a couple of document
> types, successfully but with a certain amount of frustration caused by (a)
> lack of diagnostics when I got things wrong, and (b) limited functionality.
> 
> I've now implemented the same thing without XSL: I wrote an MSXML
> application in Java that does a recursive walk down the document tree and
> calls a registered "handler" class to process each element type. 

Yes, you can implement something XSLish without XSL. The point of XSL is 
that it is to be a standard: there will be multiple, interoperable 
browser and word processor implementations as well as dedicated XSL 
development tools and so forth.

 Paul Prescod




From oshima at osa.sci.jri.co.jp  Tue Feb  3 05:30:09 1998
From: oshima at osa.sci.jri.co.jp (Tetsuya OSHIMA)
Date: Mon Jun  7 17:00:03 2004
Subject: No subject
Message-ID: <9802030238.AA13691@t111ws06>

# bye



From M.H.Kay at eng.icl.co.uk  Tue Feb  3 10:58:02 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
Message-ID: <01bd3092$b63188e0$1e09e391@mhklaptop.bra01.icl.co.uk>


>Tyler Baker writes:
>
> [on reading XML from a stream rather than a URI]
>
> > Well, what if the XML data is streamed from a database where a URL
> > does not matter so much...

This suggests an analogy with CGI. A URL is not the name of a document, it
is a request for a stream of data, and what we need is a style of URL (or
extended URL) that allows the application to say "please send your requests
for data to me and I will supply a stream in response".

>This is a good point, but there are complications.  Do these vendors
>plan to use character streams or byte streams?
>
I don't know the Java technicalities, but surely what we mean by a stream
here is something that supplies a sequence of Unicode characters. (Surely it's
not the parser's job to turn bytes into characters?)

We should also ensure that the design makes certain special cases easy for
the application writer, e.g.:

a) the primary input source is a file in filestore. (Translating the
filename to a URL is error-prone and it would be better for the parser to do
it)

b) there is only one input source (e.g. a record containing XML read from a
database, with no DTD or other external entities), probably available
already in the application as the contents of a String.

regards, Mike Kay




From ak117 at freenet.carleton.ca  Tue Feb  3 12:00:17 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:03 2004
Subject: SAX: Parser Interface -- Summary of Change Requests
In-Reply-To: <01bd3092$b63188e0$1e09e391@mhklaptop.bra01.icl.co.uk>
References: <01bd3092$b63188e0$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <199802031155.GAA00333@unready.microstar.com>

Michael Kay writes:

 > I don't know the Java technicalities, but surely what we mean by a stream
 > here is something that supplies a sequence of Unicode characters. (Surely it's
 > not the parser's job to turn bytes into characters?)

That depends on the type of stream.  I would not want to force the
client to do encoding conversion for a stream that happened to be open
to a local file or an HTTP connection.

 > We should also ensure that the design makes certain special cases easy for
 > the application writer, e.g.:
 > 
 > a) the primary input source is a file in filestore. (Translating the
 > filename to a URL is error-prone and it would be better for the parser to do
 > it)
 > 
 > b) there is only one input source (e.g. a record containing XML read from a
 > database, with no DTD or other external entities), probably available
 > already in the application as the contents of a String.

It should be possible to read from a string, but it would not be safe
to assume that the string contains no DTD or external entities -- it
would always be necessary to supply a base URI as well.
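
As a rough sketch of the string-plus-base-URI case, the code below uses the
org.xml.sax.InputSource and javax.xml.parsers classes as shipped with current
JDKs.  That is an assumption made for the example and not a statement about
what the SAX draft under discussion will contain; StringSourceSketch and
rootName are made-up names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class StringSourceSketch {
    // Parses XML held in a String and returns the name of the root element.
    static String rootName(String xml, String baseUri) {
        final StringBuilder root = new StringBuilder();
        InputSource source = new InputSource(new StringReader(xml));
        // The base URI is what relative system identifiers (DTD, external
        // entities) would be resolved against, even though the text itself
        // comes from memory.
        source.setSystemId(baseUri);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if (root.length() == 0) {
                        root.append(qName); // remember only the first element seen
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return root.toString();
    }
}
```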


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From thyde-smith at derwent.co.uk  Tue Feb  3 12:19:04 1998
From: thyde-smith at derwent.co.uk (thyde-smith@derwent.co.uk)
Date: Mon Jun  7 17:00:03 2004
Subject: UNSUBSCRIBE
Message-ID: <00010D05.1271@derwent.co.uk>

     
     unsubscribe



From nav at metratech.com  Tue Feb  3 16:02:32 1998
From: nav at metratech.com (Navdip Bhachech)
Date: Mon Jun  7 17:00:03 2004
Subject: recommendations on currently available streaming XML toolkits?
Message-ID: <01BD3093.3C881940.nav@metratech.com>

there have been a few discussions on streaming issues in this list 
lately, so I thought I'd ask:
What are the recommended toolkits (currently available) that allow 
streaming XML, instead of a file based approach?

Nav
______________________________________________________________
Navdip Bhachech
MetraTech Corp
www.MetraTech.com
nav@metratech.com




From ht at cogsci.ed.ac.uk  Tue Feb  3 16:51:48 1998
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:00:04 2004
Subject: recommendations on currently available streaming XML toolkits?
In-Reply-To: Navdip Bhachech's message of Tue, 3 Feb 1998 11:02:35 -0500
References: <01BD3093.3C881940.nav@metratech.com>
Message-ID: <f5bpvl46392.fsf@cogsci.ed.ac.uk>

Our XML tools are designed for streaming, and are happy with multi-10M
documents:

  http://www.ltg.ed.ac.uk/software/xml/
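
For readers unsure what "streaming" means here, a minimal sketch in Python
(not the API of the tools at the URL above, which is not shown in this
thread). It assumes the default xml.sax parser supports incremental feeding;
the function names and the generated document are made up for illustration.
The document is handed to the parser a piece at a time and never has to
exist as a whole file or string.

```python
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Counts <item> elements as their start tags arrive."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_items(chunks):
    """Feed the parser chunk by chunk and return the number of items seen."""
    parser = xml.sax.make_parser()
    handler = ItemCounter()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)   # chunks may end in the middle of a tag
    parser.close()           # signals end of input
    return handler.count


def generate_chunks(n_items):
    """Produce a large document lazily, one fragment at a time."""
    yield "<list>"
    for i in range(n_items):
        yield "<item>%d</item>" % i
    yield "</list>"


total = count_items(generate_chunks(1000))
```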

ht
-- 
Henry S. Thompson, Human Communication Research Centre, University of Edinburgh
      2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
               Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk  
                      URL: http://www.cogsci.ed.ac.uk/~ht/



From ricko at allette.com.au  Tue Feb  3 22:29:04 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:00:04 2004
Subject: Ideas about Cutting and Pasting in XML 
Message-ID: <199802032238.JAA18643@jawa.chilli.net.au>

Developers with an idle moment may be interested in a paper I've
just put up, "A Cut and Paste Infrastructure for XML":

	http://www.chilli.net.au/~ricko/XML-cut-n-paste.htm

It gives a direction I suggest XML needs to be developed towards,
in order to support arbitrary cutting and pasting between XML
documents.

This now has some comments about RDF (and XML-data) which may 
be of interest too.

Comments welcome.


Rick Jelliffe



From bsteele at tdiinc.com  Tue Feb  3 22:36:23 1998
From: bsteele at tdiinc.com (Bob Steele)
Date: Mon Jun  7 17:00:04 2004
Subject: XML-Data: A naive question
Message-ID: <34D79CFD.4F0469F@tdiinc.com>

RDF documentation (Resource Description Framework (RDF) Model and
Syntax) states:

"RDF uses the Extensible Markup Language (XML) encoding as its syntax.
However, RDF will not require (and conforming implementations must not
require) an XML Document Type Declaration for the contents of
assertions. In this respect RDF requires at most the XML well-formedness
constraints. RDF schemas may -- but are not required to -- be XML DTDs."

Isn't this true of XML-Data?  I can't seem to find it expressly stated.

Thanks,

bob

--
<!--
Bob Steele, TDI Inc.
5000 Old Ironsides Drive
Santa Clara, CA 95054, USA
bsteele@tdiinc.com  http://www.tdiinc.com
Tel: 1-408-330-3404 , or toll free 1-888-544-5511
Fax: 1-408-330-3444
-->




From donpark at quake.net  Tue Feb  3 23:07:45 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:04 2004
Subject: XML Conformance and DTD support in SAX
Message-ID: <003e01bd30f7$e0e6a830$2ee044c6@donpark>

1. XML Conformance

I am not sure if I am going off on a tangent but I think some form of markup
to indicate XML conformance would be really nice so that XML clients and
servers can decide whether to validate or not.

<?xml:conformance what="valid" when="02/03/98:24:12:00"
who="com.finicky.Validator"?>

2. It would be nice to have SAX provide more DTD information.

We could either have a separate DocumentTypeHandler or fire XML parsing
events for DTD as if it was an XML document being parsed.  Anyway, without
better support for DTD, DOM cannot be supported fully by SAX.  Perhaps we need
SAXDTD API to augment SAX?

No lines drawn, just digging some sand with my toes,

Don Park
http://www.quake.net/~donpark/index.html




From ak117 at freenet.carleton.ca  Wed Feb  4 00:35:18 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:04 2004
Subject: XML Conformance and DTD support in SAX
In-Reply-To: <003e01bd30f7$e0e6a830$2ee044c6@donpark>
References: <003e01bd30f7$e0e6a830$2ee044c6@donpark>
Message-ID: <199802040029.TAA00528@unready.microstar.com>

Don Park writes:

 > 2. It would be nice to have SAX provide more DTD information.
 > 
 > We could either have a separate DocumentTypeHandler or fire XML parsing
 > events for DTD as if it was an XML document being parsed.  Anyway, without
 > better support for DTD, DOM cannot be supported fully by SAX.  Perhaps we need
 > SAXDTD API to augment SAX?

I think that it is very likely that we will make a SAX level two some
day, which might include a DocumentTypeHandler and/or a DTDHandler
interface.  For now, however, we should probably try to stabilise what
we have -- the current SAX falls mostly within the range of features
already offered by existing parsers.
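
As a rough illustration of what DTD-level callbacks can look like, here is a
minimal sketch using the DTDHandler interface found in Python's xml.sax
binding. That binding is an assumption brought in for the example and is not
the interface being drafted in this thread; the class name DeclarationLog
and the sample document are hypothetical. Note that this handler reports
only notation and unparsed-entity declarations, not element or attribute
declarations.

```python
import xml.sax
from xml.sax.handler import DTDHandler


class DeclarationLog(DTDHandler):
    """Collects the declarations the parser reports from the DTD."""

    def __init__(self):
        self.notations = []
        self.unparsed = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.unparsed.append((name, systemId, ndata))


DOC = """<!DOCTYPE doc [
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
<!ELEMENT doc EMPTY>
]>
<doc/>"""

log = DeclarationLog()
parser = xml.sax.make_parser()
parser.setDTDHandler(log)   # DTD events go to the log
parser.feed(DOC)            # assumes the default parser is incremental
parser.close()
```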


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From papresco at technologist.com  Wed Feb  4 08:50:48 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:04 2004
Subject: Namespaces, modules and architectures paper available
Message-ID: <34D82C2E.6C6B3AE7@technologist.com>

http://itrc.uwaterloo.ca/~papresco/sgml/namespaces.html

Why We Need Namespaces (Modules)
An SGML/XML Feature Proposal

Abstract

The World Wide Web Consortium has recently published a note called
Namespaces in XML. Not everyone has access to it yet, but they will
soon. It proposes a simple convention for allowing instances to have
elements whose type names come from many different schemas. According to
that note:

"We envision applications of XML in which a document instance may
contain markup defined in multiple schemas. These schemas may have been
authored independently. One motivation for this is that writing good
schemas is hard, so it is beneficial to reuse parts from existing,
well-designed schemas. Another is the advantage of allowing search
engines or other tools to operate over a range of documents that vary in
many respects but use common names for common element types."

Advocates of ISO architectural forms ("archforms") have noticed that
these requirements are very similar to those for archforms and have
proposed archforms as a solution. They are correct that the basic
underlying problems are related, but the problems are not identical. We
need both archforms and namespaces. The two ideas are actually very
complementary. This note demonstrates why neither architectural forms
nor the current namespace proposal really solves the "namespace problem"
satisfactorily.

Background

I will use the document [1]'A Proposal to Introduce "Module" Structures
Into SGML' as an example of a modules proposal which includes not just a
convention for namespace combination, but a syntax for actually
combining SGML DTD fragments. These fragments are the only standardized
schema for either SGML or XML.

Architectural forms allow a "client document" to declare that certain
elements conform to an element type in a DTD other than the document's
DTD. For instance you could say that a particular element is both a LINK
element in the document's DTD and a HyTime CLINK element in the HyTime
architecture. It is essentially both things at once. You can either
declare a particular element as having an architectural element type (in
addition to its ordinary element type) or you can declare that all of
the elements of a particular type adhere to a particular architectural
element type. For instance you could say that a particular "human"
element conforms to the "animal" architectural element type (if the
human was, for example, a "party animal") or you could say that all
"dog" elements conform to the "animal" architectural element type.

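A minimal sketch of the two ways of declaring an architectural element type,
using Python only as notation. The architecture name "zoo", the attribute of
the same name, and the TYPE_DEFAULTS table (standing in for attribute
defaults that a DTD would supply) are all hypothetical; real architectural
processing is considerably more involved.

```python
import xml.etree.ElementTree as ET

# Stand-in for DTD-supplied defaults: every "dog" element is an "animal".
TYPE_DEFAULTS = {"dog": {"zoo": "animal"}}


def architectural_type(element, architecture):
    """Return the element's type in the given architecture, or None."""
    explicit = element.get(architecture)   # declared on this one element
    if explicit is not None:
        return explicit
    # Otherwise fall back to what is declared for the whole element type.
    return TYPE_DEFAULTS.get(element.tag, {}).get(architecture)


doc = ET.fromstring('<kennel><dog/><human zoo="animal"/><human/></kennel>')
result = [(e.tag, architectural_type(e, "zoo")) for e in doc]
```

The first human is an "animal" only because that particular element says so;
the dog is one because all dogs are.
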
The Rub

A particular element can also conform to multiple architectural element
types. For instance, the aforementioned human could conform to both the
"programmer" and the "party animal" architectural element types (no,
those are not logically exclusive). My claim is that this increased
generality is a powerful feature in many contexts, but makes things way
too complex in the simple case for architectural forms to be the most
basic namespace management facility in XML. SGML and SGML tools are
organized around the idea that each element conforms to one and only one
element type. We have not yet re-thought the SGML processing idea in
terms of the concept of multiple element types. 

For instance, the most common form of SGML processing is validation.
SGML uses DTDs to define constraints on SGML documents. According to the
Japanese proposal, validation could be accomplished more or less like this:

<!DOCTYPE MATH.AND.HYPERLINKS [
<!MODULE MATH SYSTEM "math.module.dtd">
<!MODULE HY SYSTEM "hyperlinks.module.dtd">
<!ELEMENT MATH.AND.HYPERLINKS (#PCDATA|HY::LINK|MATH::FORMULA)>
]>

Imagine that math.module.dtd and hyperlinks.module.dtd are hundreds of
lines long. Imagine also that they both had an element called "SET" (for
"mathematical set" and "link set"). As far as I know, there is no way to
accomplish this namespace merging operation with anything close to the
same ease with architectural forms. Yes, I can do it, by copying
math.module.dtd and hyperlinks.module.dtd into my document type. I can
then manually fix up the namespace clashes like my "SET" element. But it
is this sort of duplication of code that the modules proposal was
explicitly designed to avoid. In fact, that is its reason for existing.
We can see, then, that architectural forms do not solve the problem that
the modules proposal was meant to solve. They do not automatically merge
namespaces.

Let me define some terms to clarify. A namespace is a mapping from names
to objects, such as element type names to element types (explicitly or
implicitly declared). A namespace merge is the construction of a
namespace from two others that preserves all of the elements from the
originals. Architectural forms provide access to multiple namespaces,
but they do not merge namespaces.
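
The definitions can be sketched in a few lines of Python. The dictionaries
and the merge_namespaces helper are hypothetical, and the "MODULE::NAME"
qualification simply mirrors the syntax in the example above; this is an
illustration of the idea, not of any proposed implementation.

```python
def merge_namespaces(**modules):
    """Build one namespace from several, qualifying each name by its module."""
    merged = {}
    for prefix, namespace in modules.items():
        for name, element_type in namespace.items():
            merged["%s::%s" % (prefix, name)] = element_type
    return merged


# Two namespaces that both define an element type called SET.
math = {"FORMULA": "math formula", "SET": "mathematical set"}
hy = {"LINK": "hyperlink", "SET": "link set"}

merged = merge_namespaces(MATH=math, HY=hy)
```

Both SET element types survive the merge, as MATH::SET and HY::SET, without
either module having been edited by hand.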

I suspect that some with a long background in SGML will be a little
baffled trying to understand why someone would want to do this. After
all, combining document types is typically difficult work performed by
experts, tested on teams of users, tweaked to perfection with element
names remapped to fit the terminology of the user community. Mixing and
matching DTD fragments in an ad hoc manner might not seem like a good
idea. But the fact is that we live in a brave new world. End users want
to take control of their own document types in many cases. They want to
mix and match DTD fragments and they are not willing to spend the amount
of effort that we professionals are. Good for them! They will make all
of our lives easier. In fact, when authors say that they want to "get
rid of" DTDs, what they typically mean is that they don't want to be
constrained by someone else's DTD and making their own is too difficult!
If we can make DTD maintenance easier, more people will use them. 

Perhaps it would be possible to update SGML so that validation does not depend
so deeply on each element having a single element type, so that content
models could be expressed that combined elements from different
architectures. If we did that, my complaint might go away. Architectures
might regain some of the validatory simplicity of the modules proposal.
But this would require a much more fundamental change to SGML than the
modules proposal would.

Stylesheets

I will use stylesheets as another example of processing. The three most
interesting stylesheet languages right now are DSSSL, XSL and CSS. Each
of those has as its central organizing construct a rule triggered on an
element type name in a context. DSSSL has a feature that would allow
querying on architecture, but the feature is optional and is not
supported, for instance, by James Clark's Jade. Even where the feature
is available, the architectural form-based version of a stylesheet is
much more complicated than the equivalent based on a "flat" namespace
(such as a stylesheet for traditional SGML or SGML augmented with the
modules proposal). I invite architectural forms advocates to prove me
wrong by providing their stylesheets.

Here is what a module-enhanced DSSSL might look like:

<module target="mathml.dsl">
<module target="hyperlinks.dsl">
(element MATH.AND.HYPERLINKS (process-children))

As you can see, this has just enough lines to include the relevant
stylesheet modules and provide rules for the new elements. What would
the equivalent archform code look like? With DSSSL as it exists, it
would look quite ugly and convoluted. With some enhanced DSSSL it might
look reasonable (just as some enhanced SGML might be able to have
content models that span architectures), but nobody has yet proposed
what such a DSSSL would look like (just as nobody has proposed the
enhanced SGML). I am open to suggestions... 

I do not believe that either the current XSL proposal or CSS would allow
architecture based processing at all. Once again, the idea that every
element has a single element type is a fundamental organizing principle
of these stylesheet languages. It is also an organizing principle of
most SGML editors, DTD editors and formatting and conversion tools I
have used. In fact, almost every SGML tool in the world operates under
that principle. The best tools will give you access to architectural
forms (through their architectural attributes), but they will typically
use the element type name as the major organizing feature of the
stylesheets. Archform centric processing is typically awkward if it is
possible at all. 

The one element, one element type principle is also central to every
course in SGML I have ever taken and any book on it I have ever read.
Even the SGML Handbook says that every element has a particular element
type (a single, particular element type). 

The Argument From Usability

Imagine that you are a typical end user and have used archforms instead
of a namespace merging mechanism to combine DTD fragments. Now imagine
that you know that a particular element type name appears in both DTD
fragments. I think that most people would be very surprised to learn
that the way to associate this element with one or the other DTD is to
add an attribute. Because the generic identifier (the name in the
start-tag) usually establishes the element type, you would probably
expect to change the generic identifier to change the association. But
using architectural forms, you would actually have to add an
attribute that would essentially disassociate the element from one of
the element types: "I may have the same name as that element type, but
it isn't actually one of my element types." I think that this is a nasty
case of making the common, simple case of merging DTD fragments more
complicated in order to make life easier for those of us who have to
solve problems that may actually require the full generality of
architectural forms. Once again, I invite advocates to send me code
samples that demonstrate that this is simpler than I think. 

Who was it that said, "Make the easy things easy and the hard things
possible"? Architectural forms make hard things possible, but when
misapplied to the namespace problem, they make easy things unnecessarily
hard. Let me be clear: architectural forms (or something like them) have
an important role to play in SGML systems. We absolutely need some form
of semantic inheritance mechanism. But they work best when they work in
the environment they were designed for: they are typically used as an
underlying basis of a DTD designed by a professional. The professional
DTD designer renames elements to avoid clashes. That individual is the
real solution to the "namespace problem" in most environments. In
environments where such a person exists, archforms are really, really
useful. They are not useful because they allow you to merge namespaces
(they don't). They are useful because they allow you to combine
semantics from different DTD fragments in powerful ways (but more or
less manually). I think that a modules/namespaces proposal would
acutally be very useful for building architectures from DTD fragments. I
also think that architectural forms would be very useful on the Web. Not
every use of XML on the web will be ad hoc. Some XML applications will
need the robust multi-level validation that architectural forms allow.
Think about e-commerce for example.

But many users will not need or want architectural forms. Most people
just need a simple way to combine fixed DTD fragments so that there are
no name clashes. The Japanese module proposal provides such a mechanism.
Presumably Web-centric DTD-replacement schema languages will provide
mechanisms like this also. If these sorts of things are made much easier
in these schema languages than they are in SGML DTD syntax, people will
just avoid SGML DTD syntax. This would be a big mistake for all
concerned. Let's please just fix SGML through a proposal like the one
submitted by the Japanese in 1996. Some modules proposal should be part
of the SGML revision. This would in no way preclude the wide deployment
of architectural forms as a solution to a different problem.

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco



From papresco at technologist.com  Wed Feb  4 14:36:48 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:04 2004
Subject: Namespaces, modules and architectures paper available
References: <34D82C2E.6C6B3AE7@technologist.com> <34ee5fa9.103270755@mail.alink.net>
Message-ID: <34D87D02.14BA4B9C@technologist.com>

I appreciate the simplicity of this [1]proposal, but want to check that
it is not too simple to get the job done.

How would you pass information into a module with this proposal? For
instance, I might want to include a table model, but might need to
specify the contents of the table's cell elements from the containing
DTD.

Also, it feels "nicer" to me to have the instance structure control
namespace lookup so that when I am in a MATH::FORMULA element, I can use
elements from the MATH module without qualification. This convention
could remove most or all qualification from a document instance and thus
make things simpler for authors. For instance:

<!DOCTYPE ...[
<!ENTITY math SYSTEM "math.mod" MODULE>%math;
]>
...
<INTEGRAL/> <!-- a really important part -->
<MATH> <!-- not ambiguous at this scope -->
<INTEGRAL/> <!-- a mathematical integral -->
</MATH>

I would like it if the containing element would control namescope
choice.
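
A minimal sketch of that lookup rule, in Python for want of a real syntax.
The MODULES and SCOPE_OPENERS tables and the resolve function are
hypothetical, and the "MATH::INTEGRAL" spelling is borrowed from the
qualified-name notation discussed in this thread.

```python
# Names each module declares, and which element brings a module into scope.
MODULES = {"MATH": {"INTEGRAL", "FORMULA"}}
SCOPE_OPENERS = {"MATH": "MATH"}


def resolve(name, ancestors):
    """Qualify an unqualified element name using the nearest enclosing scope.

    `ancestors` lists the open elements, outermost first.
    """
    for ancestor in reversed(ancestors):
        module = SCOPE_OPENERS.get(ancestor)
        if module and name in MODULES[module]:
            return "%s::%s" % (module, name)
    return name  # falls back to the containing DTD's own namespace


outside = resolve("INTEGRAL", ["DOC"])
inside = resolve("INTEGRAL", ["DOC", "MATH"])
```

Outside MATH the name stays the document's own INTEGRAL; inside MATH the
same unqualified name resolves to the module's element type.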

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco

[1] It should appear here soon: 
http://www.lists.ic.ac.uk/archives/xml-dev/9802/index.html




From ak117 at freenet.carleton.ca  Wed Feb  4 15:07:03 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:04 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <34D87D02.14BA4B9C@technologist.com>
References: <34D82C2E.6C6B3AE7@technologist.com>
	<34ee5fa9.103270755@mail.alink.net>
	<34D87D02.14BA4B9C@technologist.com>
Message-ID: <199802041506.KAA00956@unready.microstar.com>

It seems to me that when you want to embed large contiguous structures
from different document types in an XML document, each different
namespace should be its own sub-document, referenced as a binary
entity (or using whatever other mechanisms are available in XML-Link).

Good tools and protocols should make it possible to create, transmit,
and process compound documents as if they were single files.  This
will be necessary anyway for supporting multimedia.

Here are some general guidelines:

* Architectural forms are most suitable for applications where
  multiple inheritance is required, or where elements belonging to a
  different document type are scattered throughout a document.

* Sub-documents are most suitable for applications where all of the
  elements belonging to a different document type are rooted in a
  single subtree. 

"namespace:gi" element type names are unsuitable for several reasons:

1) The complexity of namespaces is exposed to the author rather than
   hidden in the DTD (as it is, optionally, with architectural forms).

2) Multiple inheritance is not possible (X can be a kind of Y or a
   kind of Z, but not both).

3) Standard DTD-based validation is not possible, and it is more
   difficult to create DTD-driven authoring tools.

4) Both architectural forms and sub-documents can be fully supported
   under the existing spec by _both_ validating and non-validating XML
   parsers: no changes necessary.  Furthermore, they will also remain
   compatible with SGML tools.

Why are people worried about writing specs to solve a problem that
already has good, working, available solutions?


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From grk at arlut.utexas.edu  Wed Feb  4 15:59:35 1998
From: grk at arlut.utexas.edu (Glenn R. Kronschnabl)
Date: Mon Jun  7 17:00:04 2004
Subject: FORTRAN namelist input - remember?  Replace with XML!
Message-ID: <199802041559.JAA06936@mail-firewall.arlut.utexas.edu>

I want to use XML as a general input mechanism for scientific programs.  In 
the old days, say in FORTRAN, one used to use namelist input.  In C/C++, one 
usually wrote a custom driver.  I want to use XML because it appears to make 
sense.  I have started using SP - and want to build a tree that I can query 
(kind of like an xrdb interface) for my input parameters.  But, before I 
embark on this, I was wondering if 1) this makes sense, 2) someone surely has 
a simple tree builder/query interface to SP already that I can use so I don't 
have to write my own (none jumped out at me when I looked around).
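
To make the idea concrete, a minimal sketch of such a tree-plus-query
interface, written in Python with its standard ElementTree module rather
than on top of SP (that substitution is an assumption made for brevity). The
element names, the Parameters class and the "a/b" and "a/@attr" path
conventions are all hypothetical.

```python
import xml.etree.ElementTree as ET

INPUT = """<input>
  <grid><nx>128</nx><ny>64</ny></grid>
  <solver tolerance="1e-6"><method>cg</method></solver>
</input>"""


class Parameters:
    """Builds a tree from the input and answers simple path queries."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)

    def get(self, path, default=None, convert=str):
        """Look up 'a/b' (element text) or 'a/@attr' (attribute value)."""
        element_path, _, attribute = path.partition("/@")
        node = self._root.find(element_path)
        if node is None:
            return default
        raw = node.get(attribute) if attribute else node.text
        return default if raw is None else convert(raw.strip())


params = Parameters(INPUT)
nx = params.get("grid/nx", convert=int)
tol = params.get("solver/@tolerance", convert=float)
nz = params.get("grid/nz", default=1, convert=int)   # absent, so the default
```

Defaults for missing parameters play the role that initialised variables
play with namelist input.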

Thanks.

Cheers,
Glenn                                  
--------------------
Glenn R. Kronschnabl
Applied Research Laboratories        | grk@arlut.utexas.edu (PGP/MIME ok)
The University of Texas at Austin    | http://www.arlut.utexas.edu/~grk
PO Box 8029, Austin, TX 78713-8029   | (Ph) 512.835.3642 (FAX) 512.835.3808
10,000 Burnet Road, Austin, TX 78758 | ... but an Aggie at heart!




From crism at ora.com  Wed Feb  4 16:29:11 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun  7 17:00:04 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <199802041506.KAA00956@unready.microstar.com> (message from David
	Megginson on Wed, 4 Feb 1998 10:06:58 -0500)
Message-ID: <199802041632.LAA14809@geode.ora.com>

[David Megginson]
> "namespace:gi" element type names are unsuitable for several reasons:

[...]

> Why are people worried about writing specs to solve a problem that
> already has good, working, available solutions?

The problem (as I see it) is not one of including pieces of existing
documents, nor of structural validation.  The main reason for
namespaces is semantic inheritance.  I want to write a scientific
research paper quickly.  HTML has the overall document structure and
components that I need; MathML has equations; CML has chemical
formulæ.  I should be able to say that I'm using those things,
associate stylesheets, and have my browser know that <html:a> should
be styled with the "a" rule from the HTML stylesheet.
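
A minimal sketch of that lookup, with Python dictionaries standing in for
stylesheets. The rule contents, the "math" entry with a clashing "a" rule,
and the style_for function are hypothetical; no stylesheet language
discussed here is claimed to work this way.

```python
# One rule table per namespace prefix; values are placeholder style strings.
STYLESHEETS = {
    "html": {"a": "underline, blue", "p": "block, 1em margin"},
    "math": {"a": "italic"},   # a clashing local name, for illustration
}


def style_for(qualified_name):
    """Pick the rule for 'prefix:local' from the stylesheet bound to prefix."""
    prefix, sep, local = qualified_name.partition(":")
    if not sep:
        return None   # unqualified names are outside this sketch
    return STYLESHEETS.get(prefix, {}).get(local)


html_a = style_for("html:a")
math_a = style_for("math:a")
```

The prefix selects the stylesheet and the local name selects the rule, so
the two "a" rules never collide.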

It should be *possible* to create a DTD to which such a document
complies, but I am not as interested in automatic validation of a
namespace document.  The interrelational issues are, I think, too
complex to solve; in the example above, I would need to change the
text-containing HTML elements' content models to include chemical and
mathematical markup, and maybe allow HTML markup in MathML theorems.
Pushing selected information into the content models is too ugly.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From ak117 at freenet.carleton.ca  Wed Feb  4 17:34:27 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:04 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <199802041632.LAA14809@geode.ora.com>
References: <199802041506.KAA00956@unready.microstar.com>
	<199802041632.LAA14809@geode.ora.com>
Message-ID: <199802041733.MAA02120@unready.microstar.com>

Chris Maden writes:

 > The problem (as I see it) is not one of including pieces of existing
 > documents, nor of structural validation.  The main reason for
 > namespaces is semantic inheritance.  I want to write a scientific
 > research paper quickly.  HTML has the overall document structure and
 > components that I need; MathML has equations; CML has chemical
 > formulæ.  I should be able to say that I'm using those things,
 > associate stylesheets, and have my browser know that <html:a> should
 > be styled with the "a" rule from the HTML stylesheet.

It seems to me simpler to create a compound document rather than to
try to force everything into a single XML document -- you can
reference another XML document the same way that you can include a
graphic or audio sequence.  Managing a lot of small objects directly
on the file system can be tricky, but it's trivial with proper tool
support (think of OLE under Windows, despite its warts).

 > It should be *possible* to create a DTD to which such a document
 > complies, but I am not as interested in automatic validation of a
 > namespace document.  The interrelational issues are, I think, too
 > complex to solve; in the example above, I would need to change the
 > text-containing HTML elements' content models to include chemical and
 > mathematical markup, and maybe allow HTML markup in MathML theorems.
 > Pushing selected information into the content models is too ugly.

Not at all -- you just need a single element type to hold references
to other XML documents.  You could even (though this is disgusting)
use

  <img src="equation1.xml"/>
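
A minimal sketch of the compound-document approach in Python. The STORE
dictionary stands in for a file system or HTTP fetch, and the document
names, their contents and the load_compound function are hypothetical. Each
sub-document is parsed on its own, so it can have its own document type and
never shares a namespace with its container.

```python
import xml.etree.ElementTree as ET

# Stand-in for external storage; both documents are made up.
STORE = {
    "paper.xml": '<paper><p>See</p><img src="equation1.xml"/></paper>',
    "equation1.xml": "<formula><integral/></formula>",
}


def load_compound(name):
    """Parse a document, then parse each XML sub-document it references."""
    root = ET.fromstring(STORE[name])
    subdocs = {}
    for ref in root.iter("img"):
        src = ref.get("src")
        if src and src.endswith(".xml"):
            subdocs[src] = ET.fromstring(STORE[src])   # parsed independently
    return root, subdocs


root, subdocs = load_compound("paper.xml")
```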


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From dima at paragraph.com  Wed Feb  4 18:01:44 1998
From: dima at paragraph.com (Dmitri Kondratiev)
Date: Mon Jun  7 17:00:04 2004
Subject: [AElfred] Problem: '"' in CDATA attribute
Message-ID: <2.2.32.19980203180236.0095ec44@dream.paragraph.com>

AElfred distribution from 19980112.
Problem: 
com.microstar.xml.XmlProcessor.error() reports an error when parsing an
attribute declared in the DTD as CDATA and containing '"' in its value, such
as "#text".

On the other hand, com.microstar.sax.AElfredDriver from the same 19980112
distribution handles the attribute definition correctly and doesn't report
such an error.
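
For comparison, here is a minimal sketch of the behaviour one would
expect from a conforming parser. It uses Python's standard
xml.etree.ElementTree rather than AElfred, and the element name "item"
and attribute name "label" are invented for illustration. A double quote
is legal in a CDATA attribute value when the value is delimited by
single quotes, or when it is written as &quot;.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "item" and "label" are illustrative names only.
DTD = '<!DOCTYPE item [<!ELEMENT item EMPTY><!ATTLIST item label CDATA #IMPLIED>]>'


def label_of(document: str) -> str:
    """Parse the document and return the value of the label attribute."""
    return ET.fromstring(document).get("label")


# A literal double quote inside a single-quoted attribute value.
single_quoted = DTD + "<item label='say \"hi\"'/>"
# The same value written with the predefined &quot; entity.
escaped = DTD + '<item label="say &quot;hi&quot;"/>'

print(label_of(single_quoted))
print(label_of(escaped))
```

Both forms should yield the same attribute value, so a parser that
rejects either of them is probably at fault rather than the document.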

Dima


---------------------------
dima@paragraph.com
102401.2457@compuserve.com
http://www.geocities.com/SiliconValley/Lakes/3767/
tel: 07-095-464-9241




From papresco at technologist.com  Wed Feb  4 18:05:50 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
References: <34D82C2E.6C6B3AE7@technologist.com>
		<34ee5fa9.103270755@mail.alink.net>
		<34D87D02.14BA4B9C@technologist.com> <199802041506.KAA00956@unready.microstar.com>
Message-ID: <34D8AE13.610ABC07@technologist.com>

David Megginson wrote:
> 
> It seems to me that when you want to embed large contiguous structures
> from different document types in an XML document, each different
> namespace should be its own sub-document, referenced as a binary
> entity (or using whatever other mechanisms are available in XML-Link).
> 
> Good tools and protocols should make it possible to create, transmit,
> and process compound documents as if they were single files.  This
> will be necessary anyway for supporting multimedia.

*MAKE EASY THINGS EASY*

Making my five-line formula into a different document with a different
document type is *not easy*. It is a royal pain in the butt, which is
why almost nobody does it. I have seen the CALS table model merged with
dozens of DTDs and have never once seen someone take the opposite
approach of making CALS tables "subdocuments."

We can imagine a theoretical universe in which the tools are so good
that this is easy, but if we are imaginative in this way, we can paper
over any design flaw in SGML or XML with the claim that "the tools can
handle it." If XML or SGML were designed to be manipulated only through
tools, that would be acceptable. But they were not...they were designed
to be written in text editors and, surprisingly enough, a huge number of
people do that.

> Here are some general guidelines:
> 
> * Architectural forms are most suitable for applications where
>   multiple inheritance is required, or where elements belonging to a
>   different document type are scattered throughout a document.

I agree with the former. I don't agree with the latter. A simple modules
proposal handles the latter nicely.

> * Sub-documents are most suitable for applications where all of the
>   elements belonging to a different document type are rooted in a
>   single subtree.

Subdocuments have many problems including 
 * typing convenience (separate files...yuck)
 * element type constrainability (how do I specify a SUBDOC root element
type in a content model?)
 * "content model communication" (how do I pass a %cell; content model
into my table subdoc)
 * modularity (subdocs must be declared at the top of the document, an
annoying non-local maintenance issue)
 * ID linkage (even for simple links I must use some more advanced
linking strategy)
 * semantics (i.e. SUBDOC has none...you need VALUEREF or something else
on top of subdoc)

That does not mean that they are never useful. There are some hard
problems where they are very useful. But for the *simple problem* of
embedding MATH in HTML (for example) they are overkill, as are
architectural forms. *KEEP SIMPLE THINGS SIMPLE*

> "namespace:gi" element type names are unsuitable for several reasons:
> 
> 1) The complexity of namespaces is exposed to the author rather than
>    hidden in the DTD (as it is, optionally, with architectural forms).

As my paper pointed out, we now live in a universe where the person
creating the DTD is often the author. You live in a world where people
pay you to hide things in DTDs. Most of the people on the Web don't have
a David Megginson or a Paul Prescod to do that for them. Their problems
are still real.

> 2) Multiple inheritance is not possible (X can be a kind of Y or a
>    kind of Z, but not both).

Many people do not want multiple inheritance and as my paper pointed
out, it makes some problems much more difficult to understand and solve.

> 3) Standard DTD-based validation is not possible, and it is more
>    difficult to create DTD-driven authoring tools.

I think you are totally wrong here. As a programmer, I could implement
modules in an SGML editor in MUCH less time than it would take me to
implement architectural forms.

> 4) Both architectural forms and sub-documents can be fully supported
>    under the existing spec by _both_ validating and non-validating XML
>    parsers: no changes necessary.  Furthermore, they will also remain
>    compatible with SGML tools.

That's great for today. But for tomorrow, ISO has already undertaken to
change SGML. Do you propose that they should not add anything to SGML
that is not compatible with existing tools? My position is that the very
point of a revision is to make things easier and more powerful and that
this is thus the perfect opportunity to make this common problem easier
to solve, even if it breaks some old tools.

> Why are people worried about writing specs to solve a problem that
> already has good, working, available solutions?

Because the good, working solutions are solutions to much harder
problems and make simple jobs needlessly difficult. 

 Paul "SIMPLE THINGS SIMPLE" Prescod
--
http://itrc.uwaterloo.ca/~papresco



From papresco at technologist.com  Wed Feb  4 18:17:33 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
References: <199802041632.LAA14809@geode.ora.com>
Message-ID: <34D8B09A.BBE21DA9@technologist.com>

Chris Maden wrote:
> 
> [David Megginson]
> > "namespace:gi" element type names are unsuitable for several reasons:
> 
> [...]
> 
> > Why are people worried about writing specs to solve a problem that
> > already has good, working, available solutions?
> 
> The problem (as I see it) is not one of including pieces of existing
> documents, nor of structural validation.  The main reason for
> namespaces is semantic inheritance.  

Architectural forms give you that.

> I want to write a scientific research paper quickly.  

The key word here is *quickly*. Architectural forms don't give you that.

> It should be *possible* to create a DTD to which such a document
> complies, but I am not as interested in automatic validation of a
> namespace document.  The interrelational issues are, I think, too
> complex to solve; in the example above, I would need to change the
> text-containing HTML elements' content models to include chemical and
> mathematical markup, and maybe allow HTML markup in MathML theorems.
> Pushing selected information into the content models is too ugly.

These issues are not complex at all. 

They are all handled nicely by the Japanese proposal. In a "modular
world", HTML would become a module that takes parameters such as
"object-types", "character span types", "block types" and so forth. You
pass in "MathML::Formula" as an "object-type" and the HTML %figure-type;
entity gets updated to reflect it. The issue is only complex in the
example you cite because HTML was not designed to be modular because
SGML does not have a concept of DTD modules.

Even so, this is already dirt-common in SGML applications that don't
even *have* modules. You define a parameter entity and include the
entity. 

"<!-- In order to use the CALS table model, various parameter entity
     declarations are required.  A brief description is as follows:

..."

The only extra thing we need from modules is the namespace management
that helps us to avoid name clashes and a way to sneak parameter
entities or element names into the contained namespace.

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco



From dima at paragraph.com  Wed Feb  4 18:25:10 1998
From: dima at paragraph.com (Dmitri Kondratiev)
Date: Mon Jun  7 17:00:05 2004
Subject: [AElfred] Problem: '"' in CDATA attribute
Message-ID: <2.2.32.19980203182555.0093d6e8@dream.paragraph.com>

I was wrong about the version where this situation happens - I have this in
the latest com.microstar.xml.XmlProcessor (v 1.67 from 1998/01/27) actually.

Sorry about the confusion,
Dima

At 21:02 03.02.98 +0300, Dmitri Kondratiev wrote:
>AElfred distribution from 19980112.
>Problem: 
>com.microstar.xml.XmlProcessor.error() reports an error when parsing an
>attribute declared in the DTD as CDATA and containing '"' in its value, such
>as "#text".
>
>On the other hand, com.microstar.sax.AElfredDriver from the same 19980112
>distribution handles the attribute definition correctly and doesn't report
>such an error.
>
>Dima
>
>
>---------------------------
>dima@paragraph.com
>102401.2457@compuserve.com
>http://www.geocities.com/SiliconValley/Lakes/3767/
>tel: 07-095-464-9241
>


---------------------------
dima@paragraph.com
102401.2457@compuserve.com
http://www.geocities.com/SiliconValley/Lakes/3767/
tel: 07-095-464-9241




From papresco at technologist.com  Wed Feb  4 18:26:29 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:05 2004
Subject: FORTRAN namelist input - remember?  Replace with XML!
References: <199802041559.JAA06936@mail-firewall.arlut.utexas.edu>
Message-ID: <34D8B2D2.E17063E9@technologist.com>

Glenn R. Kronschnabl wrote:
> 
> I want to use XML as a general input mechanism for scientific programs.  In
> the old days, say in FORTRAN, one used to use namelist input.  In C/C++, one
> usually wrote a custom driver.  I want to use XML because it appears to make
> sense.  I have started using SP - and want to build a tree that I can query
> (kind of like an xrdb interface) for my input parameters.  But, before I
> embark on this, I was wondering if 1) this makes sense, 2) someone surely has
> a simple tree builder/query interface to SP already that I can use so I don't
> have to write my own (none jumped out at me when I looked around).

Download the source distribution of Jade (http://www.jclark.com/jade)
and look at GroveNode.h and GroveBuilder.h .
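
For readers who only want the flavour of the idea, here is a rough
sketch of the build-a-tree-then-query approach. It is written against
Python's standard library rather than SP or Jade, and the element names
(run, group, param) are made up, not part of any existing DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical input deck; the element and attribute names are invented.
INPUT = """<run>
  <group name="grid">
    <param name="nx">128</param>
    <param name="dx">0.25</param>
  </group>
  <group name="solver">
    <param name="method">rk4</param>
  </group>
</run>
"""


def load_params(text: str) -> dict:
    """Flatten the tree into {"group.param": "value"} for xrdb-style lookup."""
    root = ET.fromstring(text)
    params = {}
    for group in root.findall("group"):
        for param in group.findall("param"):
            key = f'{group.get("name")}.{param.get("name")}'
            params[key] = (param.text or "").strip()
    return params


params = load_params(INPUT)
# Values arrive as strings; the program converts them where it needs to.
nx = int(params["grid.nx"])
dx = float(params["grid.dx"])
print(nx, dx, params["solver.method"])
```

The dotted keys play the role of namelist group and variable names; a
DTD could additionally constrain which groups and parameters may appear.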

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco



From tbray at textuality.com  Wed Feb  4 18:44:20 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:00:05 2004
Subject: recommendations on currently available streaming XML
  toolkits?
Message-ID: <3.0.32.19980204104017.00aad5dc@pop.intergate.bc.ca>

At 11:02 AM 03/02/98 -0500, Navdip Bhachech wrote:
>there have been a few discussions on streaming issues in this list 
>lately, so I thought I'd ask:
>What are the recommended toolkits (currently available) that allow 
>streaming XML, instead of a file based approach?

Lark (http://www.textuality.com/Lark/) is happy to read a stream.
But as others have pointed out, relative URLs can be a real
problem. -Tim
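
To make the relative-URL point concrete, here is a small sketch in
Python (not Lark; the document, the src attribute, and the example.org
base URL are all assumptions). A pull parser can consume the stream
chunk by chunk, but a relative reference can only be resolved if the
caller supplies a base URL separately, because the stream itself does
not carry one.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def stream_refs(chunks, base_url=None):
    """Feed chunks to a pull parser and yield each src value as it is seen."""
    parser = ET.XMLPullParser(events=("start",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            src = elem.get("src")
            if src is not None:
                # Without a base URL the relative reference stays unresolved.
                yield urljoin(base_url, src) if base_url else src
    parser.close()


# The first chunk ends in the middle of an attribute value on purpose.
chunks = ['<doc><img src="equ', 'ation1.xml"/>', '<img src="fig/a.xml"/></doc>']
print(list(stream_refs(chunks)))
print(list(stream_refs(chunks, "http://example.org/papers/p1.xml")))
```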



From mtbryan at sgml.u-net.com  Wed Feb  4 18:57:49 1998
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
Message-ID: <01bd318f$59cf8ae0$LocalHost@sgml>


>It seems to me that when you want to embed large contiguous structures
>from different document types in an XML document, each different
>namespace should be its own sub-document, referenced as a binary
>entity (or using whatever other mechanisms are available in XML-Link).


I'm interested in the concept of using links to datasets that are to be
embedded into other documents as 'subdocument modules'. When the linked data
is embedded it becomes a part of the document tree (you want queries and
locations to count the embedded elements). But it could have namespace/model
clashes so you want each linked sequence treated as a module with its own
name space. (A subdocument is not possible as there is no guarantee the
linked material will form a valid/complete document.)
>
>Good tools and protocols should make it possible to create, transmit,
>and process compound documents as if they were single files.  This
>will be necessary anyway for supporting multimedia.

I'm not convinced that all compound documents will be supportable with a
single document structure. This seems especially true of multimedia sets
created using cut & paste type operations.

>Here are some general guidelines:
>
>* Architectural forms are most suitable for applications where
>  multiple inheritance is required, or where elements belonging to a
>  different document type are scattered throughout a document.
>
>* Sub-documents are most suitable for applications where all of the
>  elements belonging to a different document type are rooted in a
>  single subtree.

Providing the subtree is complete: this is not necessarily true of spans.
>
>"namespace:gi" element type names are unsuitable for several reasons:
>
>1) The complexity of namespaces is exposed to the author rather than
>   hidden in the DTD (as it is, optionally, with architectural forms).
>
>2) Multiple inheritance is not possible (X can be a kind of Y or a
>   kind of Z, but not both).
>
>3) Standard DTD-based validation is not possible, and it is more
>   difficult to create DTD-driven authoring tools.
>
>4) Both architectural forms and sub-documents can be fully supported
>   under the existing spec by _both_ validating and non-validating XML
>   parsers: no changes necessary.  Furthermore, they will also remain
>   compatible with SGML tools.
>
>Why are people worried about writing specs to solve a problem that
>already has good, working, available solutions?

Unfortunately subdocs are not supported in XML, or in many SGML tools. If
you look at Charles Goldfarb's proposals for module naming in the light of
subdocs it is interesting that the module name becomes the name of the
entity that is used to reference the subdoc in the referencing file. This
ensures that the subdocument module name is unique in every context, because
the entity calling it must be uniquely named.

Martin Bryan




From papresco at technologist.com  Wed Feb  4 20:40:38 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, modules and architectures paper available
References: <34D82C2E.6C6B3AE7@technologist.com> <34ee5fa9.103270755@mail.alink.net> <34D87D02.14BA4B9C@technologist.com> <34f27f23.111330158@mail.alink.net>
Message-ID: <34D8D215.A651BF61@technologist.com>

Charles F. Goldfarb wrote:
> 
> As part of the revision we are also considering scoping of declarations;
> possibly by allowing an internal subset for an element type declaration.
> 
> <!ELEMENT foo (some|model|or|other) [
> <!ENTITY % module1 "some location" MODULE>%module1;
> <!-- Module1 only exists within foo elements. -->
> <!ELEMENT bar (#PCDATA)>
> <!-- As do bar elements. -->
> ]>
> 
> As with parameter passing, scoping declarations, if desirable, will be desirable
> with or without modules.

After thinking this through, I am a little disturbed by the proposal
above. To me, it implies a deep-ish change to the SGML processing model
that a module/namespace proposal does not. Consider that in a
module/namespace proposal, every element type has a single, fully
qualified name. Unqualified references are merely "short form
references"  (not to be confused with "short references") -- they are a
short form for the full thing. Going from an unqualified instance to a
fully-qualified one is a purely syntactic operation.

But I'm not sure how I would refer to elements in the scheme above.
Let's say I am writing a stylesheet. How do I differentiate between
[1]"FOO"s with "BAR" parentage and [2]elements conforming to the element
type "FOO" that can only exist in "BAR"?

[1]
<!ELEMENT BAR (FOO)>
<!ELEMENT FOO EMPTY>
...
<FOO/>
<FOO/>
<BAR><FOO/></BAR>

Here all FOOs refer to the same element type.

[2]
<!ELEMENT FOO EMPTY>
<!ELEMENT BAR (FOO)[
<!ELEMENT FOO EMPTY>
]>
...
<FOO/>
<FOO/>
<BAR><FOO/></BAR>

Here the FOO inside BAR refers to a different element type from the two
top-level FOOs.
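
A rough sketch of what a processor would have to do under example [2],
written in Python purely for illustration. The SCOPED table, the
"BAR/FOO" notation for a locally declared type, and the DOC wrapper
element are all my own assumptions; none of them comes from the
proposal itself.

```python
import xml.etree.ElementTree as ET

# Element types declared inside another type's (hypothetical) internal subset,
# mirroring example [2]: BAR carries its own local declaration of FOO.
SCOPED = {"BAR": {"FOO"}}


def resolve_types(elem, ancestors=()):
    """Return the element type each element refers to, in document order."""
    name = elem.tag
    # The nearest ancestor that locally declares this name wins.
    for owner in reversed(ancestors):
        if name in SCOPED.get(owner, ()):
            name = f"{owner}/{elem.tag}"
            break
    resolved = [name]
    for child in elem:
        resolved.extend(resolve_types(child, ancestors + (elem.tag,)))
    return resolved


# DOC is an assumed wrapper so that the instance has a single root element.
doc = ET.fromstring("<DOC><FOO/><FOO/><BAR><FOO/></BAR></DOC>")
print(resolve_types(doc))
```

The point of the sketch is that the type of a FOO cannot be read off the
tag alone: the resolver needs the ancestor chain and the declarations.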

To me, there is a subtle but important difference. A scoped namespaces
proposal makes SGML (more) context dependent at the *syntactic* level,
but a scoped declarations proposal makes it context dependent at the
*semantic* level. There exists no "context free" expansion. I don't yet
know if this will cause Bad Side Effects. But right now I can't yet
imagine many uses for this feature *other than* the kind of element type
namespace scoping that could be accomplished completely in a modules
proposal. 

If there are no other important uses for this feature then I would
rather stick with the more strictly syntactic module structure and leave
this contextual declaration stuff out. But maybe there are important
uses for this that I have not considered. 

Note that I can *totally* imagine why you would want to scope an entity
declaration or notation declaration to an element, but not to an element
type. I think that the former should be a high priority, but don't
really understand the need for the latter.

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco



From ak117 at freenet.carleton.ca  Wed Feb  4 22:52:40 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <34D8AE13.610ABC07@technologist.com>
References: <34D82C2E.6C6B3AE7@technologist.com>
	<34ee5fa9.103270755@mail.alink.net>
	<34D87D02.14BA4B9C@technologist.com>
	<199802041506.KAA00956@unready.microstar.com>
	<34D8AE13.610ABC07@technologist.com>
Message-ID: <199802042253.RAA00485@unready.microstar.com>

Paul Prescod writes:

 > *MAKE EASY THINGS EASY*
 > 
 > Making my five-line formula into a different document with a different
 > document type is *not easy*. It is a royal pain in the butt, which is
 > why almost nobody does it. I have seen the CALS table model merged with
 > dozens of DTDs and have never once seen someone take the opposite
 > approach of making CALS tables "subdocuments."

You have stated a good, general rule of thumb; in this case, however,
it is important to remember that a central component of simplicity is
consistency (by the way, I _have_ seen CALS tables as SGML
subdocuments, but one of my dreams in XML is never to hear the words
"CALS table model" again).

XML documents may (and perhaps, usually will) contain non-XML objects
such as wordprocessor documents, spreadsheets, MPEG clips, Java
applets, audio sequences, and many others -- to date, thankfully, no
one has proposed uuencoding any of these and dumping them inline between
a start and end tag.

Why should we treat an equation marked up in XML differently than an
equation marked up in Microsoft Word?  It seems easier (from a user's
perspective) to treat everything as objects, rather than defining one
special case.  Object-oriented programming has proven the value of
encapsulation, and the compound-document idiom is standard on millions
of desktops already, so we can hardly argue that subdocuments are an
unfamiliar approach.

I am a big fan of pragmatism on the implementation side, as people
might have noticed from my postings on the design of AElfred; on the
standards side, though, I wouldn't want to cripple a spec just to work
around a temporary problem that will have to be solved anyway for
non-XML objects.  SGML people will remember unfortunate features like
SHORTREF, DATATAG, and OMITTAG -- included a little over a decade ago,
likewise, for the sake of making things easy and working around
temporary deficiencies in the available tools.  XML is popular mainly
because it has finally banned all of these.

 > Subdocuments have many problems including 
 >  * typing convenience (separate files...yuck)

(See comments above).

 >  * element type constrainability (how do I specify a SUBDOC root element
 > type in a content model?)

Use HyTime (just joking).  Seriously, I cannot see that this is a
worse case than not being able to use a DTD at all.  The general idea
of compound documents (Netscape with plug-ins, OLE documents, Andrew
documents, or otherwise) is that you can plug in any object -- I had
imagined that this was the goal of namespaces as well.  In XML you can
constrain the placement of pointers to external objects, at least.

 >  * "content model communication" (how do I pass a %cell; content model
 > into my table subdoc)

You're thinking of CALS here.  I'd suggest that we move away from the
older SGML model of heavily parameterised DTDs (as from heavily
#IFDEF'ed C header files): remember that one of the arguments for the
namespace model is to reuse stylesheets and other processing
specifications -- if a table model can vary its content unpredictably,
then you will not be able to reuse stylesheets anyway.  Again,
encapsulation is a big win, and it keeps things easy.

That said, if you _really_ need to pass a %cell; content model to a
subdocument, you can always include the same file of entity
declarations in both the parent and the child.  I'd recommend against
it, but it's possible if you want to do it.

 >  * modularity (subdocs must be declared at the top of the document, an
 > annoying non-local maintenance issue)

Only if you use an entity/notation mechanism.  You could just as
easily use a URL/MIME approach:

  <object url="formula1.xml" mime="text/xml"/>

The question of how to include external objects is a separate debate,
and subdocuments can swing easily from either vine.
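
As a sketch only, here is how a processor might follow such references.
It is in Python, with an in-memory STORE dictionary standing in for the
file system or the network, and with document names and content that are
purely illustrative. Each document is parsed on its own, and the
sub-documents are collected by name.

```python
import xml.etree.ElementTree as ET

# Stand-in for the file system or the network; the names are illustrative.
STORE = {
    "paper.xml": '<paper><p>Consider</p>'
                 '<object url="formula1.xml" mime="text/xml"/></paper>',
    "formula1.xml": "<formula><var>E</var><op>=</op><var>mc2</var></formula>",
}


def load_compound(name: str) -> dict:
    """Parse a document and every XML sub-document it points to."""
    parsed = {}
    pending = [name]
    while pending:
        current = pending.pop()
        if current in parsed:
            continue  # already loaded; also guards against reference cycles
        root = ET.fromstring(STORE[current])
        parsed[current] = root
        for obj in root.iter("object"):
            # Only XML objects are followed; other MIME types stay opaque.
            if obj.get("mime") == "text/xml":
                pending.append(obj.get("url"))
    return parsed


docs = load_compound("paper.xml")
print(sorted(docs), docs["formula1.xml"].tag)
```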

 >  * ID linkage (even for simple links I must use some more advanced
 > linking strategy)

HREFs would work fine -- HTML people are already used to

  <a href="book.html#chapter3">

so we should have no confusion here.  Furthermore, you have the
advantage that your document's validity does not depend on its child
objects (this is very important for document management in large,
multi-author systems -- if subdocuments are atomic, then a change by
one author to a table, for example, will not make the containing
chapter invalid).  Again, as in programming, encapsulation will be a
big win in the medium term.
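
A minimal sketch of that kind of link resolution, in Python. The STORE
dictionary and the book.xml content are hypothetical, and treating any
attribute named "id" as an ID is an assumption, since no DTD is
consulted.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical sub-document store; in practice these would be separate files.
STORE = {
    "book.xml": '<book><chapter id="chapter3"><title>Tables</title></chapter></book>',
}


def follow(href: str):
    """Resolve 'document#id' to the element carrying that id, or None."""
    url, fragment = urldefrag(href)
    root = ET.fromstring(STORE[url])
    for elem in root.iter():
        if elem.get("id") == fragment:
            return elem
    return None


target = follow("book.xml#chapter3")
print(target.tag, target.find("title").text)
```

Note that the linking document is never parsed here, which is the
encapsulation property described above: only the target has to be
well-formed for the link to resolve.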

 >  * semantics (i.e. SUBDOC has none...you need VALUEREF or something else
 > on top of subdoc)

I expect that XLL will provide mechanisms for expressing the 'embed'
semantic.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Wed Feb  4 22:57:22 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <01bd318f$59cf8ae0$LocalHost@sgml>
References: <01bd318f$59cf8ae0$LocalHost@sgml>
Message-ID: <199802042257.RAA00504@unready.microstar.com>

Martin Bryan writes:

 > Unfortunately subdocs are not supported in XML, or in many SGML
 > tools.

Sorry for any confusion here -- I'm talking about subdocuments in
general, not about the SGML SUBDOC feature.  You can include a
subdocument using an NDATA entity, or simply by providing a URI in an
attribute value.  I'm certain that XLL will have something useful to
say here.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From elm at arbortext.com  Thu Feb  5 00:16:44 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <98Feb4.175315est.18819@thicket.arbortext.com>
References: <34D8AE13.610ABC07@technologist.com>
 <34D82C2E.6C6B3AE7@technologist.com>
 <34ee5fa9.103270755@mail.alink.net>
 <34D87D02.14BA4B9C@technologist.com>
 <199802041506.KAA00956@unready.microstar.com>
 <34D8AE13.610ABC07@technologist.com>
Message-ID: <3.0.5.32.19980204191534.009a4bc0@village.doctools.com>

This exchange is fascinating.  One comment:

At 05:53 PM 2/4/98 -0500, David Megginson wrote:
>Paul Prescod writes:
> >  * "content model communication" (how do I pass a %cell; content model
> > into my table subdoc)
>
>You're thinking of CALS here.  I'd suggest that we move away from the
>older SGML model of heavily parameterised DTDs (as from heavily
>#IFDEF'ed C header files): remember that one of the arguments for the
>namespace model is to reuse stylesheets and other processing
>specifications -- if a table model can vary its content unpredictably,
>then you will not be able to reuse stylesheets anyway.  Again,
>encapsulation is a big win, and it keeps things easy.

I don't think the problem has anything to do with CALS.  In fact, until
SGML Open came along, it was pretty hard to use the CALS table model as a
module -- it was not designed with this use in mind, and its inflexibility
resulted in dozens or hundreds of DTDs recoding the whole thing just to
change a few features.

Table models, even if they're not CALS, are going to vary their content
unpredictably, because cells typically need to contain markup *inside* them
that is specific to the information domain *outside* the table structure;
they're surrounded coming and going.  (As an aside, I don't think this
means you can't reuse stylesheets; you just sequester the table geometry
stuff from the cell formatting and recode just a little bit of
element-in-context stylesheet code.)

Table cells are a common boundary case of namespace mixing from the text
world, and perhaps there are similar situations in the data world.  I think
that a black-box approach (subdocuments) would require way more overhead
than a unified-model approach in doing "content model communication."

	Eve



From papresco at technologist.com  Thu Feb  5 00:18:04 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
References: <34D82C2E.6C6B3AE7@technologist.com>
		<34ee5fa9.103270755@mail.alink.net>
		<34D87D02.14BA4B9C@technologist.com>
		<199802041506.KAA00956@unready.microstar.com>
		<34D8AE13.610ABC07@technologist.com> <199802042253.RAA00485@unready.microstar.com>
Message-ID: <34D8FD5C.7445D1AE@technologist.com>

David Megginson wrote:
> 
> XML documents may (and perhaps, usually will) contain non-XML objects
> such as wordprocessor documents, spreadsheets, MPEG clips, Java
> applets, audio sequences, and many others -- to date, thankfully, no
> one has proposed uuencoding any of these and dumping them inline between
> a start and end tag.

Maybe not on this mailing list, but come on over to "SGML-TOOLS"
(formerly LinuxDoc). :) :)
 
> Why should we treat an equation marked up in XML differently than an
> equation marked up in Microsoft Word?  It seems easier (from a user's
> perspective) to treat everything as objects, rather than defining one
> special case.  

We should treat them differently for three reasons:

#1. XML data is text, and thus makes a certain amount of "sense" inline.
If I embedded LaTeX in an XML document I would probably inline it rather
than refer to it, for the same reason. Word formulae are binary.

#2. XML has concepts such as validation and id-reference that depend on
data being logically inline.

#3. If we do not do this, I do not think that people will use subdocs.
They will probably just abandon validation or use XML-Data.

> Object-oriented programming has proven the value of
> encapsulation, and the compound-document idiom is standard on millions
> of desktops already, so we can hardly argue that subdocuments are an
> unfamiliar approach.

Not so. Word does not use externally embedded data by default. If you
create a table, formula or a graphic, it is inlined by default.
Typically you only externally link to a file if it already exists (e.g.
it has some meaning independent of this document). I think Microsoft
made the right choice there.
 
> I am a big fan of pragmatism on the implementation side, as people
> might have noticed from my postings on the design of AElfred; on the
> standards side, though, I wouldn't want to cripple a spec just to work
> around a temporary problem that will have to be solved anyway for
> non-XML objects.  

SGML is 12 years old. We are only marginally closer to having decent
tools that will manage this stuff for us. I personally have no faith
that they will arrive soon. I also think that we have 10 years of good
experience with what we need to guide our choices. Most major DTDs
incorporate ad hoc DTD modularity features. We know what they need to
make these features robust -- just namespace protection.

> SGML people will remember unfortunate features like
> SHORTREF, DATATAG, and OMITTAG -- included a little over a decade ago,
> likewise, for the sake of making things easy and working around
> temporary deficiencies in the available tools.  

Well, I still use two of those three features, so obviously the problems
with the tools have not sufficiently cleared up yet. It also isn't clear
to me if those features have helped or hurt SGML's popularity. OMITTAG
in particular is very widely used. Even HTML uses it.

>  >  * element type constrainability (how do I specify a SUBDOC root element
>  > type in a content model?)
> 
> Use HyTime (just joking).  Seriously, I cannot see that this is a
> worse case than not being able to use a DTD at all.  

It isn't. But in XML we do have DTDs and we want to use them for these
heterogeneous (not "compound") documents.

> The general idea
> of compound documents (Netscape with plug-ins, OLE documents, Andrew
> documents, or otherwise) is that you can plug in any object -- I had
> imagined that this was the goal of namespaces as well.  

I don't think so. In my paper I quoted from the XML Namespaces spec:

"We envision applications of XML in which a document instance may
contain markup defined in multiple schemas. These schemas may have been
authored independently. One motivation for this is that writing good
schemas is hard, so it is beneficial to reuse parts from existing,
well-designed schemas. Another is the advantage of allowing search
engines or other tools to operate over a range of documents that vary in
many respects but use common names for common element types. "

The goal of combining schemas is central to the concept.

> In XML you can
> constrain the placement of pointers to external objects, at least.

Cold comfort. :)
 
>  >  * "content model communication" (how do I pass a %cell; content model
>  > into my table subdoc)
> 
> You're thinking of CALS here.  I'd suggest that we move away from the
> older SGML model of heavily parameterised DTDs (as from heavily
> #IFDEF'ed C header files): remember that one of the arguments for the
> namespace model is to reuse stylesheets and other processing
> specifications -- if a table model can vary its content unpredictably,
> then you will not be able to reuse stylesheets anyway.  

The formatting for the contents of table cells and for the shape of the
table can be specified independently. In HTML, (for example) essentially
anything can go in a table cell. The table formatter just figures it
out. A good stylesheet language will provide quite a bit of independence
between construction rules. Yes, we may need some conventions for more
complex combinations (e.g. metadata formatting conventions), but most
things will "just work."

>  >  * ID linkage (even for simple links I must use some more advanced
>  > linking strategy)
> 
> HREFs would work fine -- HTML people are already used to
> 
>   <a href="book.html#chapter3">
> 
> so we should have no confusion here.  

>  >  * semantics (i.e. SUBDOC has none...you need VALUEREF or something else
>  > on top of subdoc)
> 
> I expect that XLL will provide mechanisms for expressing the 'embed'
> semantic.

Both of these proposals just add hassles to something that should be
simple.

> Furthermore, you have the
> advantage that your document's validity does not depend on its child
> objects (this is very important for document management in large,
> multi-author systems -- if subdocuments are atomic, then a change by
> one author to a table, for example, will not make the containing
> chapter invalid).  Again, as in programming, encapsulation will be a
> big win in the medium term.

Yes, there are occasions where this encapsulation is important and
useful. There are also times where it is not.

Let me put it this way: do you feel that the creators of DocBook, TEI
and HTML were mistaken by including table models rather than forcing
their users to use subdocs? If yes, then you have a very different idea
of usable DTD design than I do. If no, then I cannot understand why you
are opposed to making this process of including table models easier so
that you do not need people with brains the size of planets and a
serious commitment to DTD use to accomplish it.

All I am asking is to make this common DTD fragment combination idiom
simpler, more standard and more robust so that casual (and expert!)
users can whip up their own DTDs by combining fragments instead of
manually merging fragments, disambiguating names, adding architectural
forms etc. etc.

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From papresco at technologist.com  Thu Feb  5 00:33:35 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
References: <34D8AE13.610ABC07@technologist.com>
	 <34D82C2E.6C6B3AE7@technologist.com>
	 <34ee5fa9.103270755@mail.alink.net>
	 <34D87D02.14BA4B9C@technologist.com>
	 <199802041506.KAA00956@unready.microstar.com>
	 <34D8AE13.610ABC07@technologist.com> <3.0.5.32.19980204191534.009a4bc0@village.doctools.com>
Message-ID: <34D90941.4777966F@technologist.com>

Eve L. Maler wrote:
> 
> Table models, even if they're not CALS, are going to vary their content
> unpredictably, because cells typically need to contain markup *inside* them
> that is specific to the information domain *outside* the table structure;
> they're surrounded coming and going.  

There are many other situations where we have the same problem, but just
don't recognize it. Think about lists, bibliographies, cross references
and so forth. We shouldn't have to reinvent these for each DTD. There
is probably a short list of interesting parameterizations on them (for
most apps) and we should just include and use them (after specifying the
relevant parameterization options). Nobody has tried this (much) in the
past because module usage in SGML is just too painful. So only CALS
tables and a few other constructs are complex enough that the pain
involved in reinventing them outweighs the pain involved in using them
from a module. But if we massively reduce the pain in reusing element
declarations, we will probably see people reusing them a lot more.

That means that we need a convenient parameterization syntax and
namespace management. Actual DTD fragment management would also be very
useful. Perhaps the Web can start to serve that role (for those that
can't afford full databases).

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco




From ak117 at freenet.carleton.ca  Thu Feb  5 02:32:56 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <34D8FD5C.7445D1AE@technologist.com>
References: <34D82C2E.6C6B3AE7@technologist.com>
	<34ee5fa9.103270755@mail.alink.net>
	<34D87D02.14BA4B9C@technologist.com>
	<199802041506.KAA00956@unready.microstar.com>
	<34D8AE13.610ABC07@technologist.com>
	<199802042253.RAA00485@unready.microstar.com>
	<34D8FD5C.7445D1AE@technologist.com>
Message-ID: <199802050233.VAA00341@unready.microstar.com>

Paul Prescod writes:

 > Not so. Word does not use externally embedded data by default. If
 > you create a table, formula or a graphic, it is inlined by default.
 > Typically you only externally link to a file if it already exists
 > (e.g.  it has some meaning independent of this document). I think
 > Microsoft made the right choice there.

Here, perhaps, there is some miscommunication between us.  As I
understand it (and I am by no means a Microsoft guru, or even a
regular user, so please read this with appropriate caution), all Word
documents are actually OLE compound objects -- in other words, they
consist of (possibly many) separate objects stored in the same
physical disk file; a simpler example of the same thing is Java's JAR
files.

For XML to work on the desktop rather than just on the server, it will
also need some kind of packaging standard -- a way for all of the
entities (XML and non-XML) that make up a document to be edited,
stored, and shipped together, but easily broken apart again when
necessary.  I'm suggesting that once such a standard exists, and once
there are tools to use it, including subdocuments in XML will be as
easy as (and hopefully, much less buggy than) including Excel
spreadsheets in Word documents.
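
As a rough sketch of the packaging idea (not an existing or proposed
standard), the following Python uses the standard-library zipfile module,
the same container format JAR files use, to store a document's entities
in one file and break them apart again. The entity names and contents
are invented for illustration.

```python
import io
import zipfile

def pack(entities: dict[str, bytes]) -> bytes:
    """Store every entity (XML or non-XML) in a single zip container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entities.items():
            zf.writestr(name, data)
    return buf.getvalue()

def unpack(blob: bytes) -> dict[str, bytes]:
    """Break the container apart again into its separate entities."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

# Hypothetical document: a chapter that points at two external entities.
ENTITIES = {
    "chapter.xml": b'<chapter><table href="table1.xml"/><img href="fig1.bin"/></chapter>',
    "table1.xml": b"<table><row><cell>42</cell></row></table>",
    "fig1.bin": bytes([0, 1, 2, 255]),
}

if __name__ == "__main__":
    blob = pack(ENTITIES)
    print(sorted(unpack(blob)))
```

Each entity stays a separate object inside the package, so a change to
table1.xml replaces one member without touching chapter.xml.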

 > Let me put it this way: do you feel that the creators of DocBook,
 > TEI and HTML were mistaken by including table models rather than
 > forcing their users to use subdocs?

Of course not.  Different DTDs will include different levels of base
markup, depending on their areas of application -- we're dealing only
with the case when people want to use structures not defined in the
DTD itself.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From papresco at technologist.com  Thu Feb  5 03:49:30 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
References: <34D82C2E.6C6B3AE7@technologist.com>
		<34ee5fa9.103270755@mail.alink.net>
		<34D87D02.14BA4B9C@technologist.com>
		<199802041506.KAA00956@unready.microstar.com>
		<34D8AE13.610ABC07@technologist.com>
		<199802042253.RAA00485@unready.microstar.com>
		<34D8FD5C.7445D1AE@technologist.com> <199802050233.VAA00341@unready.microstar.com>
Message-ID: <34D934FA.742DD860@technologist.com>

David Megginson wrote:
> 
> For XML to work on the desktop rather than just on the server, it will
> also need some kind of packaging standard -- a way for all of the
> entities (XML and non-XML) that make up a document to be edited,
> stored, and shipped together, but easily broken apart again when
> necessary.  I'm suggesting that once such a standard exists, and once
> there are tools to use it, including subdocuments in XML will be as
> easy as (and hopefully, much less buggy than) including Excel
> spreadsheets in Word documents.

It is only easy to do this with Word because Word manages it for you. I
don't intend to change to a dedicated XML editor, do you?
 
>  > Let me put it this way: do you feel that the creators of DocBook,
>  > TEI and HTML were mistaken by including table models rather than
>  > forcing their users to use subdocs?
> 
> Of course not.  Different DTDs will include different levels of base
> markup, depending on their areas of application -- we're dealing only
> with the case when people want to use structures not defined in the
> DTD itself.

No, the question is *how do we construct DTDs*? Let me try that quote
again:

"We envision applications of XML in which a document instance may
contain markup defined in multiple schemas. These schemas may have been
authored independently. One motivation for this is that writing good
schemas is hard, so it is beneficial to reuse parts from existing,
well-designed schemas. Another is the advantage of allowing search
engines or other tools to operate over a range of documents that vary in
many respects but use common names for common element types. "

Let me emphasize: "writing schemas is hard, so it is beneficial to reuse
parts from existing schemas." The goal is thus to construct DTDs from
smaller ones. (e.g. HTML + CALS + MATHML or TEILITE + JAVA + XLL or ...)
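
To make the combination concrete, here is a small Python sketch. It
assumes the xmlns attribute syntax of the later Namespaces in XML
Recommendation, which is not necessarily the syntax of the Note quoted
above, and the two namespace URIs are invented placeholders rather than
real schema identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs standing in for an HTML-like and a math schema.
NS = {"h": "http://example.org/ns/html", "m": "http://example.org/ns/math"}

DOC = """\
<h:body xmlns:h="http://example.org/ns/html" xmlns:m="http://example.org/ns/math">
  <h:p>Area of a circle:</h:p>
  <m:math><m:mi>r</m:mi></m:math>
  <h:table><h:tr><h:td><m:math><m:mi>x</m:mi></m:math></h:td></h:tr></h:table>
</h:body>
"""

def names_by_schema(xml_text):
    """Group local element names by the namespace (schema) they come from."""
    seen = {}
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{uri}local".
        uri, _, local = elem.tag[1:].partition("}")
        seen.setdefault(uri, set()).add(local)
    return seen

if __name__ == "__main__":
    for uri, names in names_by_schema(DOC).items():
        print(uri, sorted(names))
    # A tool that only understands the math schema can still find its elements,
    # including the one nested inside a table cell.
    print(len(ET.fromstring(DOC).findall(".//m:mi", NS)))
```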

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco




From ht at cogsci.ed.ac.uk  Thu Feb  5 09:34:37 1998
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:00:05 2004
Subject: FORTRAN namelist input - remember?  Replace with XML!
In-Reply-To: "Glenn R. Kronschnabl"'s message of Wed, 04 Feb 1998 09:56:58 -0600
References: <199802041559.JAA06936@mail-firewall.arlut.utexas.edu>
Message-ID: <f5blnvq8kez.fsf@cogsci.ed.ac.uk>

Our XML tool suite provides an API for this for XML directly, without using
SP.  Our NSL tool suite does the same for full SGML, using SP.

 http://www.ltg.ed.ac.uk/software/xml/ and .../nsl/

ht
-- 
Henry S. Thompson, Human Communication Research Centre, University of Edinburgh
      2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
               Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk  
                      URL: http://www.cogsci.ed.ac.uk/~ht/



From serres-doug at usa.net  Thu Feb  5 11:56:51 1998
From: serres-doug at usa.net (Doug Serres)
Date: Mon Jun  7 17:00:05 2004
Subject: recommendations on currently available streaming XML
	  toolkits?
References: <3.0.32.19980204104017.00aad5dc@pop.intergate.bc.ca>
Message-ID: <34D9A901.A472FCD3@usa.net>

Tim Bray wrote:

> At 11:02 AM 03/02/98 -0500, Navdip Bhachech wrote:
> >there have been a few discussions on streaming issues in this list
> >lately, so I thought I'd ask:
> >What are the recommended toolkits (currently available) that allow
> >streaming XML, instead of a file based approach?
>
> Lark (http://www.textuality.com/Lark/) is happy to read a stream.
> But as others have pointed out, relative URLs can be a real
> problem. -Tim
>

I'm using MSXML (http://www.microsoft.com/xml/) for streaming too.
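
For readers unsure what "reading a stream" means in practice, here is a
minimal sketch. It uses neither Lark nor MSXML but Python's
standard-library XMLPullParser, and the element names and chunk
boundaries are invented. The point is only that elements are handed to
the application as soon as their end tags arrive, before the whole
document has been received.

```python
from xml.etree.ElementTree import XMLPullParser

def stream_titles(chunks):
    """Yield the text of each <title> element as soon as its end tag arrives."""
    parser = XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)                      # hand over whatever has arrived
        for _event, elem in parser.read_events():
            if elem.tag == "title":
                yield elem.text
    parser.close()

# Simulated network input: the document arrives in arbitrary pieces.
CHUNKS = [
    b"<feed><item><ti",
    b"tle>first</title></item>",
    b"<item><title>second</title></item></feed>",
]

if __name__ == "__main__":
    for title in stream_titles(CHUNKS):
        print(title)
```

Note that nothing in the byte stream itself says where it came from, which
is why relative URLs inside a streamed document need a base supplied by
the application.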

--
Doug Serres
Junior Developer - R&D
Andyne Computing Ltd.
e-mail: dserres@andyne.com




From gmckenzi at JetForm.com  Thu Feb  5 13:47:06 1998
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:00:05 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural Forms, and Sub-Documents
Message-ID: <c=CA%a=_%p=JetForm%l=ROSSINI-980205134207Z-12699@rossini.jetform.com>


David Megginson wrote:
> [snip]
> XML documents may (and perhaps, usually will) contain non-XML objects
> such as wordprocessor documents, spreadsheets, MPEG clips, Java
> applets, audio sequences, and many others -- to date, thankfully, no
> one has proposed uuencoding any of these and dumping them inline between
> a start and end tag.
> [snip]

Am I to understand from this paragraph that there would be something
wrong with uuencoded or base64'd resources, like audio clips or even a
Java class, between a start and end tag?

I thought this would be a given.  Sure using XLL or simple url hrefs are
great, but many times the requirement is for a single file with all
resources literally included.

This is similar conceptually to the intent of MIME, and MHTML, and OLE
(at one time the E meant something -- embedding). Syntactically MIME
derived methods aren't nearly as nice as stuffing the resource between a
start and end tag.

Take a look at the Internet Open Trading Protocol
http://www.otp.org:8080/  It does this all over the place.

A packaging standard to encapsulate all of the resources in the same
file is nice, but why isn't it legitimate to place them all inline?

Gavin.

========================================================
Gavin F. McKenzie           Vox:+1(613)230-3676 ext 5277
JetForm Corporation         Fax:+1(613)594-8886
http://www.jetform.com   mailto:gmckenzi@jetform.com
========================================================




From peter at ursus.demon.co.uk  Thu Feb  5 13:50:11 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, etc.
In-Reply-To: <34D82C2E.6C6B3AE7@technologist.com>
Message-ID: <3.0.1.16.19980205102838.2e4712de@pop3.demon.co.uk>

At 03:51 04/02/98 -0500, [many people] wrote [about namespaces,
architectures, etc.]:

I don't want to stifle discussion on XML-DEV, but suggest some guidelines:

1. There is a public draft of the Namespaces paper now, I believe. [Could
someone please confirm this and give the location - I wouldn't like to
refer to a private document]. My understanding is that the W3C is actively
working on namespaces. For this reason I think it is appropriate that
proposals for other ways of developing namespaces (especially those which
require new syntax or semantics) be referred to the appropriate W3C body.
If you aren't a member, but have something to propose I would hope that
chairs will be sympathetic if you mail them.

A major problem with discussing current W3C activity on this list is that
most members/readers do not have up-to-date knowledge of the current W3C
discussions. This can make for confusion, and it would break
confidentiality for a W3C member to say "hang on, we are going down a
different line". The most reasonable thing to do is to discuss the last
public draft of a spec (especially its implementation or experience of
implementation :-) but NOT, IMO, to make suggestions for its revision.

2. I suggest that discussion is limited to *implementing* or *exploring*
the Namespace proposal. The XML spec refers (I think) to "namespace
experiments" and I think that this is the approach we should take - i.e.
discuss experiments with *this* namespace proposal. 

My own approach has been:
	- to create a private namespace experiment
	- to approach WG members to see if it broke confidentiality
	- to wait until the spec was public
	- to distribute it, and a short explanatory note, with the current JUMBO
	  release (9801a1).

So, rather than discuss my very simple namespace experiment on this list
(since it has many demerits and will almost certainly be broken by future
namespace developments) you can get it and read it with the distribution.
Its sole merits are that it is actually implemented, works and does
something useful for my applications. If others see it as a way forward I'd
be interested.  I hope to release JUMBO-PLAY shortly and this will
optionally use the namespace proposal.

	P.


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg



From peter at ursus.demon.co.uk  Thu Feb  5 13:56:20 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:05 2004
Subject: FORTRAN namelist input - remember?  Replace with XML!
In-Reply-To: <199802041559.JAA06936@mail-firewall.arlut.utexas.edu>
Message-ID: <3.0.1.16.19980205095841.2e471f34@pop3.demon.co.uk>

At 09:56 04/02/98 -0600, Glenn R. Kronschnabl wrote:
>I want to use XML as a general input mechanism for scientific programs.  In 

Great idea!  XML revolutionises program input and output. FORTRAN
programmers spend half their life with:
	Column 61 (I2) the number of optional cards describing the FOO.

This is an optional branch of a tree. With TEI processing it's marvellous.
I am trying to convert the molecular community to use XML as standard for
input and output to *existing* programs. If you can achieve it in your
community - great.

>the old days, say in FORTRAN, one used to use namelist input.  In C/C++, one 
>usually wrote a custom driver.  I want to use XML because it appears to make 
>sense.  I have started using SP - and want to build a tree that I can query 
>(kind of like an xrdb interface) for my input parameters.  But, before I 
>embark on this, I was wondering if 1) this makes sense, 2) someone surely has
>a simple tree builder/query interface to SP already that I can use so I don't
>have to write my own (none jumped out at me when I looked around).

I imagine the simplest way to do this is to write an XML2F77input
processor. This is really a stylesheet application.  If you wait for XSL I
suspect it will solve many of your problems. If you can't wait, then there
may be facilities in JUMBO that could be useful.
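
To show the shape of such a processor, here is a minimal sketch in Python
(not SP, XSL or JUMBO). The element names, the dotted parameter names and
the card layout are all assumptions made up for the example: it builds a
queryable table of input parameters from an XML file and writes one value
as an I2 field in columns 61-62 of an 80-column card.

```python
import xml.etree.ElementTree as ET

# Hypothetical input file; <params>, <group> and <param> are invented names.
INPUT = """\
<params>
  <param name="natoms">12</param>
  <param name="tolerance">1.0e-6</param>
  <group name="foo">
    <param name="ncards">3</param>
  </group>
</params>
"""

def load_params(xml_text):
    """Flatten the tree into {'natoms': '12', 'foo.ncards': '3', ...}."""
    params = {}

    def walk(node, prefix):
        for child in node:
            name = prefix + child.get("name")
            if child.tag == "group":
                walk(child, name + ".")         # optional branch of the tree
            else:
                params[name] = child.text.strip()

    walk(ET.fromstring(xml_text), "")
    return params

def card_with_i2_in_column_61(value):
    """Return an 80-column card with `value` right-justified in columns 61-62."""
    return f"{'':60}{int(value):2d}".ljust(80)

if __name__ == "__main__":
    params = load_params(INPUT)
    print(params["tolerance"])
    print(repr(card_with_i2_in_column_61(params["foo.ncards"])))
```

The existing FORTRAN program keeps reading its fixed-column cards; only the
front end that produces them changes.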

	P.




From peter at ursus.demon.co.uk  Thu Feb  5 14:07:30 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:05 2004
Subject: LISTRIVIA: (was Re: Namespaces, modules and architectures
  paper available)
In-Reply-To: <34ee5fa9.103270755@mail.alink.net>
References: <34D82C2E.6C6B3AE7@technologist.com>
 <34D82C2E.6C6B3AE7@technologist.com>
Message-ID: <3.0.1.16.19980205134243.2e470aec@pop3.demon.co.uk>

At 12:46 04/02/98 GMT, Charles F. Goldfarb wrote:
>As several postings have referred to module proposals that are being considered
>for the SGML revision, I thought it might be helpful to post one here.
>--
>Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
>           13075 Paramount Court * Saratoga CA 95070 * USA
>  International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
> Prentice-Hall Series Editor * CFG Series on Open Information Management
>--
>
>Attachment Converted: "c:\eudora\attach\module.htm"

Charles,
	We try to dissuade people from attachments to XML-DEV postings because:
	- some people cannot read them
	- they do not appear in the hypermailed version
	- there is no permanent record.
	- long attachments cost people (including me) money
	- they cannot be quoted easily

Could you please repost. If it's short, please include it; if not please
give a URL. 
	TIA

	P.




From ak117 at freenet.carleton.ca  Thu Feb  5 14:09:49 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:05 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <c=CA%a=_%p=JetForm%l=ROSSINI-980205134207Z-12699@rossini.jetform.com>
References: <c=CA%a=_%p=JetForm%l=ROSSINI-980205134207Z-12699@rossini.jetform.com>
Message-ID: <199802051409.JAA00365@unready.microstar.com>

Gavin McKenzie writes:
 > 
 > David Megginson wrote:
 > > [snip]
 > > XML documents may (and perhaps, usually will) contain non-XML objects
 > > such as wordprocessor documents, spreadsheets, MPEG clips, Java
 > > applets, audio sequences, and many others -- to date, thankfully, no
 > > one has proposed uuencoding any of these and dumping them inline between
 > > a start and end tag.
 > > [snip]
 > 
 > Am I to understand from this paragraph that there would be
 > something wrong with uuencoded or base64'd resources, like audio
 > clips or even a Java class, between a start and end tag?

You are quite right that this is legal XML or SGML -- that's one valid
use of NOTATION attributes. Here's this paragraph UUENCODED:

<object notation="uuencoded">
begin 644 para
M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
`
end
</object>

It reflects well on XML that this is possible.
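
One practical caveat, sketched here in Python rather than taken from any
spec: the uuencode alphabet includes markup characters such as "<" and
"&", so in XML the encoded text generally has to be escaped, whereas
base64 output contains only letters, digits, "+", "/" and "=". The
element and attribute names follow the example above but remain
illustrative, and the helper names are invented.

```python
import base64
import binascii
import xml.etree.ElementTree as ET

def uu_lines(data: bytes) -> list[str]:
    """uuencode `data` 45 bytes per line (body lines only, no begin/end)."""
    return [binascii.b2a_uu(data[i:i + 45]).decode("ascii").rstrip("\n")
            for i in range(0, len(data), 45)]

def embed_uu(data: bytes) -> str:
    """Serialize <object notation="uuencoded">; the serializer escapes '<' and '&'."""
    elem = ET.Element("object", notation="uuencoded")
    elem.text = "\n".join(uu_lines(data))
    return ET.tostring(elem, encoding="unicode")

def embed_base64(data: bytes) -> str:
    """Serialize <object notation="base64"> holding `data`; no escaping is needed."""
    elem = ET.Element("object", notation="base64")
    elem.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(elem, encoding="unicode")

def extract_base64(xml_text: str) -> bytes:
    """Recover the original bytes from the element produced by embed_base64."""
    return base64.b64decode(ET.fromstring(xml_text).text)

if __name__ == "__main__":
    payload = bytes(range(256))                 # stand-in for an audio clip or class file
    print(any("<" in line for line in uu_lines(payload)))
    print(extract_base64(embed_base64(payload)) == payload)
```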

 > I thought this would be a given.  Sure using XLL or simple url
 > hrefs are great, but many times the requirement is for a single
 > file with all resources literally included.

I don't see that there is any long-term advantage to that -- in the
short-term, it will work around some temporary short-comings in specs
and implementations, but it's the equivalent of writing an entire C
program in a single file to save time on linking (or even all in
main(), to avoid the overhead of subroutines).  Modularity and
encapsulation have already proven their worth in the programming
world, and they will prove their worth in XML as well.

In other words, inlining uuencoded objects is a kludge: by all means,
do it in your implementations if you plan to ship soon and need to
work with the current generation of software and Internet protocols,
but recognise that you are creating maintenance headaches for yourself
later on (as I have for myself by forcing AElfred into a single Java
class file), and **PLEASE** do not codify kludges in standards.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Thu Feb  5 14:14:31 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, etc.
In-Reply-To: <3.0.1.16.19980205102838.2e4712de@pop3.demon.co.uk>
References: <34D82C2E.6C6B3AE7@technologist.com>
	<3.0.1.16.19980205102838.2e4712de@pop3.demon.co.uk>
Message-ID: <199802051413.JAA00386@unready.microstar.com>

Peter Murray-Rust writes:

 > 2. I suggest that discussion is limited to *implementing* or *exploring*
 > the Namespace proposal. The XML spec refers (I think) to "namespace
 > experiments" and I think that this is the approach we should take - i.e.
 > discuss experiments with *this* namespace proposal. 

I think to this point we have met the first part of this guideline at
least -- the discussion has focussed very closely on implementation
issues, and as implementors we have been discussing general approaches
broadly (i.e. namespaces, architectural forms, and subdocuments)
rather than dealing with details of a specific proposal.

In fact, architectural forms and subdocuments do not require any
proposal at all -- they already exist, and can be used with the current
XML spec.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From gmckenzi at JetForm.com  Thu Feb  5 14:54:41 1998
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:00:05 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural Forms, and Sub-Documents
Message-ID: <c=CA%a=_%p=JetForm%l=ROSSINI-980205144946Z-13131@rossini.jetform.com>


David Megginson wrote:
> Gavin McKenzie wrote:
> > 
> > I thought this would be a given.  Sure using XLL or simple url
> > hrefs are great, but many times the requirement is for a single
> > file with all resources literally included.
> 
> I don't see that there is any long-term advantage to that -- in the
> short-term, it will work around some temporary short-comings in specs
> and implementations...[snip]...
> 
> In other words, inlining uuencoded objects is a kludge:
> ...[snip]...
> ...recognise that you are creating maintenance headaches for yourself
> later on (as I have for myself by forcing AElfred into a single Java
> class file), and **PLEASE** do not codify kludges in standards.

This is *NOT* a kludge.  Take archiving applications for instance.
Ideally you want a single file that literally includes all of the
resources that were part of the original document.  No external
linkages.

If you go the MHTML route, which is really just extended MIME, it does a
pretty good theoretical job of this.  The resources are all contained in
one file, and the interlinks between the resources are fixed up so that
they can refer to each other.  Any interlinks that aren't resolved
inside the file can redirect out to the net.  In fact these interlinks
can't really be resolved by the MIME processor, because it is possible
that a linkage may occur inside a script embedded in a resource that the
MIME processor knows nothing about.

If everything were in-situ XML, then it is already one file, easier to
archive, and I can come up with conventions for interlinks easily.

So....methinks this is not a kludge, rather a necessary, legitimate, and
sometimes desirable thing to do.

Gavin.



From Jon.Bosak at eng.Sun.COM  Thu Feb  5 17:47:03 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun  7 17:00:05 2004
Subject: Namespaces, etc.
In-Reply-To: <3.0.1.16.19980205102838.2e4712de@pop3.demon.co.uk> (message from Peter Murray-Rust on Thu, 05 Feb 1998 10:28:38)
Message-ID: <199802051744.JAA19703@boethius.eng.sun.com>

(I am replying separately to both lists because the message was
cross-posted.  PLEASE DO NOT CROSS-POST BETWEEN THE W3C-XML-SIG LIST
AND THE XML-DEV LIST.)

[Peter Murray-Rust:]

| There is a public draft of the Namespaces paper now, I believe. [Could
| someone please confirm this and give the location - I wouldn't like to
| refer to a private document].

The Note on name spaces has been on the W3C site for several days, but
for some reason wasn't visible from the TR page.  That's been fixed
now, and you can get the Note at

   http://www.w3.org/TR/1998/NOTE-xml-names

The document is public.

Jon




From mrc at allette.com.au  Thu Feb  5 21:24:15 1998
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural Forms, and Sub-Documents
References: <c=CA%a=_%p=JetForm%l=ROSSINI-980205134207Z-12699@rossini.jetform.com> <199802051409.JAA00365@unready.microstar.com>
Message-ID: <34DA2DDA.5C725B41@allette.com.au>

David Megginson wrote:

> You are quite right that this is legal XML or SGML -- that's one valid use of
> NOTATION attributes. Here's this paragraph UUENCODED:
>
> <object notation="uuencoded">
> begin 644 para
> M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
> M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
> J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
> `
> end
> </object>

The real problem with included fragments (as I see it) is the fact that you need
to understand the impact of the embedded fragment on structure. An SGML parser
would try to fire the elements <V4>, <FEB> and <RX> and expand the entity &AA in
the above. Even if the element were declared as CDATA, the sequence "</"
followed by any name character would end the <object> element.

> In other words, inlining uuencoded objects is a kludge...

Unless you plan to write an application to confirm that your embedded fragments
aren't detrimental to your structure, I would advise against this. Even if the
fragment wasn't detrimental to your structure, it may be to someone who wants to
reuse a chunk of your data, adding a dangerous level of uncertainty to your
documents.
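
The check Marcus describes could be sketched in Python roughly as follows. The
function name markup_hazards and the regular expressions are illustrative
assumptions, not an SGML parser: they only flag character sequences that a
parser might read as a tag open, an end tag, an entity reference, or the end
of a CDATA section.

```python
import re

# Rough approximations of what an SGML/XML parser would treat as markup.
# These patterns are illustrative assumptions, not a full SGML lexer.
HAZARDS = {
    "start tag open": re.compile(r"<[A-Za-z]"),
    "end tag open": re.compile(r"</[A-Za-z]"),
    "entity reference open": re.compile(r"&[A-Za-z#]"),
    "CDATA section close": re.compile(r"\]\]>"),
}

def markup_hazards(payload: str) -> dict:
    """Return every hazard found in payload, keyed by hazard name."""
    found = {}
    for name, pattern in HAZARDS.items():
        matches = pattern.findall(payload)
        if matches:
            found[name] = matches
    return found

# The uuencoded paragraph quoted in this thread.
UUENCODED = """begin 644 para
M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
`
end"""

report = markup_hazards(UUENCODED)
for name, matches in report.items():
    print(name, matches)
```

On the quoted sample this flags the three tag-like sequences (<V4, <FEB, <RX)
and the one entity-like sequence (&AA) mentioned above.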

--
Regards

Marcus Carr                  email:  mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia)  email:  info@allette.com.au
Level 10, 91 York Street     www:    http://www.allette.com.au
Sydney 2000 NSW Australia    phone:  +61 2 9262 4777
                             fax:    +61 2 9262 4774
_______________________________________________________________




From tbray at textuality.com  Thu Feb  5 21:38:55 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural
  Forms, and Sub-Documents
Message-ID: <3.0.32.19980205133605.00a65ae4@pop.intergate.bc.ca>

David Megginson wrote:

> <object notation="uuencoded">
> begin 644 para
> M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
> M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
> J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
> `
> end
> </object>

Don't want to be pedantic, but for this to work you need at least
<object notation='uuencoded'>
<![CDATA[
begin 644 para
M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
`
end]]>
</object>

I'm sure you can see why.  But in the general case not even
that works, because uuencode will be sure to emit the occasional
"]]>".  Neither SGML nor XML really have any facilities designed
to support in-line inclusion of foreign objects.  Yes, this
is irritating. -Tim
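
A common workaround for the "]]>" problem, not one proposed in this thread, is
to split the payload across two CDATA sections wherever "]]>" occurs. The
Python sketch below assumes a hypothetical helper name, wrap_cdata, and uses
the standard-library ElementTree parser only to confirm the round trip.

```python
import xml.etree.ElementTree as ET

def wrap_cdata(payload: str) -> str:
    """Wrap payload in CDATA, splitting any ']]>' across two sections."""
    # ']]>' becomes ']]', end of section, start of new section, then '>'.
    safe = payload.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"

# A made-up payload containing the troublesome sequences.
payload = "M<V4@ ]]> &AA= <FEB>"
document = "<object notation='uuencoded'>" + wrap_cdata(payload) + "</object>"

# The parser joins the adjacent sections back into one run of text.
recovered = ET.fromstring(document).text
print(recovered == payload)
```

This keeps the document well-formed, but the receiving application still sees
one undifferentiated string, so it does not turn CDATA into a designed
facility for foreign objects.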




From gmckenzi at JetForm.com  Thu Feb  5 21:58:14 1998
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces, ArchitecturalForms, and Sub-Documents
Message-ID: <c=CA%a=_%p=JetForm%l=ROSSINI-980205215324Z-15784@rossini.jetform.com>


Isn't Base64 the fix to any fears associated with uuencoded resources
emitting an occasional ]]>?

Gavin.

>-----Original Message-----
>From:	Tim Bray [SMTP:tbray@textuality.com]
>Sent:	Thursday, February 05, 1998 4:39 PM
>To:	xml-dev@ic.ac.uk
>Subject:	Re: Foreign object inclusion WAS: Namespaces, ArchitecturalForms,
>and Sub-Documents
>
>David Megginson wrote:
>
>> <object notation="uuencoded">
>> begin 644 para
>> M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
>> M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
>> J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
>> `
>> end
>> </object>
>
>Don't want to be pedantic, but for this to work you need at least
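><object notation='uuencoded'>
><![CDATA[
>begin 644 para
>M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
>M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
>J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
>`
>end]]>
></object>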
>
>I'm sure you can see why.  But in the general case not even
>that works, because uuencode will be sure to emit the occasional
>"]]>".  Neither SGML nor XML really have any facilities designed
>to support in-line inclusion of foreign objects.  Yes, this
>is irritating. -Tim
>
>



From rdaniel at lanl.gov  Thu Feb  5 22:18:52 1998
From: rdaniel at lanl.gov (Ron Daniel Jr.)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces,
  ArchitecturalForms, and Sub-Documents
Message-ID: <3.0.32.19980205151412.009eb770@cic-mail.lanl.gov>

At 04:53 PM 2/5/98 -0500, Gavin McKenzie wrote:
>
>Isn't Base64 the fix to any fears associated with uuencoded resources
>emitting an occasional ]]>?

I think so: the allowed characters in Base-64 are A-Za-z0-9+/=.

There is a caveat: some base-64 encoders may assume they are
only used in MIME contexts, and thus "help" the programmer by
converting any text into MIME's "canonical form" (e.g. line ends
are converted to CRLF). So, be careful about that whitespace!
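
Both points can be checked with a short Python sketch. It assumes the <object>
element and notation attribute used earlier in the thread, and relies on the
standard-library base64 module, whose b64decode() discards characters outside
the alphabet by default.

```python
import base64
import xml.etree.ElementTree as ET

raw = bytes(range(256))  # arbitrary binary data, including '<', '&', ']' bytes

# encodebytes() inserts a newline after every 76 output characters.
encoded = base64.encodebytes(raw).decode("ascii")

# The Base64 alphabet (plus line breaks) contains no markup characters.
markup_free = not (set(encoded) & set("<>&]'\""))

document = "<object notation='base64'>\n" + encoded + "</object>"
text = ET.fromstring(document).text

# Line breaks, whether LF or CRLF, are skipped by the decoder, so an
# encoder or gateway that rewrites line ends does not corrupt the data.
decoded = base64.b64decode(text)
crlf_decoded = base64.b64decode(text.replace("\n", "\r\n"))
print(markup_free, decoded == raw, crlf_decoded == raw)
```

A stricter decoder that rejects whitespace would need the text stripped of
line breaks first, which is the case the caveat above warns about.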

Ron Daniel Jr.              voice:+1 505 665 0597
Advanced Computing Lab        fax:+1 505 665 4939
MS B287                     email:rdaniel@lanl.gov
Los Alamos National Lab      http://www.acl.lanl.gov/~rdaniel
Los Alamos, NM, USA, 87545  



From peter at ursus.demon.co.uk  Fri Feb  6 09:23:04 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:06 2004
Subject: XML and the launch of Chemical Markup Language
In-Reply-To: <3.0.1.16.19980128150718.36a792b8@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980206092126.1eafbb32@pop3.demon.co.uk>

At 15:07 28/01/98, Peter Murray-Rust wrote:
>I have been invited to give a virtual lecture by VEI Ltd and Chemweb Ltd
>and I have taken the opportunity to "launch" Chemical Markup Language and
>also to promote the use of XML. Details are at:
>
>http://chemweb.vei.co.uk

The transcript of this lecture is - or will be - publicly available at this
address. Anyone registered is welcome to contribute to the discussion. I'd
welcome any corrections [I have deliberately simplified XML in places].
[There were two server-side breaks in transmission but I hope that anyone
who 'attended' was able to get all the material.]

The 26 slides are also available at:
http://www.vsms.nottingham.ac.uk/vsms/talks/chemwebvei/001.html

which is the TOC. [The slides are deliberately not interlinked because of
the technology.] If you'd like to use material from these please let me know.

In passing I prepared the slides using conventional HTML editing tools
(Netscape). I kept thinking how it would have been preferable to use XML
for this and I think I was close to the break-even point for tooling up and
doing it in Java/XML. This would have solved renumbering problems, allowed
redesigned layouts to be transmitted to every slide, etc. I would have
still output the actual slides in HTML. I am a believer in using HTML for
presentations (since I feel it's more flexible/portable/re-usable than
other approaches). If other people feel the same way, perhaps we could
create a collaborative approach to XML/HTML-slide generation?

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From peter at ursus.demon.co.uk  Fri Feb  6 10:18:17 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural
  Forms, and Sub-Documents
In-Reply-To: <34DA2DDA.5C725B41@allette.com.au>
References: <c=CA%a=_%p=JetForm%l=ROSSINI-980205134207Z-12699@rossini.jetform.com>
 <199802051409.JAA00365@unready.microstar.com>
Message-ID: <3.0.1.16.19980206084527.11573c56@pop3.demon.co.uk>

I am still unclear how to tackle this (very real) problem. I have sympathy
for people who wish to bundle everything into one document because I am not
yet happy that we have a completely robust system for bundling together all
components of a hyperdocument. [For example, how often do you "save HTML"
and find the GIFs are not included?].

When I first started trying to learn SGML I developed a system (costwish)
which UUENCODED GIFs and other binaries into a single document. Since I have no
experience of SGML in practice I don't know whether that is the normal
thing to do.

When I came across something like the following:

At 08:23 06/02/98 +1100, Marcus Carr wrote:
>David Megginson wrote:
>
>> You are quite right that this is legal XML or SGML -- that's one valid
>> use of NOTATION attributes. Here's this paragraph UUENCODED:
>>
>> <object notation="uuencoded">
>> begin 644 para
>> M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
>> M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
>> J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
>> `
>> end
>> </object>

I converted all the & to &amp; and the < to &lt; 

I'm not clear why this isn't a useful method since the processor is
required to convert them on reading.

I am unsure what to do with "save XML" on JUMBO. In the
SAXDemo routine characters(), DavidM converts non-printing chars to escaped
variants (e.g. asc(10) -> &#10;), but does *not* convert & to &amp;. This
means that any XML file that contains & will produce output that is not
well-formed XML.

What is the appropriate strategy? Should a "save XML" application convert 
all five chars (&, <, >, ', ") to their escaped equivalents? Or none? Or
just the first two? [In my own community I don't think using <![CDATA[ is a
good idea because people won't have any idea what is going on and they will
get it wrong.  In any case - as pointed out - it doesn't overcome the
random occurrence of ']]>' ].
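
One possible convention, offered as an assumption rather than a ruling, is to
escape &, < and > in element content and to escape quotes only inside
attribute values. The Python sketch below uses escape() and quoteattr() from
the standard-library xml.sax.saxutils module; the helper name save_text and
the sample strings are made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

def save_text(text: str) -> str:
    """Escape character data for element content (&, < and >)."""
    return escape(text)

content = "AT&T <fig> ]]> x"
attr = 'say "hi" & it\'s'

# quoteattr() picks the quote character and escapes whatever clashes with it.
document = "<p note=" + quoteattr(attr) + ">" + save_text(content) + "</p>"

# Reading the saved document back recovers the original strings.
parsed = ET.fromstring(document)
print(document)
print(parsed.text == content, parsed.get("note") == attr)
```

Escaping > in content is not strictly required everywhere, but it removes the
"]]>" sequence without needing CDATA sections at all.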

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From peter at ursus.demon.co.uk  Fri Feb  6 10:38:51 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:06 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <199802041733.MAA02120@unready.microstar.com>
References: <199802041632.LAA14809@geode.ora.com>
 <199802041506.KAA00956@unready.microstar.com>
 <199802041632.LAA14809@geode.ora.com>
Message-ID: <3.0.1.16.19980206082831.1157496c@pop3.demon.co.uk>

At 12:33 04/02/98 -0500, David Megginson wrote:
>
>Not at all -- you just need a single element type to hold references
>to other XML documents.  You could even (though this is disgusting)
>use
>
>  <img src="equation1.xml">
>

I hope that the "disgusting" refers to the use of 'img' and 'src' and the
implied semantics rather than the mechanism :-).  I am an advocate of the
*mechanism* (e.g.
http://www.vsms.nottingham.ac.uk/vsms/talks/chemwebvei/020.html) where I
use XML-LINK explicitly to combine chemistry, maths and text. This has the
advantage that it avoids namespace problems. It also allows me to process
foreign files if certain assumptions are made.

	P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From donpark at quake.net  Fri Feb  6 11:33:41 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural Forms, and Sub-Documents
Message-ID: <002501bd32f2$507a2ae0$2ee044c6@donpark>

As far as I can see there are two problems:

1. Embedding Binary Data inside XML document

This problem is solved with BASE64.  I wish we could specify it in the DTD, but
it's workable now.

2. Binding XML and its related files into a single package

MHTML works pretty well for 'document'-oriented problems, and there is no
reason why we cannot adopt it.  Let's call it MXML and just go with it.  For
non-document-oriented problems, MHTML does not work too well because data is
laid out sequentially rather than multiplexed to reduce latency.  I guess it
will be a while before WebTV uses XML in a major way...
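
As a rough illustration of the second point, the Python sketch below bundles
an XML document and a binary resource into one MIME multipart/related
message, which is the structure MHTML uses. "MXML" is only the name floated
above, and the element names, the cid: link and the placeholder bytes are
assumptions made for the example.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical root document that refers to its figure by Content-ID.
xml_doc = '<doc><figure href="cid:fig1@example.invalid"/></doc>'
image_bytes = b"GIF89a\x01\x00\x01\x00"  # placeholder bytes, not a real image

package = MIMEMultipart("related", type="text/xml")

root = MIMEText(xml_doc, "xml", "utf-8")  # becomes a text/xml part
package.attach(root)

figure = MIMEApplication(image_bytes, "octet-stream")  # Base64 by default
figure.add_header("Content-ID", "<fig1@example.invalid>")
package.attach(figure)

serialized = package.as_string()
print(package.get_content_type(), len(package.get_payload()))
```

The parts are written one after another, which is the sequential layout
noted above as a poor fit for latency-sensitive delivery.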

Don Park
http://www.quake.net/~donpark/index.html




From peter at ursus.demon.co.uk  Fri Feb  6 11:52:51 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:06 2004
Subject: LISTRIVIA: an apology
In-Reply-To: <199802051744.JAA19700@boethius.eng.sun.com>
References: <3.0.1.16.19980205102838.2e4712de@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980206114841.2897d378@pop3.demon.co.uk>

At 09:44 05/02/98 -0800, Jon Bosak wrote:

>(I am replying separately to both lists because the message was
>cross-posted.  PLEASE DO NOT CROSS-POST BETWEEN THE W3C-XML-SIG LIST
>AND THE XML-DEV LIST.)

This was my fault - through sloppy replying to a multiple posting. Since I
have stressed the importance of list behaviour I feel ashamed at having
slipped from my own guidelines.

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From ricko at allette.com.au  Fri Feb  6 12:58:49 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural Forms, and Sub-Documents
Message-ID: <199802061308.AAA20028@jawa.chilli.net.au>


> From: Don Park <donpark@quake.net>

> As far as I can see there are two problems:
> 
> 1. Embedding Binary Data inside XML document
> 
> This problem is solved with BASE64.  I wish we could specify it in the DTD, but
> it's workable now.

You can. For example

<!NOTATION base64 SYSTEM "http://www.somewhere.com/base64-decoder.applet">
<!ELEMENT BINARY ( #PCDATA )>
<!ATTLIST BINARY
	encoding NOTATION ( base64 ) "base64" >
...
<BINARY>...</BINARY>

An element can have at most one NOTATION attribute, which specifies how to interpret
the element's data. Often this is used to restrict possible notations to
lists of types, for example

<!ATTLIST figure
	type NOTATION ( gif | epsi | cgm | jpeg ) #REQUIRED >

Developers of generic XML tools should make sure that their systems
provide ways to interpret NOTATION attributes appropriately: it is
a mechanism like MIME media types, but may be finer-grained. It is
not a mechanism for multi-part documents (unless the DTD is a DTD
for representing multipart documents of course) because the notation
processor (which the SYSTEM identifier on the NOTATION declaration
would identify) runs after the XML processor.
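
A generic tool might act on such an attribute along the lines of the Python
sketch below. The DECODERS table, the decode_binary name and the "hex"
notation are hypothetical. The sketch reads the attribute from the instance
and repeats the DTD default in code, since it does not assume that the parser
supplies defaulted attributes.

```python
import base64
import binascii
import xml.etree.ElementTree as ET

# Hypothetical table mapping notation names to decoder functions.
DECODERS = {
    "base64": base64.b64decode,
    "hex": binascii.unhexlify,
}

def decode_binary(element: ET.Element) -> bytes:
    """Decode an element's text according to its 'encoding' attribute."""
    notation = element.get("encoding", "base64")  # mirrors the DTD default
    try:
        decoder = DECODERS[notation]
    except KeyError:
        raise ValueError("no decoder registered for notation " + notation)
    return decoder(element.text.strip())

# The notation processor runs after the XML processor has parsed the element.
element = ET.fromstring('<BINARY encoding="base64">SGVsbG8sIFhNTA==</BINARY>')
data = decode_binary(element)
print(data)
```

The XML processor only delivers the element and its attribute; interpreting
the content is left to the dispatch step, which matches the ordering
described above.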

 
Rick Jelliffe 



From peter at ursus.demon.co.uk  Fri Feb  6 13:40:19 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:06 2004
Subject: Foreign object inclusion WAS: Namespaces, Architectural   
  Forms, and Sub-Documents
Message-ID: <3.0.1.16.19980206132732.312792ae@pop3.demon.co.uk>

Posted on behalf of Ross Moore

>Return-Path: <ross@mpce.mq.edu.au>
>X-Sender: ross@zeus.mpce.mq.edu.au
>Date: Fri, 6 Feb 1998 21:43:43 +1100
>To: peter@ursus.demon.co.uk
>From: Ross Moore <ross@mpce.mq.edu.au>
>Subject: Re: Foreign object inclusion WAS: Namespaces, Architectural   
> Forms, and Sub-Documents
>
[... request for posting ...]
>
>
>After receiving Tim's last posting I engaged in an email conversation
>with myself, attached here...
>
>At 10:57 AM +1100 2/6/98, Ross Moore wrote:
>>Hello Tim
>>
>>Is there any reason why a mailer produced 8 copies of the message +
>>attachment from you (appended below) ?
>>It was  Eudora Pro for Macintosh PPC (which automatically decoded OK).
>>The Unix mail-server only received 1 copy.
>>
>>Could it be that the contents has triggered a side-effect,
>>detrimental to an external structure ...
>>
>>Marcus Carr wrote:
>>>> In other words, inlining uuencoded objects is a kludge...
>>>
>>>Unless you plan to write an application to confirm that your embedded
>>>fragments aren't detrimental to your structure, I would advise against
>>>this. Even if the fragment wasn't detrimental to your structure, it may
>>>be to someone who wants to reuse a chunk of your data, adding a
>>>dangerous level of uncertainty to your documents.
>>
>>
>>If this reply causes a similar repetition, then we'll know that such
>>problems indeed can exist.   ;-)
>
>Yes indeed there is such a problem, because the quoted uuencoded portion
>is being regarded as an attachment needing decoding.
>It no longer has the correct checksum, due to the quoting with `> '.
>
>The automatic POP retrieval from the Unix server was failing.
>Every 20 mins (or so) it tried again and failed again.
>The 8 copies simply count how many times it tried before I could
>address the problem manually.
>
>
>> `b`e`g`i`n 644 para
>> M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
>> M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
>> J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
>> `
>> end
>
>(Here I've doctored the  `begin' into `b`e`g`i`n  to prevent this
>happening again.)
>
>
>[added later]
>Doubled `>'s, as in Peter's last mail, do not cause this effect:
>
>>> <object notation="uuencoded">
>>> begin 644 para
>>> M66]U(&%R92!Q=6ET92!R:6=H="!T:&%T('1H:7,@:7,@;&5G86P@6$U,(&]R
>>> M(%-'34P@+2T@=&AA="=S(&]N92!V86QI9`IU<V4@;V8@3D]4051)3TX@871T
>>> J<FEB=71E<RX@2&5R92=S('1H:7,@<&%R86=R87!H(%5514Y#3T1%1#H*
>>> `
>>> end
>>> </object>
>
>
>
>Is there a lesson here ?
>
>
>Regards,
>
>	Ross Moore
>
>
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>Ross Moore                             email: ross@mpce.mq.edu.au
>Mathematics Department                 phone:      +612 9850 8955
>Macquarie University                     fax:      +612 9850 8114
>Sydney, NSW 2109                    Internet:
>Australia                   http://www-math.mpce.mq.edu.au/~ross/
>
>                ***************************
>
>for the best in (La)TeX-nical typesetting and Web page production
>join the  TeX Users Group (TUG) --- browse at  http://www.tug.org
>
>                 <ross.moore@mail.tug.org>
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
>
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From PaquinM at novasys.qc.ca  Fri Feb  6 14:34:38 1998
From: PaquinM at novasys.qc.ca (Paquin, Martin)
Date: Mon Jun  7 17:00:06 2004
Subject: XML and the launch of Chemical Markup Language
Message-ID: <7183BFDBEB50D111863C080009B453804C73@nemesis.novasys.qc.ca>

	-----Original Message-----
	From:	Peter Murray-Rust [SMTP:peter@ursus.demon.co.uk]
	Sent:	Friday, February 06, 1998 4:21 AM
	To:	xml-dev@ic.ac.uk
	Subject:	Re: XML and the launch of Chemical Markup Language

>I am a believer in using HTML for
>presentations (since I feel it's more flexible/portable/re-usable than
>other approaches). If other people feel the same way, perhaps we could
>create a collaborative approach to XML/HTML-slide generation?

Microsoft has announced its intention to provide an XML export format for
all its Office applications, including, I suppose, PowerPoint. That is
certainly a place to look.

The major piece missing for converting a graphic presentation to XML is
the ability to create graphics primitives in HTML. Other than that, with
dynamic HTML it is possible to produce presentations as good as those made
with a conventional presentation package.

_____________________________________________________________

Martin Paquin               Novasys, Inc.
Consultant                  bureau 2624
                            Tour de la Bourse,
                            800, Place Victoria
Tel.:(514)875-7720          Case postale 151
Fax.:(514)874-9830          Montréal (Québec)
paquinm@novasys.qc.ca       CANADA H4Z 1C3
http://www.novasys.qc.ca




From ak117 at freenet.carleton.ca  Fri Feb  6 14:57:05 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:06 2004
Subject: "Save as XML"
In-Reply-To: <3.0.1.16.19980206084527.11573c56@pop3.demon.co.uk>
References: <c=CA%a=_%p=JetForm%l=ROSSINI-980205134207Z-12699@rossini.jetform.com>
	<199802051409.JAA00365@unready.microstar.com>
	<34DA2DDA.5C725B41@allette.com.au>
	<3.0.1.16.19980206084527.11573c56@pop3.demon.co.uk>
Message-ID: <199802061453.JAA00386@unready.microstar.com>

Peter Murray-Rust writes:

 > I have a problem to know what to do with "save XML" on JUMBO. In
 > the SAXDemo routine characters(), DavidM converts non printing
 > chars to escaped variants (e.g. asc(10) -> &#10;), but does *not*
 > convert & to &amp;. This means that any XML file that contains &
 > will produce invalid XML output.

Sorry for any confusion there -- I had originally used '\n' and '\r',
then decided to use character references to be more XML-like.  I
realise, though, that that gives the unintended appearance of an
attempt to produce XML-parseable character data.  Perhaps I should go
back to C-like escapes.

 > What is the appropriate strategy? Should a "save XML" application
 > convert all five chars (&, <, >, ', ") to their escaped
 > equivalents? Or none? Or just the first two? [In my own community I
 > don't think using <![CDATA[ is a good idea because people won't
 > have any idea what is going on and they will get it wrong.  In any
 > case - as pointed out - it doesn't overcome the random occurrence
 > of ']]>' ].

This taps into an earlier discussion about what is and is not
significant information in an XML document.  For example, if the
general entity &name; is set to "David Megginson", then the following
two fragments are exactly equivalent for many XML applications:

FRAGMENT 1:

  <x


  y="z">My name is<!--

  here's a comment


  --> &name;.</x>


FRAGMENT 2:

  <x y="z">My name is David Megginson.</x>


Some authoring and repository tools, however, will want to preserve
the general entity reference, the comment, and the whitespace (even
inside the start tag).  In SGML, you can use grove plans to specify
what information is and is not significant to an application -- but
there is still a lack of detailed standards for the information set
(or sets) returned by an XML parser.
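
The equivalence can be checked with Python's standard-library ElementTree
parser, which by default expands internal entities and drops comments. The
sketch below assumes the entity is declared in an internal DTD subset, writes
the comment with the usual <!-- opener, and uses a made-up summary() helper
that keeps only the element name, attributes and text.

```python
import xml.etree.ElementTree as ET

# Assumed internal subset declaring the general entity used in fragment 1.
DOCTYPE = '<!DOCTYPE x [<!ENTITY name "David Megginson">]>\n'

fragment1 = DOCTYPE + """<x


  y="z">My name is<!--

  here's a comment


  --> &name;.</x>"""

fragment2 = DOCTYPE + '<x y="z">My name is David Megginson.</x>'

def summary(source: str):
    """Reduce a document to the information this sketch treats as significant."""
    root = ET.fromstring(source)
    return (root.tag, dict(root.attrib), root.text)

same = summary(fragment1) == summary(fragment2)
print(same)
```

An authoring or repository tool would need a richer view than summary()
provides, since the entity reference, the comment and the whitespace are all
lost here.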


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From trevort at za.ibm.com  Fri Feb  6 15:15:52 1998
From: trevort at za.ibm.com (Trevor Turton)
Date: Mon Jun  7 17:00:06 2004
Subject: Meta data for XML editors
Message-ID: <5060200010923033000002L032*@MHS>

Development of XML editors is underway.  Since XML is flexible/extensible,
XML editors will have to be so too.  The XML user will need to specify a
list of the DTD schemas that will be used in composing a particular
document.  The editor will need to fetch and parse these schemas so it can
validate the user's input.

More usefully, the editor could present the DTDs to the user as a series of
stacked palettes, each containing a list of the elements defined within
each DTD.  When the user selects an element from a palette for inclusion in
the target document, the editor could present the tags and attributes
associated with the element, and hence guide the user in constructing a
syntactically correct document that conforms to the DTDs.

Let's up the ante a little.  A palette of elements would be more useful if:

* Each element were represented by an icon that suggests its function
* Each element popped up a one-liner outlining its function whenever the
  mouse hovered over it for a while
* Each element were backed by complete help documentation (in XML, of course)

To do this stuff well, the editor would need access to more than just the
plain unvarnished DTD.  It would need extra meta data to be associated
with the DTD, but only used at document composition time.  Browsers and
other rendering programs would not need to access this extra information
when they render the final document, and indeed it would slow them down
unnecessarily to do so.  It may make sense to exploit XML's (proposed)
powerful hyperlink facilities to associate compose-time meta data with
DTDs.  All of the design-time meta data required to help the user understand
and exploit the DTD could be made available in this way.
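
To make the idea concrete, the Python sketch below shows one shape such
compose-time meta data could take inside an editor. Every name in it
(ElementHint, HINTS, palette, the file paths and the element names) is
hypothetical, and it is not a proposed standard format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ElementHint:
    """Compose-time meta data for one element type (all fields hypothetical)."""
    element: str    # element type name declared in the DTD
    icon_href: str  # link to an icon suggesting the element's function
    tooltip: str    # one-line description shown on mouse hover
    help_href: str  # link to the full help document

HINTS = {
    hint.element: hint
    for hint in [
        ElementHint("figure", "icons/figure.gif",
                    "Insert an illustration", "help/figure.xml"),
        ElementHint("para", "icons/para.gif",
                    "Insert a paragraph", "help/para.xml"),
    ]
}

def palette(dtd_elements):
    """Return (element, tooltip) rows for a palette, tolerating missing hints."""
    return [
        (name, HINTS[name].tooltip if name in HINTS else "")
        for name in dtd_elements
    ]

rows = palette(["para", "figure", "table"])
for row in rows:
    print(row)
```

Keeping the hints in a separate table means a rendering program that only
needs the DTD never has to load them.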

If this meta data is made available through hyperlinks, then it may be a good
idea to establish a convention now, while it's still early enough, for how
such compose-time meta data will be classified.  Builders of browsers and
other rendering engines could then be encouraged to omit these designated
hyperlinks from the popup menus they present when the user clicks on the
associated hot-spot, or at least to make this omission the default action,
overridable in the browser's option settings.

It seems likely to me that a number of different software developers will
build XML editors that make use of associated compose-time meta data
such as I have described above, and that each will choose to format this
information in a different way.  DTD builders will then be faced with
the dilemma of which meta data format they should use, and the value
of all DTDs will be diminished by the fact that different XML editors will
work best with different formats of meta data.

Can we try to pre-empt this problem before it hits us by debating and
proposing a standard format for compose-time meta data?

Trevor Turton

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From tyler at infinet.com  Fri Feb  6 15:21:41 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:06 2004
Subject: XML Java IO Writer...
Message-ID: <34DB2B73.50C6DD90@infinet.com>

A while back I brought up the idea of having an XML InputStream which
inherits FilterInputStream and did not get much response.  Anyway, it
would have been better for this to be an XMLReader since the Reader
classes handle all of the nitty-gritty character conversion for you in
the first place.

Well, due to the demands of the application I am writing now, which needs
to format as well as parse XML data from a variety of streams, I am
proposing that aside from just SAX, we have an IO package in the
org.xml domain whose classes could possibly have parsers assigned to
them.  I spent about half a day and wrote an XMLWriter class which is an
extension of FilterWriter for preparing XML documents from Java
without having to do a lot of OutputStream.write() calls manually from
line to line.  Things are packaged right now under org.xml.io, but that
is only tentative as I do not have any real permission from the guy
who owns the rights to xml.org (I can't remember who you are so maybe
this will get your attention).  Right now I am using this in my own
application and it works beautifully.  This is what I would call a 0.1
version since it does not handle all sorts of things like notations and
lots of other stuff.

You can get the zip file, with source code included, at
http://www.infinet.com/~tyler/xml/xmlio01.zip

Here is a brief description of the classes included:

package org.xml.io;

import java.io.Writer;
import java.io.FilterWriter;
import java.io.IOException;
import java.util.Hashtable;

public class XMLWriter extends FilterWriter {
  // Method bodies are omitted in this summary.
  public XMLWriter(Writer out, String padding) {}

  public void writeDocument(Element rootElement, String ID)
      throws IOException {}
  public void writeDocument(Element rootElement, String ID,
      Entity[] entities) throws IOException {}
  public void writeDocument(Element rootElement, String ID,
      Entity[] entities, boolean system) throws IOException {}
  private String replaceText(String content) {}
}

This class takes as arguments another Writer and a String which
is essentially used for padding the nested levels of your document.  For
example, you could use two spaces as padding or else just a tab.

You would create this class by making a call like this:

      XMLWriter writer = new XMLWriter(new OutputStreamWriter(out), "  ");

where out is of type OutputStream.  To write a document you call
writeDocument(), which takes three forms.  writeDocument(Element
rootElement, String ID) is the same as writeDocument(rootElement, ID,
null, true), and writeDocument(Element rootElement, String ID, Entity[]
entities) is the same as writeDocument(rootElement, ID, entities, true).

writeDocument(Element rootElement, String ID, Entity[] entities, boolean
system) is what is actually called.

The rootElement argument is the root element you write, ID is the system
or public ID of the DTD, entities is an array of type Entity[] whose
values are replaced in the document, and system is a flag indicating
whether ID should be treated as a system ID or a public ID.  So in my
code I call this (the class calling this is of type Element):

      writer.writeDocument(this, "forumReference.dtd");


package org.xml.io;

public interface Element {
  String getName();

  // may return null
  String getContent();

  // may return null
  Attribute[] getAttributes();

  // may return null
  Element[] getChildren();

  // may return null
  String getComments();
}

This interface defines an element type.  Usually you implement this for
each class which has data that can be mapped to an XML document.  If on
the other hand you have a class which should not be subclassed or is even
final, then use inner classes to solve your problem.  For example, for
java.net.InetAddress I use this in my code to implement the Element[]
getChildren() method.  I use AbstractElement (included in org.xml.io) so
I only have to override the methods I need; the rest can return null
anyway.

  public Element[] getChildren() {
    Element[] children = new Element[4];

    children[0] = new AbstractElement() {
      public String getName() {
        return "id";
      }
      public String getContent() {
        return ID;
      }
    };

    children[1] = new AbstractElement() {
      public String getName() {
        return "host";
      }
      public String getContent() {
        return host.getHostAddress();
      }
    };

    children[2] = new AbstractElement() {
      public String getName() {
        return "port";
      }
      public String getContent() {
        return String.valueOf(port);
      }
    };

    children[3] = new AbstractElement() {
      public String getName() {
        return "ior";
      }
      public String getContent() {
        return IOR;
      }
    };

    return children;
  }


package org.xml.io;

public interface Entity {
  String getName();
  String getValue();
}

The Entity interface is basically for replacement text, of course.  The
writer will replace any occurrences of getValue() in the rest of the
stream with getName(), with a '&' prepended and a ';' appended.  When you
call writeDocument(), every entity passed will be checked against the
document.  This is an expensive operation so use it wisely.  If null is
passed as the Entity[] argument to writeDocument(), then no checks will
occur.
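The replacement step described above could look roughly like the following. This is only a sketch of the idea, not the code in the zip file: the class name EntityReplacer is made up, and a Map from entity name to value stands in for the Entity[] array. It also shows why the operation is expensive: every entity triggers a full scan of the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EntityReplacer {
    // Replaces each occurrence of an entity's value with the reference "&name;".
    // A null map means "no checks", mirroring the null Entity[] case above.
    static String replaceText(String content, Map<String, String> entities) {
        if (content == null || entities == null) {
            return content;
        }
        String result = content;
        for (Map.Entry<String, String> entity : entities.entrySet()) {
            String value = entity.getValue();
            if (value == null || value.isEmpty()) {
                continue; // nothing sensible to search for
            }
            // One full pass over the text per entity.
            result = result.replace(value, "&" + entity.getKey() + ";");
        }
        return result;
    }

    // Small usage example.
    static String demo() {
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("co", "Acme Corporation");
        return replaceText("Welcome to Acme Corporation.", entities);
    }
}
```

Entities are applied in map order, so if one value is a substring of another the longer one should be listed first.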


The XMLWriter class will recursively descend from the root element to
all sub-elements, get their content, and write it out.  This package
is something I spent half a day on to basically get the job done for
what I needed to do, and I would more than happily GPL it all later on
under the org.xml.io package if given permission and there is interest
in doing more (fixing bugs and adding complete XML functionality).
Right now it all works for what I am doing and chopped about 300 lines
of code (what I guess people call report writing) out of my
application.  The less code in my app, the better.  Which also makes me
ask: is it better to have a parser which may be large in code size but
is easy to use, so my production code is small, or a parser with little
functionality that makes my production code large?  Of course you can
sometimes have the best of both worlds.
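For readers who want to see the recursive descent spelled out, here is a small self-contained sketch. It is not the XMLWriter from the zip file: XmlNode is a cut-down, hypothetical stand-in for the Element interface (attributes and comments are left out), and SimpleNode and RecursiveXmlWriter are names invented for the example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Cut-down, hypothetical stand-in for the Element interface in the post.
interface XmlNode {
    String getName();
    String getContent();      // may return null
    XmlNode[] getChildren();  // may return null
}

// Trivial XmlNode implementation used only for illustration.
final class SimpleNode implements XmlNode {
    private final String name;
    private final String content;
    private final XmlNode[] children;

    SimpleNode(String name, String content, XmlNode... children) {
        this.name = name;
        this.content = content;
        this.children = children.length == 0 ? null : children;
    }
    public String getName() { return name; }
    public String getContent() { return content; }
    public XmlNode[] getChildren() { return children; }
}

final class RecursiveXmlWriter {
    // Writes node and all of its descendants, indenting each level by padding.
    // If a node has children, its own content is ignored in this sketch.
    static void write(Writer out, XmlNode node, String padding, int depth)
            throws IOException {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append(padding);
        }
        out.write(indent + "<" + node.getName() + ">");
        XmlNode[] children = node.getChildren();
        if (children != null && children.length > 0) {
            out.write("\n");
            for (XmlNode child : children) {
                write(out, child, padding, depth + 1);
            }
            out.write(indent.toString());
        } else if (node.getContent() != null) {
            out.write(escape(node.getContent()));
        }
        out.write("</" + node.getName() + ">\n");
    }

    // Escapes the characters that cannot appear literally in element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String toXml(XmlNode root, String padding) {
        StringWriter sw = new StringWriter();
        try {
            write(sw, root, padding, 0);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return sw.toString();
    }
}
```

Calling RecursiveXmlWriter.toXml(root, "  ") on a root with two leaf children yields one line per element, with the children indented by two spaces.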

Tyler





From papresco at technologist.com  Fri Feb  6 21:55:48 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:06 2004
Subject: Meta data for XML editors
References: <5060200010923033000002L032*@MHS>
Message-ID: <34DB4F51.EEC875D1@technologist.com>

Trevor Turton wrote:
> 
> Can we try to pre-empt this problem before it hits us by debating and
> proposing a standard format for compose-time meta data?

Such a standard should build upon RDF or XML-Data, so I would propose
that it is best to wait until those are a little more "real."

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco




From donpark at quake.net  Sat Feb  7 02:30:56 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:06 2004
Subject: Last minute request for BASE64 section support in XML 1.0
Message-ID: <000401bd336f$c33133d0$2ee044c6@donpark>

It looks like XML is about to be approved as a standard by W3C.  Could we
please have BASE64 sections as a part of the XML 1.0 standard?  Everyone who
supports this idea, please reply to this message (short replies please to
avoid LISTRIVIA).

[to be inserted somewhere between 2.7 and 2.8 of XML spec]

Using BASE64 sections

<![BASE64[
R0lGODlhdQAgAPcAAP//////zP//mf//Zv//M///AP/M///MzP/Mmf/MZv/MM//MAP+Z//+Z
zP+Zmf+ZZv+ZM/+ZAP9m//9mzP9mmf9mZv9mM/9mAP8z//8zzP8zmf8zZv8zM/8zAP8A//8A
]]>

I am not sure if this conflicts badly with SGML but, if not, it could be
immensely helpful for developers.
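To make the payload side concrete, the sketch below produces and reads base64 text wrapped at 72 characters per line, like the sample above. It uses java.util.Base64, which only exists in Java 8 and later, so it is an illustration with today's library rather than something that could have been written at the time. The <![BASE64[ wrapper itself is only the proposal in this message and is not part of XML, so the code deals with the encoded text alone; the class name Base64Payload is made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Payload {
    // Encodes bytes as base64 text, wrapped at 72 characters per line.
    static String encode(byte[] data) {
        Base64.Encoder encoder =
                Base64.getMimeEncoder(72, "\n".getBytes(StandardCharsets.US_ASCII));
        return encoder.encodeToString(data);
    }

    // Decodes base64 text; the MIME decoder ignores the line breaks.
    static byte[] decode(String text) {
        return Base64.getMimeDecoder().decode(text);
    }
}
```

The sample in the message begins with "R0lGODlh", which is the base64 form of the six ASCII bytes "GIF89a" that open a GIF file.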

Regarding SAX, we could have a callback for binary data similar to the
characters() callback.  As far as W3C DOM is concerned, we will need
another Node type (BINARY).

Sincerely,

Don Park
http://www.quake.net/~donpark/index.html




From jjc at jclark.com  Sat Feb  7 04:14:58 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:06 2004
Subject: Last minute request for BASE64 section support in XML 1.0
References: <000401bd336f$c33133d0$2ee044c6@donpark>
Message-ID: <34DBDF21.AE382FB4@jclark.com>

Don Park wrote:
> 
> It looks like XML is about to be approved as standard by W3C.  Could we
> please have BASE64 sections as a part of XML standard 1.0?

There is no chance of such a change being made at this stage.

James




From thillai at ix.netcom.com  Sat Feb  7 04:28:49 1998
From: thillai at ix.netcom.com (Thillai)
Date: Mon Jun  7 17:00:06 2004
Subject: DOM for XML
Message-ID: <01BD3355.6B9B0480@nbw-nj9-40.ix.netcom.com>

Is there any DOM implementation for XML?

Thillai




From donpark at quake.net  Sat Feb  7 08:16:43 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:06 2004
Subject: Last minute request for BASE64 section support in XML 1.0
Message-ID: <000901bd33a0$10e44060$2ee044c6@donpark>

James,

>> It looks like XML is about to be approved as standard by W3C. Could we
>> please have BASE64 sections as a part of XML standard 1.0?
>
>There is no chance of such a change being made at this stage.

Could you please elaborate on why there is no chance of the BASE64 section
proposal being accepted?  As far as I know, XML has not been approved.
Whether or not W3C has already written out the approval announcement, the
fact is that it is not approved yet.  If your assessment of the probability
is based on the lack of time for the XML-WG to consider such a proposal, I must
beg to differ with you.  As far as I know, the WG serves the community and,
while its activity must be constrained by the schedule of its members, the
need of the community must be met if the need is worthy enough.

I am not sure if support for embedded binary data has been brought up in the
WG but I am, frankly, very disappointed with the lack of support.  CDATA is
awfully inadequate.  The Open Trading Protocol (OTP) proposal needs to
embed signatures within OTP documents and it uses <![CDATA[ for embedding.
For the occasional occurrence of ]]>, OTP states:

"Any CDATA end sequences ("]]>") within the data are replaced by
"]]]]><![CDATA[>" in order to escape
the CDATA end sequence"
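The escaping rule quoted above amounts to closing the CDATA section in the middle of the end sequence and immediately opening a new one. A minimal sketch follows; the class and method names are invented for illustration and do not come from the OTP documents.

```java
final class CData {
    // Wraps data in a CDATA section, splitting any "]]>" inside the data
    // across two adjacent sections so it cannot end the section early.
    static String wrap(String data) {
        return "<![CDATA[" + data.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }
}
```

For the input a]]>b this produces two adjacent sections: the first holds "a]]" and the second holds ">b", which a parser reports back as the original text.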

Am I the only one who thinks this is pure madness for a yet-to-be-approved
standard that proposes to be the next generation data formatting language?
If I did not bring up the subject before, it is because I 'trusted' the WG
to find a solution to a problem which I assumed was too significant to
ignore.  Perhaps the problems I am concerned about are not of major concern
for the WG members.  Is it only me who worries about the problem of handling
endless data whose existence is defined only by its movement?  Is TV
broadcast not a document?

Whether or not my proposal is accepted, I would like to know if
others feel that better support for binary data is needed.  And I
would like to ask the members of the WG to consider the issue and the need
some of us have.  A delay of one to two weeks is, IMHO, well worth it.

Sincerely,

Don Park
http://www.quake.net/~donpark/index.html




From jjc at jclark.com  Sat Feb  7 08:43:37 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:06 2004
Subject: Last minute request for BASE64 section support in XML 1.0
References: <000901bd33a0$10e44060$2ee044c6@donpark>
Message-ID: <34DC1DEC.A20A1993@jclark.com>

Don Park wrote:

> >> It looks like XML is about to be approved as standard by W3C. Could we
> >> please have BASE64 sections as a part of XML standard 1.0?
> >
> >There is no chance of such a change being made at this stage.
> 
> Could you please elaborate on why there is no chance of BASE64 section
> proposal being accepted?  As far as I know, XML has not been approved.

XML 1.0 is already a Proposed Recommendation. The W3C process does not
allow major new features to be added between Proposed Recommendation and
Recommendation.

James




From ak117 at freenet.carleton.ca  Sat Feb  7 12:17:07 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:06 2004
Subject: DOM for XML
In-Reply-To: <01BD3355.6B9B0480@nbw-nj9-40.ix.netcom.com>
References: <01BD3355.6B9B0480@nbw-nj9-40.ix.netcom.com>
Message-ID: <199802071217.HAA00568@unready.microstar.com>

Thillai writes:

 > Is there any DOM implementation for XML?

The DOM isn't finished, so any implementation is necessarily
tentative.  With that warning, however, you can look at

  http://www.quake.net/~donpark/saxdom.html

The nice thing about Don's work is that SAXDOM will run with any
SAX-conformant Java XML parser, so you can use NXP, Lark, MSXML,
AElfred, and/or XP, as you wish.  Don also includes some information
about integrating the DOM with the new, standard Java Swing widgets.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From tbray at textuality.com  Sat Feb  7 15:34:17 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:00:06 2004
Subject: Last minute request for BASE64 section support in XML 1.0
Message-ID: <3.0.32.19980207072801.00ab7864@pop.intergate.bc.ca>

At 12:11 AM 07/02/98 -0800, Don Park wrote:
>Could you please elaborate on why there is no chance of BASE64 section
>proposal being accepted?

Simply put, we create standards following a set of formal rules, which is
strictly necessary if you are to have any hope of getting Netscape, Microsoft,
Sun, et al, to go into a room and come out with a real result.  The rules
do not, at this point in time, leave room for the introduction of major
new features.

Having said that, I think that it would be a good idea for someone to
write up a proposal for the use of a reserved attribute or namespace to
signal, as a convention in XML 1.0, that the contents of an element are
base64 encoded.  This could be destined for XML 1.1 or perhaps serve
as a standalone recommendation layered on top of XML.

Nobody disputes that there is a real need in this area. -Tim




From papresco at technologist.com  Sat Feb  7 15:35:49 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun  7 17:00:06 2004
Subject: Last minute request for BASE64 section support in XML 1.0
References: <000901bd33a0$10e44060$2ee044c6@donpark>
Message-ID: <34DC7FA0.481781EB@technologist.com>

Don Park wrote:
> 
> James,
> Could you please elaborate on why there is no chance of BASE64 section
> proposal being accepted?  As far as I know, XML has not been approved.
> Whether or not W3C has already written out the approval announcement, the
> fact is that it is not approved yet.  If your assessment of the probability
> is based on the lack of time for the XML-WG to consider such a proposal, I must
> beg to differ with you.  As far as I know, the WG serves the community and,
> while its activity must be constrained by the schedule of its members, the
> need of the community must be met if the need is worthy enough.

The W3C only serves the needs of the community indirectly through
serving the needs of its members. The community has no official standing
in the process.

In this particular case, if we held up XML 1.0 for everything someone
considered important, it would never ship. That's what happened to HTML
3.0. I'm not denying that your complaint is important -- XML has many
large flaws. That's life in the standards process.

 Paul Prescod
--
http://itrc.uwaterloo.ca/~papresco




From Jon.Bosak at eng.Sun.COM  Sat Feb  7 16:45:38 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun  7 17:00:06 2004
Subject: Last minute request for BASE64 section support in XML 1.0
In-Reply-To: <000401bd336f$c33133d0$2ee044c6@donpark>
Message-ID: <199802071643.IAA21218@boethius.eng.sun.com>

[Don Park:]

| It looks like XML is about to be approved as standard by W3C.  Could
| we please have BASE64 sections as a part of XML standard 1.0?
| Everyone who supports this idea, please reply to this message (short
| replies please to avoid LISTRIVIA).

Don't bother.  Under W3C procedure, XML 1.0 has been substantively
frozen since the Proposed Recommendation went out for member balloting
on December 8.  Substantive changes will have to wait for XML 1.1.

Jon




From tbray at textuality.com  Sat Feb  7 19:33:05 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:00:06 2004
Subject: file URLs again
Message-ID: <3.0.32.19980207113030.00ab0c90@pop.intergate.bc.ca>

Hi, I've been getting a bit behind... did this group in its
collective wisdom come up with a snippet of Java that makes
a really good and sincere effort to open a URL that looks
like "spec.dtd" and works reliably on MS & other OS's, with
more than one JVM?  I seem to recall seeing one go by, but
can't find it. -Tim




From donpark at quake.net  Sat Feb  7 22:10:06 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:06 2004
Subject: file URLs again
Message-ID: <000d01bd3414$7c720780$2ee044c6@donpark>

Tim,

Try this:

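import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Builds a file: URL from a (possibly relative) file name;
// returns null if the URL cannot be formed.
public URL createFileURL (String fileName) {
    File file = new File(fileName);
    try {
        String path = file.getAbsolutePath();
        char sep = File.separatorChar;
        // Normalize the platform separator (e.g. '\\' on Windows) to '/'.
        if (sep != '/')
            path = path.replace(sep, '/');
        // Paths that already start with '/' need one slash fewer.
        if (path.charAt(0) == '/')
            path = "file://" + path;
        else
            path = "file:///" + path;
        return new URL(path);
    }
    catch (MalformedURLException e) {
        return null;
    }
}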

I wish File.getCanonicalPath() could have been used instead of
getAbsolutePath() but it throws an exception if the file does not exist.  If
that is the behavior you want, replace getAbsolutePath() with
getCanonicalPath().

I have used File.separatorChar instead of File.separator or even
getProperty("file.separator") because I don't know of any system that has
multicharacter separators.  It will be a lot more messy if you want to
handle that case as well.
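On later Java versions (1.4 and up, so not available when this was posted), File.toURI() does the separator and leading-slash handling itself. The following is a sketch of that alternative, not a replacement endorsed by anyone on this list; the class name FileUrls is made up, and like the method above it returns null when no URL can be formed.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

final class FileUrls {
    // Turns a (possibly relative) file name into an absolute file: URL.
    static URL createFileURL(String fileName) {
        try {
            // toURI() converts platform separators and adds the leading slash.
            return new File(fileName).getAbsoluteFile().toURI().toURL();
        } catch (MalformedURLException e) {
            return null;
        }
    }
}
```

The file does not have to exist: the name is resolved against the current working directory without touching the file system contents.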

Hope this helps,

Don Park
http://www.quake.net/~donpark/index.html




From ak117 at freenet.carleton.ca  Sun Feb  8 12:38:59 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:06 2004
Subject: file URLs again
In-Reply-To: <3.0.32.19980207113030.00ab0c90@pop.intergate.bc.ca>
References: <3.0.32.19980207113030.00ab0c90@pop.intergate.bc.ca>
Message-ID: <199802072149.QAA00313@unready.microstar.com>

Tim Bray writes:

 > Hi, I've been getting a bit behind... did this group in its
 > collective wisdom come up with a snippet of Java that makes
 > a really good and sincere effort to open a URL that looks
 > like "spec.dtd" and works reliably on MS & other OS's, with
 > more than one JVM?  I seem to recall seeing one go by, but
 > can't find it. -Tim

This one's from the latest SAXDemo.java, incorporating modifications
suggested by James Clark:

  /**
    * If a URL is relative, make it absolute against the current directory.
    */
  private static String makeAbsoluteURL (String url)
    throws java.net.MalformedURLException
  {
    URL baseURL;

    String currentDirectory = System.getProperty("user.dir");
    String fileSep = System.getProperty("file.separator");
    String file = currentDirectory.replace(fileSep.charAt(0), '/') + '/';

    if (file.charAt(0) != '/') {
      file = "/" + file;
    }
    baseURL = new URL("file", null, file);

    return new URL(baseURL, url).toString();
  }


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From digitome at iol.ie  Sun Feb  8 14:44:36 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:00:06 2004
Subject: GEDCOM - A Killer XML Application?
Message-ID: <199802081444.OAA27250@mail.iol.ie>

I have been wandering the Web searching for my wife's relatives
(surname Kilcawley. Know anyone?) and have learned very quickly that there
is a *huge* amount of genealogy stuff/activity on the Web.

Most of it revolves around a genealogy file format called GEDCOM
that apparently originated with the Church of the Latter Day Saints.

This is a snippet of GEDCOM:

 1 NAME Archibald  /BARD_(Beard)/
 1 SEX M
 1 BIRT
 2 DATE SEENOTES
 2 PLAC Antrim,Ireland
 1 DEAT
 2 DATE    FEB 1765

Sure looks like a cool XML application to me! I mean, the whole
point of these GEDCOM files is publishing/interchange of
genealogy data. Richly structured hierarchies. Oodles of
scope to show off spiffy XLL linking, spiffy XSL rendering,
intelligent search agents. The whole nine yards.
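To show how directly the level numbers map onto nesting, here is a small sketch that turns lines like the ones above into nested elements. It is a toy under stated assumptions: each GEDCOM tag is used unchanged as the element name, the value becomes the element's text, and cross-references, continuation lines and the level-0 records are ignored. The output is a fragment without a root element, and the class name GedcomToXml is made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class GedcomToXml {
    // Converts "level TAG value" lines into nested XML elements.
    static String convert(String gedcom) {
        StringBuilder out = new StringBuilder();
        Deque<String> openTags = new ArrayDeque<>();     // currently open elements
        Deque<Integer> openLevels = new ArrayDeque<>();  // their GEDCOM levels
        for (String raw : gedcom.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+", 3); // level, tag, optional value
            int level = Integer.parseInt(parts[0]);
            String tag = parts[1];
            String value = parts.length > 2 ? parts[2] : "";
            // Close every open element at the same or a deeper level.
            while (!openLevels.isEmpty() && openLevels.peek() >= level) {
                openLevels.pop();
                out.append("</").append(openTags.pop()).append(">");
            }
            out.append("<").append(tag).append(">").append(escape(value));
            openTags.push(tag);
            openLevels.push(level);
        }
        // Close whatever is still open at the end of the input.
        while (!openTags.isEmpty()) {
            out.append("</").append(openTags.pop()).append(">");
        }
        return out.toString();
    }

    // Escapes the characters that cannot appear literally in element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With this mapping the BIRT line becomes a BIRT element that contains DATE and PLAC children, which is the kind of hierarchy XLL and XSL could then work on.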

Has anyone looked into this? If not, anyone interested
in helping to get a ball rolling?




From smith at interlog.com  Mon Feb  9 05:10:17 1998
From: smith at interlog.com (Chris Smith)
Date: Mon Jun  7 17:00:07 2004
Subject: Open Trading Protocol (CDATA, etc)(was BASE64 section support)
In-Reply-To: <000901bd33a0$10e44060$2ee044c6@donpark>
Message-ID: <Pine.BSI.3.95.980208234845.4946C-100000@shell1.interlog.com>

On Sat, 7 Feb 1998, Don Park wrote:

> I am not sure if support for embedded binary data has been brought up in the
> WG but I am, frankly, very disappointed with the lack of support.  CDATA is
> awfully inadequate.  The Open Trading Protocol (OTP) proposal has a need to
> embed signatures within OTP documents and it uses <![CDATA[ for embedding.
> For the occasional occurrence of ]]>, OTP states:
> 
> "Any CDATA end sequences ("]]>") within the data are replaced by
> "]]]]><![CDATA[>" in order to escape
> the CDATA end sequence"

I think you are lifting this a little out of context. The item you
referenced is from the specification on canonicalization. As well, it
was one of several design choices going into OTPv0.9, which are likely
to be the subject of cooler heads. More to the point, that item refers
to *all* data in elements.

A more relevant area is what the Open Trading Protocol does NOT
handle. The best example here is order description (often known as
Invoice). We felt that we could never handle all needs, and we needed
to allow for both simple and complex solutions, and both current and
future solutions.

As a result, the element content is ANY, while we have a ContentFormat
attribute that lets you indicate the following: XML, PCDATA, BASE64,
HTML, MIME, plus a user-defined option. (There is a remaining
discussion topic re splitting into ContentFormat and ContentEncoding,
which I hope is actually accomplished.)

This, I think, is a reasonable compromise. Although it does not lock
down the protocol completely, making implementation more difficult, it
allows for XML/EDI, simple plain text, and HTML browser displayed text
or graphic offers (yes - you could essentially have an invoice that
contained a picture of the item you are purchasing).

(for more details, http://www.otp.org )

In our terms, all a <![BASE64[ construct would allow is the possible
removal of a ContentEncoding layer.

---------------------------------------------------------------------------
 Chris Smith                                          <smith@interlog.com>




From donpark at quake.net  Mon Feb  9 06:32:04 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:07 2004
Subject: Encoded XML Content -- was Re: Open Trading Protocol (CDATA, etc)(was BASE64 section support)
Message-ID: <001201bd3523$ba543450$2ee044c6@donpark>

Chris,

From the responses I have gotten from some of the members of the XML-WG, it
is clear that we can't add a BASE64 section to the spec.  As you pointed out,
a BASE64 section is not helpful enough for XML applications.  Tim suggested
that we write up a proposal for the use of a reserved attribute or namespace
to signal, as a convention in XML 1.0, that the contents of an element are
base64 encoded.  Such a proposal would serve the need right now and could be
adopted by XML 1.1 in the future.

I would like to form a small team to write the proposal.  Since we are
dealing with a focused subject, I would like to fast-track this proposal.

Let me get the ball rolling with the following brief summary of the proposal:

1. Name

Names are important since they serve as mental hooks to hang knowledge.  The
choices I can think of are:

a)    XML-Binary
b)    XML-Blob
c)    Encoded XML Content

I would like to use a short easily understandable name like XML-Binary so
that vendors can say their product supports XML-Binary.

2. Mechanism

I tend to prefer the use of reserved attribute(s) over a namespace.  I would
very much like to see something like the xml:space attribute used.

For the kind of applications I am familiar with, adding the following two
special attributes would be enough:

xml:encoding="base64"
xml:mimetype="image/gif"

Should we limit it to base64 and just have xml:encoded attribute with true
and default as possible values?

Should we be using some standard names for the encodings?  Frankly I am not
aware of any such standard (duh!).

Do we need xml:mimetype?  My application sure could use it since I can
fire up a content handler based on the mimetype and pass it the decoded data.
The content handler returns a component which is inserted into the tree to
display the content.
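
A minimal sketch of how an application might consume the two proposed
attributes, written in Python purely for illustration.  The attribute names
xml:encoding and xml:mimetype come from the proposal above and are not part
of XML 1.0; the element name "logo" and the helper decode_element are
hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace URI.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def decode_element(elem):
    """Return (mimetype, payload bytes) for an element using the convention."""
    encoding = elem.get(XML_NS + "encoding")
    mimetype = elem.get(XML_NS + "mimetype")
    text = elem.text or ""
    if encoding == "base64":
        # Remove line breaks and indentation before decoding.
        return mimetype, base64.b64decode("".join(text.split()))
    # No encoding attribute: treat the content as ordinary text.
    return mimetype, text.encode("utf-8")


doc = ET.fromstring(
    '<logo xml:encoding="base64" xml:mimetype="image/gif">R0lGODlh</logo>'
)
mimetype, payload = decode_element(doc)
```

The mimetype value is what a content handler would be dispatched on, and the
payload is what it would be handed.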

This should be enough to get the discussion going.

Sincerely,

Don Park




From M.H.Kay at eng.icl.co.uk  Mon Feb  9 10:14:52 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:07 2004
Subject: GEDCOM - A Killer XML Application?
Message-ID: <01bd3543$743bd0c0$1e09e391@mhklaptop.bra01.icl.co.uk>


>I ... have learned very quickly that there
>is a *huge* amount of genealogy stuff/activity on the Web.
>
>Most of it revolves around a genealogy file format called GEDCOM
>that apparently originated with the Church of the Latter Day Saints.
>
Yes, I've done some work on this, and have been hoping to go public, but
it's come to a bit of a standstill while other activities more important to
my employers have taken over.

I agree with you that an XML encoding of GEDCOM (let's call it GedML?)
offers great potential benefits:

- solving GEDCOM's problems with character sets and binary objects
- allowing "rich text" in the textual fields
- providing a mechanism for cross-file linkage
- making it much easier to write GEDCOM applications
- allowing GEDCOM data to be published directly on the web, rather than
  being reformatted for publication on the web
- allowing web search engines to index GEDCOM files intelligently

I've got as far as:
- writing a few notes on the design principles / rationale
- writing GEDCOM to GedML converters in both directions
- working out in principle how to enhance these to do ANSEL to UNICODE
  conversion
- writing a DTD for GedML
- writing an MSXML application that creates a (partial) Java representation
  of the GEDCOM object model for use by applications.

Since I'm stalled, any cooperation will be much appreciated!

regards,
Mike Kay




From alex.webb at staempfli.com  Mon Feb  9 10:42:11 1998
From: alex.webb at staempfli.com (Webb Alex)
Date: Mon Jun  7 17:00:07 2004
Subject: GEDCOM - A Killer XML Application?
Message-ID: <D75C396FE59FD111A1F40060B03C29C010E1@mailserver.allmedia.ch>

A very interesting brochure "The Gedcom Standard Release 5.5" is
available from

http://www.tiac.net/users/pmcbride/gedcom/55gctoc.htm

This details the philosophy and current (?) standard.

Does anyone have an alternative genealogy dtd ???

Alex Webb



From mtbryan at sgml.u-net.com  Mon Feb  9 11:05:21 1998
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun  7 17:00:07 2004
Subject: GEDCOM - A Killer XML Application?
Message-ID: <01bd3544$bdb5d2e0$LocalHost@sgml.u-net.com>

Sean


>I have been wandering the Web searching for my Wife's relatives
>(surname Kilcawley. Know anyone?) and have learned very quickly that there
>is a *huge* amount of genealogy stuff/activity on the Web.
>
>Most of it revolves around a genealogy file format called GEDCOM
>that apparently originated with the Church of the Latter Day Saints.
>
>This is a snippet of Gedcom:
>
> 1 NAME Archibald  /BARD_(Beard)/
> 1 SEX M
> 1 BIRT
> 2 DATE SEENOTES
> 2 PLAC Antrim,Ireland
> 1 DEAT
> 2 DATE    FEB 1765
>
>Sure looks like a cool XML application to me!

Defining an XML DTD for it is easy, but what is really interesting is how
you could use the data already out there in this format within XML
applications without having to recode it all.

Unfortunately the XML-Data proposal does not seem to provide sufficient
tools for mapping the existing schema to an XML equivalent without invoking
a specialist script. It would be nice if there were some generalized
mechanisms for doing this.
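
To make the "specialist script" concrete, here is a minimal sketch in Python
of one way the level-numbered lines in the snippet above could be turned into
nested XML elements.  It assumes each line has the form "level TAG value",
ignores GEDCOM cross-reference IDs and continuation lines, and uses a
hypothetical root element name, GED; it is not a complete GEDCOM reader.

```python
import xml.etree.ElementTree as ET


def gedcom_to_xml(lines, root_tag="GED"):
    """Nest GEDCOM lines into XML elements according to their level numbers."""
    root = ET.Element(root_tag)
    # Stack of (level, element); the root sits below every real level.
    stack = [(-1, root)]
    for raw in lines:
        parts = raw.strip().split(None, 2)
        if len(parts) < 2:
            continue  # blank or malformed line
        level, tag = int(parts[0]), parts[1]
        value = parts[2].strip() if len(parts) > 2 else ""
        # Pop back to the nearest enclosing line with a lower level.
        while stack[-1][0] >= level:
            stack.pop()
        elem = ET.SubElement(stack[-1][1], tag)
        if value:
            elem.text = value
        stack.append((level, elem))
    return root


SNIPPET = """\
1 NAME Archibald  /BARD_(Beard)/
1 SEX M
1 BIRT
2 DATE SEENOTES
2 PLAC Antrim,Ireland
1 DEAT
2 DATE    FEB 1765
"""

person = gedcom_to_xml(SNIPPET.splitlines())
xml_text = ET.tostring(person, encoding="unicode")
```

The GEDCOM tags are reused directly as element names, which only works while
every tag happens to be a legal XML name.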

> I mean,the whole
>point of these GEDCOM files is publishing/interchange of
>genealogy data. Richly structured hierarchies. Oodles of
>scope to show off spiffy XLL linking, spiffy XSL rendering,
>intelligent search agents. The whole nine yards.
>
>Has anyone looked into this? If not, anyone interested
>in helping to get a ball rolling?

I am currently exploring how we could do a mapping between three file
formats, XML, CSV and GEDCOM, to provide an integrated set of resources for
tracing genealogical information through a HyTime-encoded Topic Navigation
Map. This goes a bit beyond what you are suggesting, but may be more
practical in the longer run.

Martin Bryan




From ht at cogsci.ed.ac.uk  Mon Feb  9 12:45:25 1998
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:00:07 2004
Subject: XML-Data Questions
In-Reply-To: "Don Park"'s message of Fri, 30 Jan 1998 16:11:53 -0800
References: <000001bd2ddc$ec7641b0$2ee044c6@donpark>
Message-ID: <f5bd8gx7xrm.fsf@cogsci.ed.ac.uk>

"Don Park" <donpark@quake.net> wrote on 30 Jan (sorry for late reply):

> I have some questions about the XML-Data spec which affects implementation:
> 
> 1. How are the schemas referenced from XML documents?

Not clear.  Given the NON-official status of XML-Data, a PI seems the
most likely route for now.

> 2. How does one validate XML documents which use XML-Data schema rather than
> DTD?

One doesn't :-)  See previous discussion on this list about validation
-- 'valid' is a predicate over document instances and doctypes AS
SPECIFIED IN THE XML SPECIFICATION.  The following extract from my
SGML97 paper (cf. http://www.ltg.ed.ac.uk/~ht/B9H.html) is relevant:

"In our approach, we envisage

 a) the schema DTD, a definition of an XML representation of document
 structure, that is, an old-style DTD for schemata;

 b) a master XML application, the equivalent of the XML parser, which
 is capable of processing pairs of XML documents, where the first, a
 schema, is valid in terms of the schema DTD; the second, an instance,
 has no old-style DTD, but is both well-formed in the XML sense and
 meta-valid in terms of the schema expressed by the first.

 Meta-validity is, of course, [conformance] to the document
 structure constraints contained in the associated schema, which
 [itself is valid per] the schema DTD."

> 3. Current XML-Data does not allow, or rather does not make it easy, for enumerated
> attribute values to contain spaces because space is used as a delimiter.
> 
> Why not use the following structure to define enumerated attribute values?
> 
> <elementType id="Book">
>   <attribute name="ageGrp" type="ENUMERATION">
>     <value>children</value>
>     <value>adult</value>
>     <default>adult</default>
>   </attribute>
> </elementType>

Um, the Enumeration declared value for attributes must consist of
Nmtokens (production 59 in the Proposed Recommendation) so the issue
doesn't arise.  Support for enumerated notation values isn't in
XML-Data yet (if I remember right) but the same constraint obtains there.
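
As a rough illustration of why the issue doesn't arise, the following Python
sketch tests whether a candidate enumeration value is a name token.  The
pattern is a simplified, ASCII-only approximation of the Nmtoken production
(the real one also admits many non-ASCII letters, combining characters and
extenders), and is_nmtoken is a hypothetical helper name.

```python
import re

# ASCII-only approximation: letters, digits, '.', '-', '_' and ':'.
NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")


def is_nmtoken(value):
    """True if value consists solely of (ASCII) name characters."""
    return NMTOKEN.fullmatch(value) is not None


single_word_ok = is_nmtoken("adult")
with_space_ok = is_nmtoken("young adult")
```

A value containing a space fails the check, so it could never be declared as
an enumerated attribute value in the first place.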

Hope this helps.

ht
-- 
Henry S. Thompson, Human Communication Research Centre, University of Edinburgh
      2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
               Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk  
                      URL: http://www.cogsci.ed.ac.uk/~ht/



From gmckenzi at JetForm.com  Mon Feb  9 14:18:56 1998
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:00:07 2004
Subject: Encoded XML Content -- was Re: Open Trading Protocol (CDATA, etc)(was BASE64 section support)
Message-ID: <c=CA%a=_%p=JetForm%l=ROSSINI-980209141400Z-1156@rossini.jetform.com>


Don,

Here are my suggestions...

Methinks xml:encoding is too close to the XML PI encoding for character
set encodings. I wish that the XML PI encoding had been called
text-encoding or char-encoding -- this would have made it easier to come
up with other 'encoding' attributes without ambiguity. *sigh*

How about:

1. xml:transfer-encoding
2. xml:content-encoding

Suggestion #1 is a little ugly because it has the word 'transfer', but
this is closer to the MIME heritage where base64 is primarily used for
packaging style encoding, as opposed to locale char-set encoding. 

Suggestion #2 may seem redundant but at least doesn't conflict directly
with 'encoding' in the context of locale char-set encoding.  

As for the mimetype attribute...I'd vote for something closer to IOTP,
such as:

 xml:content-format

where content-format can be one of: 
- a mimetype that identifies the content format, e.g. "image/jpeg"
- a user-defined code of the form "x-ddd:nnn", where ddd is a domain and
nnn is an arbitrary name for the format e.g. "x-jetform:mdf"  
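
A small Python sketch of how a consumer might tell the two forms apart.  The
patterns are deliberately loose assumptions, not taken from IOTP or from any
MIME specification, and classify_content_format is a hypothetical name.

```python
import re

# Loose approximations of the two forms described above.
MIME_TYPE = re.compile(r"[A-Za-z0-9!#$&^_.+-]+/[A-Za-z0-9!#$&^_.+-]+")
USER_DEFINED = re.compile(r"x-([A-Za-z0-9]+):([A-Za-z0-9_.-]+)")


def classify_content_format(value):
    """Classify a content-format value as user-defined, mimetype or unknown."""
    match = USER_DEFINED.fullmatch(value)
    if match:
        return ("user-defined", match.group(1), match.group(2))
    if MIME_TYPE.fullmatch(value):
        return ("mimetype", value)
    return ("unknown", value)


jpeg = classify_content_format("image/jpeg")
mdf = classify_content_format("x-jetform:mdf")
```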

However, IOTP includes other acceptable values for content-format such
as 'PCDATA' and 'XML'.  I view this as duplication and believe that only
the two options above are necessary; i.e. XML content should be able to
be expressed as 'text/xml', ignoring the fact that this isn't a *real*
mimetype.

And I assume that the implication in all of this is that somebody could
include content that contains well-formed and valid XML that happens to
be base64'd?  Hence it is necessary for the parser to unwrap such
sections, right?


Thoughts?

Gavin.




From serres-doug at usa.net  Mon Feb  9 15:22:57 1998
From: serres-doug at usa.net (Doug Serres)
Date: Mon Jun  7 17:00:07 2004
Subject: GEDCOM - A Killer XML Application?
References: <199802081444.OAA27250@mail.iol.ie>
Message-ID: <34DF1F10.846364EF@usa.net>


Sean Mc Grath wrote:

> I have been wandering the Web searching for my Wife's relatives
> (surname Kilcawley. Know anyone?) and have learned very quickly that there
> is a *huge* amount of genealogy stuff/activity on the Web.
>
> Most of it revolves around a genealogy file format called GEDCOM
> that apparently originated with the Church of the Latter Day Saints.

The Church of Jesus Christ of Latter-day Saints
(http://www.lds.org/Family_History/How_Do_I_Begin.html)

> Has anyone looked into this? If not, anyone interested
> in helping to get a ball rolling?

I'd be interested in this one too!

--Doug




From deke at tallent.com  Mon Feb  9 15:54:42 1998
From: deke at tallent.com (Deke Smith)
Date: Mon Jun  7 17:00:07 2004
Subject: GEDCOM - A Killer XML Application?
Message-ID: <1325104436-519141701@tallent.com>

Martin Bryan, mtbryan@sgml.u-net.com said on 2/9/98 4:23 AM:

>Defining an XML DTD for it is easy, but what is really interesting is how
>you could use the data already out there in this format within XML
>applications without having to recode it all.
>
>Unfortunately the XML-Data proposal does not seem to provide sufficient
>tools for mapping the existing schema to an XML equivalent without invoking
>a specialist script. It would be nice if there were some generalized
>mechanisms for doing this.

There has existed a de-facto standard for conversion of GEDCOM to HTML 
for a couple of years. Information about it can be found at: 
<http://206.139.152.113/GenMatch/genmatch.htm>.

I have used GED2HTML (http://www.gendex.com/ged2html/) and it works VERY 
well, even on large databases. 

The code is in Perl and gets you half-way there on an XML conversion.

Deke

-----------------------------------------------------------------
Deke Smith
Tallent Communications Group, Brentwood TN
deke@tallent.com, 615-661-9878
-----------------------------------------------------------------




From deke at tallent.com  Mon Feb  9 15:57:11 1998
From: deke at tallent.com (Deke Smith)
Date: Mon Jun  7 17:00:07 2004
Subject: Last minute request for BASE64 section support in XML 1.0
Message-ID: <1325104294-519150211@tallent.com>

Don Park, donpark@quake.net said on 2/6/98 8:26 PM:

>It looks like XML is about to be approved as standard by W3C.  Could we
>please have BASE64 sections as a part of XML standard 1.0?  Everyone who
>support this idea, please reply to this message (short replies please to
>avoid LISTRIVIA).

Supported. If not officially accepted it WILL be used anyhow.

Deke

-----------------------------------------------------------------
Deke Smith
Tallent Communications Group, Brentwood TN
deke@tallent.com, 615-661-9878
-----------------------------------------------------------------




From crism at ora.com  Mon Feb  9 17:08:34 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun  7 17:00:07 2004
Subject: Last minute request for BASE64 section support in XML 1.0
In-Reply-To: <1325104294-519150211@tallent.com> (message from Deke Smith on
	Mon, 9 Feb 98 09:56:39 -0600)
Message-ID: <199802091712.MAA05614@geode.ora.com>

[Deke Smith]
> Supported. If not officially accepted it WILL be used anyhow.

Then you will be using something other than XML, and good luck getting
any application to accept it.  The string '<![' can only be followed
by 'CDATA' in a document instance, and there are no ifs, ands, or buts
about it.
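
The point is easy to check against an existing parser.  The Python sketch
below uses the standard library's ElementTree (built on expat) as a stand-in
for "any application"; the element name e and the helper is_well_formed are
hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """True if the parser accepts the document, False if it rejects it."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A CDATA section is the only marked section allowed in an instance.
cdata_accepted = is_well_formed("<e><![CDATA[ ..base64data.. ]]></e>")
# The proposed BASE64 section is not XML, so the parser refuses it.
base64_accepted = is_well_formed("<e><![BASE64[ ..base64data.. ]]></e>")
```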

The use of notation names for marked sections has been proposed by
others, and may be included in a future revision of XML if the WG
concurs.  But right now, you will have to use a notation attribute.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From pazandak at OBJS.com  Mon Feb  9 18:41:50 1998
From: pazandak at OBJS.com (Paul Pazandak)
Date: Mon Jun  7 17:00:07 2004
Subject: Type-specific class generation using XML parsers
Message-ID: <34DF4EBB.7B0EAC1B@OBJS.com>

I have finished modifications to an XML parser to support type-specific
class tree generation, as opposed to generic tree objects. This means,
for example, that the parser would generate a complex object of the
form:

BOOK (using book.java)
 - CHAPTER (using chap.java)
    - SECTION (using sect.java)
etc.

for an xml document describing a book. The resulting tree is useable
immediately without further parsing or traversing of the tree, which
would be generally required if the tree was composed of generic XML
objects. The class specifications are embedded in the accompanying
DTD (which are then consumed by the parser), but could as easily be
embedded in the xml document itself.
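
The general idea can be sketched in Python as follows.  This is not the
modified parser described above: the tag-to-class table is a hard-coded
dictionary standing in for the class specifications embedded in the DTD, the
typed tree is built from a generic ElementTree rather than directly by the
parser, and all class and function names are hypothetical.

```python
import xml.etree.ElementTree as ET


class Node:
    """Fallback for elements with no type-specific class."""

    def __init__(self, elem, children):
        self.text = (elem.text or "").strip()
        self.children = children


class Book(Node):
    """Stands in for book.java."""


class Chapter(Node):
    """Stands in for chap.java."""


class Section(Node):
    """Stands in for sect.java."""


# Assumption: in the real system this mapping would come from the DTD.
CLASS_FOR_TAG = {"BOOK": Book, "CHAPTER": Chapter, "SECTION": Section}


def build(elem):
    """Recursively instantiate the class registered for each element."""
    cls = CLASS_FOR_TAG.get(elem.tag, Node)
    return cls(elem, [build(child) for child in elem])


book = build(ET.fromstring(
    "<BOOK><CHAPTER><SECTION>Intro</SECTION></CHAPTER></BOOK>"
))
```

Application code can then work with Book, Chapter and Section objects rather
than inspecting tag names on generic nodes.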

My question is what, if any, effort is there to standardize how
class-related metadata is defined within a DTD or XML specification?
I'd prefer to adopt an approach that is likely to be standardized.
In addition, what other approaches (excluding hard-coding classnames)
have been proposed to produce the same result as I have described?

Regards,

Paul.

p.s. This all came about because event-based parsing seems like quite
a pain. In addition, any changes to the XML structure can require many
changes to the event-handling code. Further, the generation of generic
tree structures is not very useful because one must traverse the tree
and basically parse (again!) the tree to generate application-specific
structures. So, why not have the correct structure be generated the
first time by the parser?

--

********************************************************************
Paul Pazandak, Ph.D                                pazandak@objs.com
Object Services and Consulting, Inc.             http://www.objs.com
Minneapolis, Minnesota 55420-5409                       612-881-6498
********************************************************************




From kent at trl.ibm.co.jp  Tue Feb 10 02:37:15 1998
From: kent at trl.ibm.co.jp (TAMURA Kent)
Date: Mon Jun  7 17:00:07 2004
Subject: IBM `XML for Java' has released.
Message-ID: <9802100236.AA46457@ns.trl.ibm.com>


  `XML for Java' is a validating XML processor written in Java.

  You can download from IBM alphaWorks:
	http://www.alphaworks.ibm.com/formula/xml
  It requires Java 1.1.

-- 
TAMURA Kent @ Tokyo Research Laboratory, IBM Japan




From zwang at pstat.ucsb.edu  Tue Feb 10 07:58:50 1998
From: zwang at pstat.ucsb.edu (Zheng Wang)
Date: Mon Jun  7 17:00:07 2004
Subject: IBM `XML for Java' has released.
In-Reply-To: <9802100236.AA46457@ns.trl.ibm.com>
Message-ID: <Pine.GSO.3.95.980209234946.11652B-100000@fisher>

Thanks Tamura,
It does work with jdk1.1.5 or even jdk1.1.3. Since you did not give
the complete source code, I cannot figure out why it does not
work with jdk1.2beta2.


Zheng Wang
Department of Statistics and Applied Probability 
University of California, Santa Barbara
E-mail: zwang@pstat.ucsb.edu; http://www.pstat.ucsb.edu/~zwang


On Tue, 10 Feb 1998, TAMURA Kent wrote:

> 
>   `XML for Java' is a validating XML processor written in Java.
> 
>   You can download from IBM alphaWorks:
> 	http://www.alphaworks.ibm.com/formula/xml
>   It requires Java 1.1.
> 
> -- 
> TAMURA Kent @ Tokyo Research Laboratory, IBM Japan


From donpark at quake.net  Tue Feb 10 09:51:13 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:07 2004
Subject: Online SAXDOM Demo Available
Message-ID: <000201bd3608$c0c7f4d0$2ee044c6@donpark>

I have just uploaded a browser-based (currently limited to Internet Explorer
4.0) demo of SAXDOM being used from JavaScript.  Although the demo is
somewhat sluggish due to Java/JavaScript synchronization problems, it shows
DOM being used by a scripting language, just as it was designed to be.
Exciting!

You can find the demo at:

http://www.quake.net/~donpark/SaxDomDemo/SaxDomDemo.html

Have fun,

Don Park
http://www.quake.net/~donpark/index.html




From cecile.baille-pierre at bull.net  Tue Feb 10 15:31:36 1998
From: cecile.baille-pierre at bull.net (BAILLE-PIERRE Cécile)
Date: Mon Jun  7 17:00:07 2004
Subject: Object Hierarchie with XML
Message-ID: <01BD3640.DFB57560@belledonne.frcl.bull.fr>

As I have only just begun looking at the XML specifications, my question may be nonsense (in which case I promise this question will be my first and last!).
As far as I understand, an XML document has a tree-like structure, which is perfect for reflecting composition/aggregation of entities ("my book is composed of: a title, an author, one or more paragraphs, etc."), where child elements represent parts of the element currently being defined.
But how does one simply implement a class hierarchy, i.e. "element E is derived from super-element S and inherits its attributes and properties"?

Cécile.



From jjc at jclark.com  Tue Feb 10 15:32:18 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:07 2004
Subject: XML resources updated
Message-ID: <34E07251.7D36673C@jclark.com>

I've updated my XML parsers and test suite to match the final XML
recommendation.  See http://www.jclark.com/xml for more information. The
biggest change is that I've enhanced my XML implementation in C to
include a general purpose, non-validating XML parser layered on top of
the tokenizer.

James



From dgd at cs.bu.edu  Tue Feb 10 16:12:15 1998
From: dgd at cs.bu.edu (David G. Durand)
Date: Mon Jun  7 17:00:07 2004
Subject: Last minute request for BASE64 section support in XML 1.0
Message-ID: <v03007801b106228a3851@[205.181.196.4]>

	From: Deke Smith <deke@tallent.com>

	Don Park, donpark@quake.net said on 2/6/98 8:26 PM:

	>It looks like XML is about to be approved as standard by W3C.  Could we
	>please have BASE64 sections as a part of XML standard 1.0?  Everyone who
	>support this idea, please reply to this message (short replies please to
	>avoid LISTRIVIA).

	Supported. If not officially accepted it WILL be used anyhow.

	Deke

This is silly. The specific proposal (a BASE64 marked section) _can't
be_ added at this point under the rules of the W3C. It's also unlikely
to fly in XML 1.1 for two reasons (which are more substantial
technical problems with the proposal as it stands):

  1. The proposed syntax is not compatible with SGML syntax, and can't
be made compatible without changes in SGML (violating the goals of the
XML project).

  2. The effect desired can be easily obtained in XML by the use of
NOTATION.

 For example:

<some-binary-element><![BASE64[ ..base64data..]]></some-binary-element>

could be replaced by (in the instance):
<some-binary-element>..base64data..</some-binary-element>

For a WF-checking application, the following declaration would be required
in the DTD:

<!ATTLIST some-binary-element data-format NOTATION (BASE64) #FIXED "BASE64">

For validation, you'd have to declare the notation (by adding this to
the DTD or the internal subset):

<!NOTATION BASE64 SYSTEM "some URI for BASE64 encoding, determined by convention">

I may have made some detail mistakes, because I can't get to the
standard right now, but the basic point is that to handle base64
encoding (or any other encoding expressible in the XML character set)
you need only declare and attach a notation attribute.
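
Here is a sketch, in Python, of what an application sees when the
declarations sit in the internal subset.  It uses the standard library's
expat binding, which is non-validating; the attribute name data-format
follows the example above, while the notation's system identifier
(urn:example:base64) and the helper parse are made up for illustration.

```python
import base64
from xml.parsers import expat

DOC = """<!DOCTYPE some-binary-element [
<!NOTATION BASE64 SYSTEM "urn:example:base64">
<!ATTLIST some-binary-element data-format NOTATION (BASE64) #FIXED "BASE64">
]>
<some-binary-element>aGVsbG8=</some-binary-element>"""


def parse(document):
    """Collect each element's attributes and the character data."""
    attributes = {}
    text = []
    parser = expat.ParserCreate()

    def start(name, attrs):
        # attrs includes values defaulted from the internal subset.
        attributes[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.CharacterDataHandler = text.append
    parser.Parse(document, True)
    return attributes, "".join(text)


attributes, content = parse(DOC)
data_format = attributes["some-binary-element"].get("data-format")
decoded = base64.b64decode(content) if data_format == "BASE64" else None
```

The instance itself carries no data-format attribute; the application learns
the notation name from the #FIXED default and decodes accordingly.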

If you don't like notation, you can even just use an attribute value
and keyword and skip the notation declaration. I don't remember the
character repertoire of BASE64, but the fact that it's email safe
means that the escaping issues are certainly no harder than those for
any XML text content.
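
For the record, the standard BASE64 alphabet is A-Z, a-z, 0-9, '+', '/' and
the '=' padding character.  A quick Python check (the variable names are
only for illustration) confirms that none of these collide with the
characters that need escaping in XML text content:

```python
import base64
import string

# The 64 data characters plus '=' padding.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")
# Characters that can require escaping in XML character data.
XML_MARKUP_CHARS = set("<&>")

needs_escaping = BASE64_ALPHABET & XML_MARKUP_CHARS

# Encode every possible byte value and look at what comes out.
encoded = base64.b64encode(bytes(range(256))).decode("ascii")
stays_in_alphabet = set(encoded) <= BASE64_ALPHABET
```

The intersection is empty, so BASE64 data can be dropped into element
content without any escaping at all.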

If you really want to avoid escaping characters, you can use
references to external unparsed entities to avoid the problem altogether.


For the above reasons I expect that it _won't_ be used anyhow,
except by people who don't mind their documents being rejected by
conforming parsers. Given the presence of a simple way to do this
_inside_ XML, the need is unlikely to be regarded as being so
critical that conformance is irrelevant.

  -- David
------------------------------------------+----------------------------
David Durand                 dgd@cs.bu.edu| david@dynamicDiagrams.com
Boston University Computer Science        | Dynamic Diagrams
http://www.cs.bu.edu/students/grads/dgd/  | http://dynamicDiagrams.com/
                                          | MAPA: mapping for the WWW




From wilfr at mail.bc.rogers.wave.ca  Tue Feb 10 18:14:53 1998
From: wilfr at mail.bc.rogers.wave.ca (Wilf Reedijk)
Date: Mon Jun  7 17:00:07 2004
Subject: IBM `XML for Java' has released.
References: <9802100236.AA46457@ns.trl.ibm.com>
Message-ID: <34E09932.EA59655E@rogers.wave.ca>

I just downloaded xml4j from the IBM site.
I tried to compile the trlx application but it seems that I am missing some
classes: org.xml.sax.EntityHandler etc. My classpath points to xml4j.jar. I don't
see these classes in there or anywhere else in the files that I downloaded. Am I
missing something?

Wilf Reedijk


TAMURA Kent wrote:

>   `XML for Java' is a validating XML processor written in Java.
>
>   You can download from IBM alphaWorks:
>         http://www.alphaworks.ibm.com/formula/xml
>   It requires Java 1.1.




From zwang at pstat.ucsb.edu  Tue Feb 10 19:08:00 1998
From: zwang at pstat.ucsb.edu (Zheng Wang)
Date: Mon Jun  7 17:00:07 2004
Subject: IBM `XML for Java' has released.
In-Reply-To: <34E09932.EA59655E@rogers.wave.ca>
Message-ID: <Pine.GSO.3.95.980210105925.13591A-100000@fisher>

That's exactly what I mentioned in my previous mail. The source code IBM
released is not complete. Users have to depend on the xml4j.jar file,
so they cannot look at the source code.

Zheng Wang
Department of Statistics and Applied Probability 
University of California, Santa Barbara
E-mail: zwang@pstat.ucsb.edu; http://www.pstat.ucsb.edu/~zwang


On Tue, 10 Feb 1998, Wilf Reedijk wrote:

> I just downloaded xml4j from the IBM site.
> I tried to compile the trlx application but it seems that I am missing some
> classes: org.xml.sax.EntityHandler etc. My classpath points to xml4j.jar. I don't
> see these classes in there or anywhere else in the files that I downloaded. Am I
> missing something?
> 
> Wilf Reedijk
> 
> 
> 
> TAMURA Kent wrote:
> 
> >   `XML for Java' is a validating XML processor written in Java.
> >
> >   You can download from IBM alphaWorks:
> >         http://www.alphaworks.ibm.com/formula/xml
> >   It requires Java 1.1.


From donpark at quake.net  Tue Feb 10 20:50:29 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:07 2004
Subject: Last minute request for BASE64 section support in XML 1.0
Message-ID: <001c01bd3664$cd9c1630$2ee044c6@donpark>

David,

>This is silly. The specific proposal (a BASE64 marked section) _can't
>be_ added at this point under the rules of the W3C. It's also unlikely
>to fly in XML 1.1 for two reasons (which are more substantial
>technical problems with the proposal as it stands):


The form and timing of the proposal might be silly, but the need is not.

>  1. The proposed syntax is not compatible with SGML syntax, and can't
>be made compatible without changes in SGML (violating the goals of the
>XML project).


Agreed.  I am not an SGML whiz and I count on folks like you to point out
problems.

>  2. The effect desired can be easily obtained in XML by the use of
>NOTATION.


Notation declarations have no use for non-validating applications.  IMHO,
most applications will validate only at design time and never at
runtime.  So some means independent of the DTD must be used to indicate that
the content is an encoded form of some binary data.

>If you don't like notation, you can even just use an attribute value
>and keyword and skip the notation declaration. I don't remember the
>character repertoire of BASE64, but the fact that it's email safe
>means that the escaping issues are certainly no harder than those for
>any XML text content.


I am not really concerned about how binary data is encoded in any individual
XML format.  I am concerned about the lack of support in the standard.  As Tim
Bray suggests, I am trying to put in place a recommended convention for
embedding encoded data so we can all readily store and retrieve binary data.

Currently, I am proposing to add two reserved attributes

xml:content-encoding="base64;second-encoding-layer;third-encoding-layer"
xml:content-type="mime/type"

Multiple names in the encoding attribute might be going overboard, but I am
just thinking ahead to multilayer encoding.  Such a scheme could be used to
embed a compressed XML document within another XML document.  Should the
compressed XML document be expanded in place and fed into the parser?  Hmm.
Looks like there will be two levels to the proposal.
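
As a rough illustration of the multilayer idea, here is a minimal Java sketch
that undoes a semicolon-separated xml:content-encoding value one layer at a
time. The class name LayeredDecoder, the "hex" layer name, and the rule that
layers are listed outermost first are all assumptions made for this sketch;
the proposal itself does not fix the order or name any layer other than
base64.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Sketch only: removes the layers named in an xml:content-encoding value. */
final class LayeredDecoder {

    /**
     * Decodes element text according to a value such as "base64" or
     * "base64;hex". ASSUMPTION: layers are listed outermost first, so they
     * are removed left to right. The proposal does not specify the order.
     */
    static byte[] decode(String contentEncoding, String text) {
        byte[] data = text.getBytes(StandardCharsets.ISO_8859_1);
        for (String layer : contentEncoding.split(";")) {
            String name = layer.trim().toLowerCase();
            if (name.isEmpty()) {
                continue; // tolerate stray separators
            }
            // Every layer except the innermost must itself decode to text.
            String current = new String(data, StandardCharsets.ISO_8859_1);
            switch (name) {
                case "base64":
                    // The MIME decoder skips line breaks inside the text.
                    data = Base64.getMimeDecoder().decode(current);
                    break;
                case "hex": // hypothetical layer name, not from the proposal
                    data = decodeHex(current.trim());
                    break;
                default:
                    throw new IllegalArgumentException(
                            "unknown encoding layer: " + name);
            }
        }
        return data;
    }

    /** Converts pairs of hexadecimal digits into bytes. */
    private static byte[] decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

For example, LayeredDecoder.decode("base64", "aGVsbG8=") yields the five
bytes of "hello". Whether a decoded XML layer should then be fed back into
the parser is the open question raised above and is not addressed here.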

Your mention of notation brings up a possible need for an xml:content-notation
attribute, which could be used by other elements to reference the binary
data.  Since referenced embedded data must be defined before the first
reference, placement becomes rather restrictive, especially if the embedded
data element is not significant at the point of definition (where an icon is
stored inside an XML file is not important, but where it is referenced is).

I appreciate your comments.

Regards,

Don Park
http://www.quake.net/~donpark/index.html




From crism at ora.com  Tue Feb 10 21:26:19 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun  7 17:00:07 2004
Subject: Last minute request for BASE64 section support in XML 1.0
In-Reply-To: <001c01bd3664$cd9c1630$2ee044c6@donpark>
Message-ID: <199802102130.QAA19592@geode.ora.com>

[Don Park]
> Notation declarations have no use for non-validating applications.
> IMHO, most applications will validate only during design time and
> never during runtime.  Unless some means independent of DTD must be
> used to indicate that content is encoded form of some binary data.

The notation mechanism is provided for exactly this purpose.  I'm not
sure why it's unacceptable to you, but I don't think that developing a
secondary means of providing the same information is preferable.

I'm not very thrilled with the way notation works, but given Dan
Connolly's comments about moving MIME towards a URL-based mechanism,
then MIME types can be used as notation system identifiers.

You can not expect to process XML documents in total ignorance of the
DTD.  You can expect to process many XML documents with only the
internal subset, and you can mandate for your application that
notation declarations be in the internal subset.  I don't see why

<!DOCTYPE foo [
<!NOTATION base64 ...>
...
]>
<binary-data notation="base64">...</binary-data>

is unacceptable, but

<!DOCTYPE foo [
...
]>
<binary-data><![BASE64[...]]></binary-data>

is acceptable.  The first even provides for a measure of extensibility
(!) that the second lacks.
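
To make the first form concrete, here is a minimal Java sketch that parses a
document carrying a NOTATION declaration in its internal subset with a
non-validating parser and collects the notation names it reports. It uses the
later javax.xml.parsers and SAX2 APIs, not anything available when this was
written. The class name NotationProbe and the system identifier
"urn:example:base64" are placeholders invented for the sketch; the message
above deliberately leaves the identifier unspecified.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch only: lists the notations a non-validating parser reports. */
final class NotationProbe {

    // Hypothetical sample document; the system identifier is a placeholder.
    static final String SAMPLE =
            "<!DOCTYPE foo [\n"
          + "<!NOTATION base64 SYSTEM \"urn:example:base64\">\n"
          + "]>\n"
          + "<foo><binary-data notation=\"base64\">aGVsbG8=</binary-data></foo>";

    /** Returns the notation names reported while parsing the given text. */
    static List<String> declaredNotations(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                names.add(name); // called once per NOTATION declaration
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // the default, stated for clarity
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return names;
    }
}
```

The application still has to decide what a notation named base64 means;
the parser only passes the declaration along.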

This discussion should probably be moved to the XML SIG, as it
involves the design of XML, not its implementation.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From donpark at quake.net  Tue Feb 10 22:09:06 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:07 2004
Subject: Last minute request for BASE64 section support in XML 1.0
Message-ID: <001001bd366f$d29b1310$2ee044c6@donpark>

Chris,

>I'm not very thrilled with the way notation works, but given Dan
>Connolly's comments about moving MIME towards a URL-based mechanism,
>then MIME types can be used as notation system identifiers.

That still leaves encoding format to be specified.  While I have focused on
BASE64, I would prefer to leave the door open for other encoding formats.

>You can not expect to process XML documents in total ignorance of the
>DTD.  You can expect to process many XML documents with only the
>internal subset, and you can mandate for your application that
>notation declarations be in the internal subset.  I don't see why
>
><!DOCTYPE foo [
><!NOTATION base64 ...>
>...
>]>
><binary-data notation="base64">...</binary-data>

I was not aware that non-validating XML parsers are required to process the
internal DTD subset.  Is this true?  Even if it were true, how could an
application tell that a notation="base64" attribute indicates that the content
is binary data?  Should we treat "base64" as a special notation name?

><!DOCTYPE foo [
>...
>]>
><binary-data><![BASE64[...]]></binary-data>

Perhaps I did not make it clear.  I have already given up on the idea of
using a BASE64 section after realizing that it would conflict with SGML.

Please read the description of my latest proposal in my last post.
It does look similar to your "notation='base64'" idea without requiring the
use of notation.  It allows a non-validating parser to detect whether an
element's content is binary data and, if so, determine its encoding format
and its MIME type.  A very friendly parser could take that information and
return an object which could be an image, sound, or even a Java object if
the data is Java serialization data.
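
A minimal Java sketch of that detection step follows, assuming the two
proposed attribute names and assuming that an encoded element contains only
character data (no child elements). It uses the later SAX2 DefaultHandler API
and java.util.Base64 rather than the code described in this message; the
class name BinaryContentHandler is made up for the example, and only a single
base64 layer is handled.

```java
import java.io.StringReader;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch only: decodes the text of an element marked as base64. */
final class BinaryContentHandler extends DefaultHandler {
    String mimeType;            // xml:content-type of the last encoded element
    byte[] decoded;             // its decoded bytes, or null if none was seen
    private String encoding;    // xml:content-encoding of the open element
    private StringBuilder text; // non-null while inside an encoded element

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        String enc = atts.getValue("xml:content-encoding");
        if (enc != null) {
            encoding = enc.trim();
            mimeType = atts.getValue("xml:content-type");
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length); // may arrive in several chunks
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text == null) {
            return; // not an encoded element
        }
        if ("base64".equalsIgnoreCase(encoding)) {
            decoded = Base64.getMimeDecoder().decode(text.toString());
        }
        text = null;
    }

    /** Parses the given text without validation and returns the handler. */
    static BinaryContentHandler parse(String xml) {
        BinaryContentHandler handler = new BinaryContentHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler;
    }
}
```

Handing the decoded bytes and the MIME type to a type-specific handler is
left out, since that part depends on the application.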

What I just described is already working in my application.  I simply pass
the info to the JavaBeans Activation Framework (JAF) to get a MIME-type-specific
handler for the decoded data.  I am hoping to provide some of the code as a
reference implementation for the upcoming XML-Binary proposal.

Regards,

Don Park
http://www.quake.net/~donpark/index.html

-----Original Message-----
From: Chris Maden <crism@ora.com>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Tuesday, February 10, 1998 1:28 PM
Subject: Re: Last minute request for BASE64 section support in XML 1.0


>[Don Park]
>
>is unacceptable, but
>
>
>is acceptable.  The first even provides for a measure of extensibility
>(!) that the second lacks.
>
>This discussion should probably be moved to the XML SIG, as it
>involves the design of XML, not its implementation.
>
>-Chris
>--
><!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
><!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
>"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
><USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>
>




From dcarlson at ontogenics.com  Tue Feb 10 23:08:43 1998
From: dcarlson at ontogenics.com (Dave Carlson)
Date: Mon Jun  7 17:00:08 2004
Subject: IBM `XML for Java' has released.
Message-ID: <2.2.32.19980210230321.00f87ca0@pop.dimensional.com>

I have not looked at the IBM package, but the org.xml.sax.EntityHandler
class is in the SAX distribution.  See:
        http://www.microstar.com/XML/SAX/

I assume that IBM implemented a SAX driver for their parser.  Sounds good!

Dave

At 10:15 AM 2/10/98 -0800, you wrote:
>I just downloaded xml4j from the IBM site.
>I tried to compile the trlx application but it seems that I am missing some
>classes: org.xml.sax.EntityHandler etc. My classpath points to xml4j.jar. I
>don't see these classes in there or anywhere else in the files that I
>downloaded. Am I missing something?
>
>Wilf Reedijk
>
>
>
>TAMURA Kent wrote:
>
>>   `XML for Java' is a validating XML processor written in Java.
>>
>>   You can download from IBM alphaWorks:
>>         http://www.alphaworks.ibm.com/formula/xml
>>   It requires Java 1.1.
>
>
>
>




From norbert at datachannel.com  Tue Feb 10 23:36:37 1998
From: norbert at datachannel.com (Norbert Mikula)
Date: Mon Jun  7 17:00:08 2004
Subject: DXP - DataChannel XML Parser 1.0 Beta available
Message-ID: <066401bd367c$94a3a880$830a1bac@norbert.datachannel.com>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Norbert H. Mikula.vcf
Type: text/x-vcard
Size: 492 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980210/93cd749b/NorbertH.Mikula.vcf


From peter at ursus.demon.co.uk  Wed Feb 11 00:52:08 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:08 2004
Subject: XML as W3C Recommendation
Message-ID: <3.0.1.16.19980211004432.224717fc@pop3.demon.co.uk>

I am sure that most of you know that the W3C announced today that XML 1.0
was a Recommendation. The details are at:
http://www.w3.org/XML

This is a milestone in a very exciting quest, and there are many people and
organisations who deserve credit. In my experience it is one of the best
decision-making processes I have been acquainted with. Note that there are
many issues still actively under consideration. It is important that
XML-DEV members are aware that there are active working groups on these
issues - these are listed on the W3 site. They include further developments
in XML itself, XLL, XSL, namespaces, RDF, etc.

I know it is frustrating for those 'not in the club', but much of the
current formal discussion is confidential. The various WGs release
information here as soon as it is reasonable. We have to accept, therefore,
that it is not useful to discuss possible revisions of the drafts in this
forum. The members of the SIG and the WGs have agreed to tight communal
procedures, which at times require saying nothing :-) and it will help if
we do the same.

Please, therefore, accept the Recommendations and drafts as they are
published and try to work with them. By all means report *implementation*
problems and concerns here, but be assured that all 'vibes' will get back
to the various groups. The biggest contributions will come from showing how
the spec can be used to solve problems in practice.

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From zwang at pstat.ucsb.edu  Wed Feb 11 02:02:00 1998
From: zwang at pstat.ucsb.edu (Zheng Wang)
Date: Mon Jun  7 17:00:08 2004
Subject: IBM `XML for Java' has released.
In-Reply-To: <9802100236.AA46457@ns.trl.ibm.com>
Message-ID: <Pine.GSO.3.95.980210175021.16850C-100000@fisher>


Hi, Tamura,
When I ran the parser with JDK 1.1.3, it gave me the following error
message:


java trlx -d personal.xml 
java.lang.InternalError: Converter malfunction(UTF8) -- please send a bug report to java-io@java.sun.com
        at java.io.InputStreamReader.malfunction(InputStreamReader.java:119)
        at java.io.InputStreamReader.convertInto(InputStreamReader.java:133)
        at java.io.InputStreamReader.fill(InputStreamReader.java:177)
        at java.io.InputStreamReader.read(InputStreamReader.java:235)
        at java.io.BufferedReader.fill(BufferedReader.java:144) 
        at java.io.BufferedReader.read(BufferedReader.java:161) 
        at com.ibm.xml.parser.XMLReader.read(XMLReader.java:292) 
        at com.ibm.xml.parser.FileReading.getChar(FileReading.java:29) 
        at com.ibm.xml.parser.Token.getChar(Token.java:34) 
        at com.ibm.xml.parser.Parser.readStream(Parser.java:419) 
        at com.ibm.xml.parser.trlx.main(trlx.java:143) 
        at trlx.main(trlx.java:19)




From smith at interlog.com  Wed Feb 11 08:00:40 1998
From: smith at interlog.com (Chris Smith)
Date: Mon Jun  7 17:00:08 2004
Subject: Encoded XML Content
In-Reply-To: <c=CA%a=_%p=JetForm%l=ROSSINI-980209141400Z-1156@rossini.jetform.com>
Message-ID: <Pine.BSI.3.95.980210104720.3190A-100000@shell1.interlog.com>


The discussion has covered some good points up to now. I'll try to
build on it, and move forward.

Let's be clear about what we're trying to solve here. Unicode has
essentially solved the text problem. This note focuses on non-textual
data, or places where a different character encoding is required
inside your document.

For some applications, base64 will be easy to use. Binary data will
be present in particular locations in the XML tree, and the
applications will simply know to decode it. These don't really need
anything new, but will benefit if there is a common technique for
handling it. 

I think the real target is 'container' elements, where the designer
needs to allow for flexibility in content at runtime. It is possible
to do part of this with elements, but you run into two difficulties.
First, you eventually hit your non-text data, and you have to provide
some indication of the content and format. Second, you may have a real
need to allow for formats that have never been foreseen.

What we don't need to do is provide another mechanism for managing XML
markup and structure. XML parsers will not be asked to do anything
different. This is entirely about how developers will use XML's
features to resolve an often-encountered problem. (That's why this
still belongs on xml-dev.)

That said, the moment you move away from Unicode data content, you
face a number of issues. You will probably have to specify a wrapper
layer used to make the data XML-friendly. If that is removed, then you
will have to note what format or conventions apply to the next layer.
Ultimately you will reach either a text layer or a binary data layer,
which cannot be further unwrapped. That layer may need a descriptor,
to specify what type of data was carried with all this effort.

The question I still haven't completely resolved is - is there a need
for allowing an arbitrary number of layers, or is three sufficient?
That is the 'content encoding', 'content format', and 'content type'?
I'm not certain it's sufficient, but I can't see a use for much more
at the moment. (I'm not tightly attached to the labels, but I think
they work, and at least they're a start.) The most likely
implementations seem to be with these as attributes. Attributes that
are not present would have a default of a zero-length string.

Below, I've listed a number of items, in the interests of ensuring
that any proposed solution can handle them all. (Ultimately, such a
table would be useful to developers.)

What Is It?      Content   Content          Content
                 Encoding  Format           Type
--------------   --------  ---------------  -----------------
JPEG image       base64                     mime:image/jpeg
ASCII text       base64    ISO-8859-1       mime:text/plain
HTML text        base64    ISO-8859-1       mime:text/html
XML content
XML carried                                 xml:
XML carried      base64    ISO-10646-UCS-2  mime:text/xml
XML data only                               xml:pcdata
private data     hex                        x-private:somedata
private text     base64    Commodore64      x-private:sometext
embedded item    base64    ISO-8859-1       rfc:822
embedded item    base64                     mime:application/x-zip


I thought about separating content-type from the content-domain, but
I can't see that you would specify them separately all that often.

The above seems to support several required ideas:

1) Standard XML content requires no settings at all. This is the
   degenerate case, and it is good that it works this way.

2) Standard XML content could be structured using a DTD specified 
   using namespace techniques. This appears to be an available option
   without changing any of the infrastructure around encoding.

3) It supports MIME types, but does not require them. Other domains
   can be used besides MIME, including completely private or
   proprietary formats.

4) There is some consistency. Notice that whenever you specify a text
   type, you must provide a content-format. Otherwise, the text is the
   same as the surrounding XML. Whenever you specify any
   content-format that is different than the surrounding XML, you must
   use a content-encoding to restore XML friendliness.

5) So far, just about anything you can throw in there that has any
   current structure looks to be workable.
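
The second half of the consistency point can be checked mechanically. The
Java sketch below models the three attributes as a small value class and
tests one rule: a content-format that differs from the surrounding XML must
be accompanied by a content-encoding. The class name ContentDescriptor and
the surroundingFormat parameter are hypothetical, introduced only for this
illustration, and the comparison of format names is a plain case-insensitive
string match.

```java
/** Sketch only: the three proposed attributes and one consistency rule. */
final class ContentDescriptor {
    final String encoding; // e.g. "base64", "hex", "none", or "" when absent
    final String format;   // e.g. "ISO-8859-1", or "" when absent
    final String type;     // e.g. "mime:image/jpeg", or "" when absent

    ContentDescriptor(String encoding, String format, String type) {
        // Absent attributes default to a zero-length string, as proposed.
        this.encoding = encoding == null ? "" : encoding.trim();
        this.format = format == null ? "" : format.trim();
        this.type = type == null ? "" : type.trim();
    }

    /**
     * Returns false when the content-format differs from the surrounding
     * document's encoding but no content-encoding wraps the data.
     * surroundingFormat is a hypothetical parameter naming the character
     * encoding of the enclosing XML document.
     */
    boolean isConsistent(String surroundingFormat) {
        boolean differs = !format.isEmpty()
                && !format.equalsIgnoreCase(surroundingFormat);
        boolean wrapped = !encoding.isEmpty()
                && !encoding.equalsIgnoreCase("none");
        return !differs || wrapped;
    }
}
```

Every row of the table above passes this check against a UTF-8 document,
while an ISO-8859-1 format with no content-encoding would be rejected.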

An example element using these, called 'container' could be defined as
shown below.

<!ELEMENT container ANY>
<!ATTLIST container
   content-encoding (base64|hex|none) "none"
   content-format   CDATA             ""
   content-type     CDATA             ""
>

I've limited the strings in content-encoding. Is this a good idea?

There would be some structure applied to the content-format and
content-type, but I don't think it would be effectively captured in
the DTD.

Comments aren't just welcome - they're essential!
 
---------------------------------------------------------------------------
 Chris Smith                                          <smith@interlog.com>




From donpark at quake.net  Wed Feb 11 11:16:29 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:08 2004
Subject: Preliminary XML-Binary prototype demo
Message-ID: <000e01bd36dd$ce0924e0$2ee044c6@donpark>

I am a hands-on guy, so I have already put together an XML-Binary
implementation in my own application.  It works pretty well, although I am
starting to see some limitations as I find new ways of using it.

Since it worked so well, I thought you guys might want to see something
working too, so I changed the SAXDOM demo to handle XML-Binary elements.
Just go to the SAXDOM demo and parse the Binary.xml file to see an image
appear in the midst of a colorized XML document.  I am still having problems
with Navigator, so use IE 4.0 if you have it.  If not, don't sweat it.  Just
examine the JavaScript code to see what is going on and then check out the
Binary.xml file to see how XML-Binary is expressed in XML.

The demo uses xml:content-encoding and xml:content-type to indicate the
encoding type and the MIME content type.  There are lots of issues, but I
really want to get the initial version (level 1) out there quickly
to address the basic needs first.  I will post the list of issues real soon
now.

I do appreciate the comments regarding XML-Binary but I sure could use more,
particularly from the XML-WG members and application developers in need of
embedded binary data.  I realize that XML-Binary activity is somewhat on the
border of design and implementation domain but since I am not on the XML-SIG
mailing list, I have no choice but to grill the shrimps on the sidewalk.

Yummy, that smells good!;-)

Don Park
http://www.quake.net/~donpark/index.html

PS: demo is at http://www.quake.net/~donpark/SaxDomDemo/SaxDomDemo.html
PS: Robin, this is NOT an announcement! <g>.




From grove at infotek.no  Wed Feb 11 18:02:05 1998
From: grove at infotek.no (Geir Ove Gronmo)
Date: Mon Jun  7 17:00:08 2004
Subject: SAX: Empty elements
Message-ID: <3.0.2.32.19980211185910.009da370@jenufa.infotek.no>


While working on implementations using SAX I've noticed that there is no
way to know if an element is an empty element or not (e.g. <Para/>). This
could perhaps be done using some kind of lookahead, but should that be
necessary?

Perhaps a change to the startElement method in the DocumentHandler
interface could fix this.

This is how the method is defined in the Draft Specification (1998-01-12):

public void startElement (String name, AttributeMap attributes) 
throws Exception 

Perhaps this should be something like:

public void startElement (String name, AttributeMap attributes,
                          boolean isempty)
throws Exception

Best regards,
Geir O.




From Jon.Bosak at eng.Sun.COM  Wed Feb 11 19:44:57 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun  7 17:00:08 2004
Subject: Call for presentations: XML Dev Day 3/27
Message-ID: <199802111942.LAA23645@boethius.eng.sun.com>

CALL FOR PRESENTATIONS: XML DEVELOPERS' DAY 1998.03.27

A one-day technical conference for XML developers will be held Friday,
March 27, in Seattle, Washington.  The event constitutes the last day
of the GCA XML Conference (http://www.gca.org/conf/xmlcon98/).

XML Developers' Day is a single-track event devoted entirely to
technical reports on the latest developments in XML implementation.
If you are engaged in the construction of any software that works with
XML -- converters, parsers, servers, browsers, editors, or XML-based
vertical applications -- here is your chance to share your work with
an audience that can understand and appreciate it.

Since stylesheet-based rendering is part of XML publishing, developers
of tools that support XSL or DSSSL are invited to show their latest
offerings as well.  We're also open to presentations on XML-based
languages (CML, OFX, etc.)  and related efforts that might have a
significant impact on the future of XML (RDF, XML-Data, etc.) if they
are of particular interest to XML developers.

Vendors of commercial tools can participate, but they must confine
their presentations to the technical aspects of current XML products
in development.  Table space will be made available for the
distribution of product announcements and commercial literature.

REGISTRATION

The registration fee for XML Developers' Day is $275 for GCA members
and $390 for non-GCA members (see the registration page below for
conference and tutorial rates).  This is mighty inexpensive for an
inside update on the very latest activity in this field.  You can
register at

   http://www.gca.org/conf/xmlcon98/registra.htm

N.B.: Presenters get in free.

CALL FOR PRESENTATIONS

If you would like to give a report at this event, send a paragraph or
two describing your presentation, based on a conservative estimate of
the status of your project as it will stand on March 27, to Jon Bosak
(bosak@eng.sun.com).  Also include a description of the audio-visual
equipment you will need for your presentation and an estimate of its
duration.  Please include the phrase "XML Dev Day" somewhere in the
subject line of your message.

Since we want up-to-the-minute reports on activities in progress,
there will be no published proceedings, and therefore you need not
submit your entire presentation in advance.  But please try to make
your forecasted description as accurate as possible so that we can
choose the most interesting and relevant submissions.

The deadline for submissions is Friday, February 27.

Jon

----------------------------------------------------------------------
 Jon Bosak, Online Information Technology Architect, Sun Microsystems
    901 San Antonio Road, MPK17-101, Palo Alto, California 94043
----------------------------------------------------------------------
   If a man look sharply and attentively, he shall see Fortune; for
   though she be blind, yet she is not invisible.  -- Francis Bacon
----------------------------------------------------------------------



From donpark at quake.net  Wed Feb 11 20:09:52 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:08 2004
Subject: Empty elements
Message-ID: <003d01bd3728$549089a0$2ee044c6@donpark>

Geir,

>While working on implementations using SAX I've noticed that there is no
>way to know if an element is an empty element or not (e.g <Para/>). This
>could perhaps be done using some kind of lookahead, but should that be
>necessary?

Are you unable to process the element in the endElement() callback?  A typical
DocumentHandler implementation must keep track of the current element, so all
you have to do inside endElement() is check whether the current element has no
children and no attributes.

Perhaps you have a different need, but it seems like an implementation
strategy issue.
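
A minimal Java sketch of this bookkeeping, written against the later SAX2
DefaultHandler API rather than the 1998 draft's DocumentHandler, is shown
below. The class name EmptyElementTracker is made up for the example. It
records only whether an element had any content (child elements or
characters) and ignores attributes. As written it cannot tell <Para/> from
<Para></Para>, because the parser reports both the same way.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch only: notes which elements ended without any content. */
final class EmptyElementTracker extends DefaultHandler {
    /** Names of elements that had no children and no character data. */
    final List<String> emptyElements = new ArrayList<>();

    // One flag per open element; flag[0] becomes true once content is seen.
    private final Deque<boolean[]> hasContent = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (!hasContent.isEmpty()) {
            hasContent.peek()[0] = true; // the parent now has a child
        }
        hasContent.push(new boolean[] {false});
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (length > 0 && !hasContent.isEmpty()) {
            hasContent.peek()[0] = true; // the open element has text
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (!hasContent.pop()[0]) {
            emptyElements.add(qName);
        }
    }

    /** Parses the given text and returns the names of its empty elements. */
    static List<String> findEmpty(String xml) {
        EmptyElementTracker tracker = new EmptyElementTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), tracker);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return tracker.emptyElements;
    }
}
```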

Best regards,

Don Park
http://www.quake.net/~donpark/index.html




From andrewl at microsoft.com  Thu Feb 12 00:55:52 1998
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:00:08 2004
Subject: Object Hierarchie with XML
Message-ID: <5BF896CAFE8DD111812400805F1991F7DCC998@red-msg-08.dns.microsoft.com>

A type hierarchy would use a vocabulary (schema) designed for that purpose.
Such vocabularies are not presently part of XML per se, though you can find
type-hierarchy concepts discussed in several papers, such as those at the
W3C RDF site and in a paper that I co-authored,
http://www.w3.org/TR/1998/NOTE-XML-data-0105/Overview.html.

> -----Original Message-----
> From:	BAILLE-PIERRE Cécile [SMTP:cecile.baille-pierre@bull.net]
> Sent:	Tuesday, February 10, 1998 7:28 AM
> To:	Mailing Liste XML-DEV/Messages (Adresse de messagerie)
> Cc:	Cécile Baille-Pierre (Adresse de messagerie)
> Subject:	Object Hierarchie with XML
> 
> As I am just beginning to look at the XML specifications, my question may
> be nonsense (in which case I promise this question will be the first
> and last one!).
> As far as I understand, an XML document has a tree-like structure, which is
> perfect for reflecting composition/aggregation of entities ("my book is
> composed of: a title, an author, one or more paragraphs, etc."), where child
> elements represent parts of the element currently defined.
> But how can one simply implement a class hierarchy, i.e. "element E is
> derived from super-element S and inherits its attributes and properties"?
> 
> Cécile.
> 



From jjaakkol at cs.Helsinki.FI  Thu Feb 12 13:06:10 1998
From: jjaakkol at cs.Helsinki.FI (Jani Jaakkola)
Date: Mon Jun  7 17:00:08 2004
Subject: Empty elements
In-Reply-To: <003d01bd3728$549089a0$2ee044c6@donpark>
Message-ID: <Pine.LNX.3.96.980212145142.22601A-100000@sipulipaasi.cs.Helsinki.FI>


On Wed, 11 Feb 1998, Don Park wrote:

> Geir,
> 
> >While working on implementations using SAX I've noticed that there is no
> >way to know if an element is an empty element or not (e.g <Para/>). This
> >could perhaps be done using some kind of lookahead, but should that be
> >necessary?
> 
> Are you unable to process the element in the endElement() callback?  A typical
> DocumentHandler implementation must keep track of the current element, so all
> you have to do inside endElement() is check that the current element has no
> children and no attributes.

Yes, but in SGML and XML an element type which has been declared EMPTY
in the DTD, and is therefore marked up with a <Para/> tag in XML,
is a different thing from an element which just happens to be
empty (e.g. <para></para>).

In SP's generic interface, StartElementEvent events have a
ContentType property which can have the value empty. Without this
property it would be impossible to produce a valid SGML instance
from the event stream, because element types which are declared
empty are marked up differently from element types which just
happen to be empty sometimes.

I'd say that a parser API which does not provide information
about elements declared empty is broken and should be fixed.
(I haven't looked at SAX's API, though.)

- Jani




From cecile.baille-pierre at bull.net  Thu Feb 12 13:46:25 1998
From: cecile.baille-pierre at bull.net (BAILLE-PIERRE Cécile)
Date: Mon Jun  7 17:00:08 2004
Subject: No subject
Message-ID: <01BD37C4.792CD1A0@belledonne.frcl.bull.fr>

HELP!!!
I am desperately searching for a Web site that is clear, didactic, and complete about XML syntax, something easier to understand than reading this directly:
[70] 	EntityDecl	::= 	GEDecl | PEDecl		
[71] 	GEDecl	::= 	'<!ENTITY' S Name S EntityDef S? '>'		
[72] 	PEDecl	::= 	'<!ENTITY' S '%' S Name S PEDef S? '>'		
[73] 	EntityDef	::= 	EntityValue | (ExternalID NDataDecl?)		
[74] 	PEDef	::= 	EntityValue | ExternalID		

http://www.w3.org/TR/1998/REC-xml-19980210 doesn't give enough examples (from my point of view). I've already found very interesting XML sites: some describe XML implementations and others give an overview, but, like the XML FAQ, they are too general.  I would like to find a sort of "Reader's Digest" that explains and illustrates XML terminology point by point: parameter/general entities, notations, attribute lists, and so on.
Book references would be welcome too.

Thanks.

Cecile.



From cecile.baille-pierre at bull.net  Thu Feb 12 13:56:46 1998
From: cecile.baille-pierre at bull.net (BAILLE-PIERRE Cécile)
Date: Mon Jun  7 17:00:08 2004
Subject: No subject
Message-ID: <01BD37C5.E564DEC0@belledonne.frcl.bull.fr>

In my previous message, I said that
http://www.w3.org/TR/1998/REC-xml-19980210 doesn't give enough examples 

From k_coffin at conknet.com  Thu Feb 12 14:37:37 1998
From: k_coffin at conknet.com (Kerry Coffin)
Date: Mon Jun  7 17:00:08 2004
Subject: 
Message-ID: <01bd37c3$9563ad90$f00620ce@lbynum.esri.com>

I'd like this same help.
Thanks
Kerry Coffin

-----Original Message-----
From: BAILLE-PIERRE Cécile <cecile.baille-pierre@bull.net>
To: Mailing Liste XML-DEV/Messages (Adresse de messagerie)
<xml-dev@ic.ac.uk>
Cc: Cécile Baille-Pierre (Adresse de messagerie)
<cecile.baille-pierre@bull.net>
Date: Thursday, February 12, 1998 9:04 AM


HELP!!!
I am desperately searching for a Web site that is clear, didactic, and complete
about XML syntax, something easier to understand than reading this directly:
[70] EntityDecl ::= GEDecl | PEDecl
[71] GEDecl ::= '<!ENTITY' S Name S EntityDef S? '>'
[72] PEDecl ::= '<!ENTITY' S '%' S Name S PEDef S? '>'
[73] EntityDef ::= EntityValue | (ExternalID NDataDecl?)
[74] PEDef ::= EntityValue | ExternalID

http://www.w3.org/TR/1998/REC-xml-19980210 doesn't give enough examples
(from my point of view). I've already found very interesting XML sites:
some describe XML implementations and others give an overview, but, like
the XML FAQ, they are too general.  I would like to find a sort of "Reader's
Digest" that explains and illustrates XML terminology point by point:
parameter/general entities, notations, attribute lists, and so on.
Book references would be welcome too.

Thanks.

Cecile.



From ak117 at freenet.carleton.ca  Thu Feb 12 14:38:40 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:08 2004
Subject: Empty elements
In-Reply-To: <Pine.LNX.3.96.980212145142.22601A-100000@sipulipaasi.cs.Helsinki.FI>
References: <003d01bd3728$549089a0$2ee044c6@donpark>
	<Pine.LNX.3.96.980212145142.22601A-100000@sipulipaasi.cs.Helsinki.FI>
Message-ID: <199802121436.GAA00313@unready.microstar.com>

Jani Jaakkola writes:

 > Yes, but in SGML and XML an element type which has been declared EMPTY
 > in the DTD, and is therefore marked up with a <Para/> tag in XML,
 > is a different thing from an element which just happens to be
 > empty (e.g. <para></para>).

Actually, that's not generally the case in XML.  Here's what the REC
says (Section 3.1 "Start-Tags, End-Tags, and Empty-Element Tags"):

   Empty-element tags may be used for any element which has no content,
   whether or not it is declared using the keyword EMPTY. For
   interoperability, the empty-element tag must be used, and can only be
   used, for elements which are declared EMPTY.

   Examples of empty elements:

   <IMG align="left"
    src="http://www.w3.org/Icons/WWW/w3c_home" />
   <br></br>
   <br/>

Here's the definition of "for interoperability":

   for interoperability
          A non-binding recommendation included to increase the chances
          that XML documents can be processed by the existing installed
          base of SGML processors which predate the WebSGML Adaptations
          Annex to ISO 8879.

In other words, XML processors may (and should) treat

  <br></br>

and

  <br/>

as equivalent, but document authors might want to make the distinction
so that pre-WebSGML SGML parsers can handle their documents.

That begs the question of the processor's information set, however --
a processor designed for use with repositories or with editors, for
example, needs to preserve lexical as well as structural information
about the XML document, such as comments, general entity references
(even within attribute values), specified vs. defaulted attribute
values, CDATA sections, whitespace within tags, etc.

SAX as it currently stands is not designed to preserve most lexical
information; in the future, we may devise a SAX level-2 to return this
information, but since most applications that need it will probably
use a DOM anyway, the demand may not be strong enough.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
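
The point that <br></br> and <br/> look the same to an event-based
processor can be sketched as follows. This is not SAX itself (SAX is a
Java interface): EmptyTracker and its startElement/characters/endElement
methods are hypothetical C++ stand-ins for SAX-style callbacks, and the
events are fed in by hand instead of coming from a real parser. The
handler keeps a stack of open elements, along the lines suggested earlier
in this thread, and decides at endElement whether the element had content:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a SAX-style handler; not part of any real SAX binding.
class EmptyTracker {
public:
    void startElement(const std::string& name) {
        markInnermostNonEmpty();              // the parent now has a child
        open_.push_back(Open{name, true});    // a new element starts out empty
    }
    void characters(const std::string& text) {
        if (!text.empty()) markInnermostNonEmpty();
    }
    void endElement(const std::string& name) {
        if (open_.empty() || open_.back().name != name) return;  // ignore unbalanced events
        if (open_.back().empty) emptyNames_.push_back(name);
        open_.pop_back();
    }
    const std::vector<std::string>& emptyElements() const { return emptyNames_; }

private:
    struct Open { std::string name; bool empty; };
    // Marks the innermost open element as having content.
    void markInnermostNonEmpty() { if (!open_.empty()) open_.back().empty = false; }
    std::vector<Open> open_;
    std::vector<std::string> emptyNames_;
};

// Hand-fed events for <doc><br></br><br/><p>text</p></doc>.
std::vector<std::string> emptyDemo() {
    EmptyTracker t;
    t.startElement("doc");
    t.startElement("br"); t.endElement("br");   // events for <br></br>
    t.startElement("br"); t.endElement("br");   // events for <br/>: identical
    t.startElement("p"); t.characters("text"); t.endElement("p");
    t.endElement("doc");
    return t.emptyElements();
}
```

Both br elements end up in the list of empty elements, and nothing in the
event stream says which one was written as an empty-element tag or
whether br was declared EMPTY in the DTD. That information would have to
come from the DTD or from a lexical-level interface.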



From jjaakkol at cs.Helsinki.FI  Thu Feb 12 15:28:53 1998
From: jjaakkol at cs.Helsinki.FI (Jani Jaakkola)
Date: Mon Jun  7 17:00:08 2004
Subject: Empty elements
In-Reply-To: <199802121436.GAA00313@unready.microstar.com>
Message-ID: <Pine.LNX.3.96.980212170739.22743A-100000@sipulipaasi.cs.Helsinki.FI>


On Thu, 12 Feb 1998, David Megginson wrote:

> In other words, XML processors may (and should) treat
> 
>   <br></br>
> 
> and
> 
>   <br/>
> 
> as equivalent, but document authors might want to make the distinction
> so that pre-WebSGML SGML parsers can handle their documents.

Ah. Pardon my ignorance. Different syntax for empty elements
in XML or SGML was a nuisance anyway, so this seems to be one more thing
fixed.

<CLIP>
 
> SAX as it currently stands is not designed to preserve most lexical
> information; in the future, we may devise a SAX level-2 to return this
> information, but since most applications that need it will probably
> use a DOM anyway, the demand may not be strong enough.

If I understood this correctly, SAX is also not designed for
interoperability. If you want to generate pre-WebSGML SGML from
XML using SAX (and accept that lexical information is not preserved), you
would still need the ability to detect elements declared empty.

- Jani




From srn at techno.com  Thu Feb 12 16:15:28 1998
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:00:08 2004
Subject: Object Hierarchie with XML
In-Reply-To: 
	<5BF896CAFE8DD111812400805F1991F7DCC998@red-msg-08.dns.microsoft.com>
	(message from Andrew Layman on Wed, 11 Feb 1998 16:55:40 -0800)
Message-ID: <199802121510.KAA00885@bruno.techno.com>

[Cécile Baille-Pierre (cecile.baille-pierre@bull.net):]

> > But how can one simply implement a class hierarchy, i.e. "element E is
> > derived from super-element S and inherits its attributes and properties"?

[Andrew Layman (andrewl@microsoft.com):]

> A type hierarchy would use a vocabulary (schema) designed for that
> purpose.  Such vocabularies are not presently part of XML per se,
> though you can find type-hierarchy concepts discussed in several
> papers, such as those at the W3C RDF site and in a paper that I
> co-authored,
> http://www.w3.org/TR/1998/NOTE-XML-data-0105/Overview.html.

In fact, this capability is already available to XML users, by virtue
of the fact that the derivation of object types from one another is
provided by ISO/IEC 10744:1997 for SGML in general, and this standard
has been amended specifically to allow XML's use of these concepts by
means of an XML-legal PI-based declaration syntax.  There is literally
nothing to prevent the adoption and use of this facility by anyone,
regardless of whether W3C chooses to acknowledge that this
internationally standardized facility exists.  The idea of object type
inheritance is far too useful for XML users to ignore it forever.  As
the ISO 10744 "enabling architectures" facility demonstrates, it is
not necessary to create a special DTD syntax or a special kind of
schema to support hierarchies of element type inheritance.  What is
needed is a way to inherit the semantics and structure of any element
types of any DTDs (schemas), regardless of whether they were intended
to be inherited.  That kind of functionality (among others) is
supported by this facility.

There is a pointer to the relevant standard at http://www.hytime.org.
When you get there, look in the table of contents for Annex A.  A.3
("Architectural Form Definition Requirements [AFDR]") is where the
"enabling architectures" facility is described.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From ak117 at freenet.carleton.ca  Thu Feb 12 16:20:58 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:08 2004
Subject: Empty elements
In-Reply-To: <Pine.LNX.3.96.980212170739.22743A-100000@sipulipaasi.cs.Helsinki.FI>
References: <199802121436.GAA00313@unready.microstar.com>
	<Pine.LNX.3.96.980212170739.22743A-100000@sipulipaasi.cs.Helsinki.FI>
Message-ID: <199802121620.IAA00662@unready.microstar.com>

Jani Jaakkola writes:

 > If I understood this correctly, SAX is also not designed for
 > interoperability. If you want to generate pre-WebSGML SGML from XML
 > using SAX (and accept that lexical information is not preserved),
 > you would still need the ability to detect elements declared empty.

SAX is an XML processing interface rather than an authoring interface,
so interoperability is not exactly an applicable concept (though I do
understand what you mean).  That said, some XML tools that use SAX
also have their own interfaces that can provide you with DTD
information -- for one example, see AElfred at

  http://www.microstar.com/XML/


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From grk at arlut.utexas.edu  Thu Feb 12 20:37:59 1998
From: grk at arlut.utexas.edu (Glenn R. Kronschnabl)
Date: Mon Jun  7 17:00:08 2004
Subject: SAX/DOM IDL -> C++ Mapping / Confused
Message-ID: <199802122037.OAA06856@mail-firewall.arlut.utexas.edu>

Hi,

I was trying to duplicate the SAX/DOM Java stuff in C++ (and interface with
SP).  Now, I am an IDL newbie, but according to the DOM spec, Node is defined to
be just an interface.  In the SAXDOM code that's straightforward, since Java
understands interfaces.  HOWEVER, in C++, according to the IDL -> C++ mapping,
an interface is supposed to be constructed as an ABSTRACT base class using pure
virtual functions.  The problem is that when I try to enumerate over them, I get a
'can't cast up from a virtual base class' error.  Here is the abbreviated source.
Obviously, I am making a fundamental mistake.  Can some kind person out there
clue me in?  Thanks.

grk$ g++ n.cc
n.cc: In function `int main()':
n.cc:82: cannot cast up from virtual baseclass `Node'

----- cut here ---
#include <list>
#include <string>

class NodeList;
class NodeEnumerator;

class Node {
  enum NodeType {DOCUMENT, ELEMENT};
public:
  virtual NodeType getNodeType() = 0;
  virtual Node* getParentNode() = 0;
  virtual NodeList* getChildren() = 0;
};

class Element : public virtual Node {
 public:
  virtual string getTagName() = 0;
  virtual NodeEnumerator* getElementsByTagName() = 0;
 };

class SaxNode : public virtual Node {
  public:
   NodeType type;
   Node* parent;
   NodeList* children;

  virtual NodeType getNodeType() { return type; }
  virtual Node* getParentNode() { return parent; }
  virtual NodeList* getChildren() { return children; }
 };

class SaxElement : public virtual Node, public Element, public SaxNode {
 public:
  string tagName;
  virtual NodeEnumerator* getElementsByTagName() { }
  virtual string getTagName() { return string("SaxElement"); }
 };

class NodeList {
 public:
  virtual NodeEnumerator* getEnumerator() = 0;
 };

class SaxNodeEnumerator;

class SaxNodeList: public list<SaxNode*>, public NodeList {
 public:
  virtual NodeEnumerator* getEnumerator() { }
};

class NodeEnumerator {
 public:
  virtual Node* getFirst() = 0;
};

class SaxNodeEnumerator : public NodeEnumerator {
 public:
  Node* getFirst() { }
  
};

main()
{

  SaxElement se;

  SaxNodeList* list = (SaxNodeList*) se.getChildren();

  SaxNodeList::iterator snode = list->begin();

  for (; snode != list->end(); ++snode)
  {
    (*snode)->getNodeType(); 

    SaxElement* elem = (SaxElement*) (*snode);

    elem->getTagName();

    SaxNodeEnumerator* e2 = (SaxNodeEnumerator*) elem->getElementsByTagName();

    SaxNode* s2 = (SaxNode*) e2->getFirst();

//    SaxNode snode = (SaxNode*) (*node);
//    cout << node->getNodeType() << endl;
  }

}
--- cut here ----

Cheers,
Glenn                                  
--------------------
Glenn R. Kronschnabl
Applied Research Laboratories        | grk@arlut.utexas.edu (PGP/MIME ok)
The University of Texas at Austin    | http://www.arlut.utexas.edu/~grk
PO Box 8029, Austin, TX 78713-8029   | (Ph) 512.835.3642 (FAX) 512.835.3808
10,000 Burnet Road, Austin, TX 78758 | ... but an Aggie at heart!
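
A possible reading of the g++ error above, offered as a sketch and not as
a definitive diagnosis: the likely culprit is a C-style cast from a Node*
(such as the value returned by getFirst()) to a class that inherits Node
virtually. A C-style cast acts as a static_cast where one would apply,
and C++ does not allow a static downcast from a virtual base class,
because the position of a virtual base inside the object is only known at
run time. dynamic_cast performs that conversion with a run-time check,
provided the base class is polymorphic. The classes below are cut-down
stand-ins for the ones in the message, and tagNameOf and castDemo are
hypothetical helpers:

```cpp
#include <cassert>
#include <string>

// Cut-down stand-ins for the classes in the message above.
class Node {
public:
    virtual ~Node() {}
    virtual std::string kind() const = 0;
};

class Element : public virtual Node {
public:
    virtual std::string getTagName() const = 0;
};

class SaxNode : public virtual Node {
public:
    std::string kind() const override { return "SaxNode"; }
};

class SaxElement : public Element, public SaxNode {
public:
    std::string kind() const override { return "SaxElement"; }
    std::string getTagName() const override { return "para"; }
};

// Hypothetical helper: the tag name behind a Node*, or "" if it is not a SaxElement.
std::string tagNameOf(Node* n) {
    // static_cast<SaxElement*>(n) or (SaxElement*) n would be rejected here,
    // because Node is a virtual base of SaxElement.
    SaxElement* e = dynamic_cast<SaxElement*>(n);
    return e ? e->getTagName() : std::string();
}

// Small usage check; call it from main().
void castDemo() {
    SaxElement se;
    Node* asNode = &se;                    // upcast to the virtual base is implicit
    assert(tagNameOf(asNode) == "para");   // downcast succeeds at run time
    SaxNode plain;
    assert(tagNameOf(&plain).empty());     // not a SaxElement: dynamic_cast yields null
}
```

dynamic_cast returns a null pointer when the object is not of the
requested type, so the result should be checked before use, as tagNameOf
does.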




From fasthand at bigfoot.com  Thu Feb 12 22:05:30 1998
From: fasthand at bigfoot.com (fasthand@bigfoot.com)
Date: Mon Jun  7 17:00:08 2004
Subject: ANN: ezDTD 1.1   DTD editor/Generator/Formatter
In-Reply-To: <199802111942.LAA23645@boethius.eng.sun.com>
Message-ID: <199802122202.QAA18814@cotton.vislab.olemiss.edu>

ezDTD v1.1      DTD Editor/Generator/Formatter
----------------------------------------------------------------

  First of all, please forgive me if you receive this mail more than once. I
  started ezDTD a month ago. Some of you have tried it and given me very
  valuable suggestions. The latest ezDTD is v1.1. You can find it at

  http://www.geocities.com/SiliconValley/Haven/2638/ezDTD.htm


o Why create ezDTD?

  As a handy tool, ezDTD can help you to:

  1. Quickly jump from one element to another.
  2. Complete your typing by filling in keywords such as ANY,
     EMPTY, #IMPLIED, etc.
  3. Export an HTML-format DTD file which has internal links
     among elements. Since this version of ezDTD can import an
     existing DTD, you can use it to create an HTML-format
     document for an existing DTD as well.

o What's new?

  Version 1.1 (1998-02-12)
 - Modified parts of the interface.
 - You can import a DTD file, as long as it does not have too
          complex a comment structure.
 - Supports Start Tag and End Tag definition.
 - Exports the DTD in either SGML or XML fashion (with or without
          minimization).
 - Corrected the included example file appraisal.edz, which did not
          explain itself clearly enough.

o Download

  Please check out 
  http://www.geocities.com/SiliconValley/Haven/2638/ezDTD.htm

Thanks for your time

Duncan Chen
fasthand@bigfoot.com

___________________________________

Duncan Chen                       
fasthand@bigfoot.com             
FNC, Inc.




From hcheung at parc.xerox.com  Thu Feb 12 23:50:42 1998
From: hcheung at parc.xerox.com (Harry Cheung)
Date: Mon Jun  7 17:00:08 2004
Subject: Microsoft's XML parser...
Message-ID: <01BD37CD.E9A0E120.hcheung@parc.xerox.com>

I'm using Microsoft's Java XML parser (1.8) to
generate XML, and I've run into a hitch.  In building an XML document,
I construct it using the object model, adding child
elements, etc.  Now, I need to grab an XMLOutputStream from it so
that I may send it on a FileInputStream.  However, when I call the
"save" method of Document, the XMLOutputStream returned doesn't
deal with the namespaces and, as a result, causes a parse failure when
I try to parse the generated file.  Here's the main section of the code:

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
XMLOutputStream xmlstream = doc.createOutputStream(outputStream);
doc.save(xmlstream);
System.err.println("XMLDocument:\n" + new String(outputStream.toByteArray()));

doc is an instance of Document, and the output that I print doesn't have the
namespaces substituted.  Now, am I going about this all wrong?
Am I missing something?

Harry Cheung
hcheung@parc.xerox.com



From pierlou at CAM.ORG  Fri Feb 13 00:33:14 1998
From: pierlou at CAM.ORG (Pierre)
Date: Mon Jun  7 17:00:08 2004
Subject: ANN: Database and EcmaScript support
Message-ID: <01bd3816$1b3228f0$02dcdcdc@pierre>

Prototype now supports access to JDBC databases and scripting with an EcmaScript interpreter.
Load a JFC table or tree with a simple XML declaration.

Look at the Database page and the Scripting page.
http://www.cam.org/~pierlou/prototype


Thanks

Pierre Morel 
pierlou@cam.org 

From dgd at cs.bu.edu  Fri Feb 13 19:16:52 1998
From: dgd at cs.bu.edu (David G. Durand)
Date: Mon Jun  7 17:00:08 2004
Subject: Empty elements
In-Reply-To: <199802121436.GAA00313@unready.microstar.com>
References: 
 <Pine.LNX.3.96.980212145142.22601A-100000@sipulipaasi.cs.Helsinki.FI>
 <003d01bd3728$549089a0$2ee044c6@donpark>
 <Pine.LNX.3.96.980212145142.22601A-100000@sipulipaasi.cs.Helsinki.FI>
Message-ID: <v03007804b10a46ab564d@[128.148.37.19]>

>  [snip]
>In other words, XML processors may (and should) treat
>
>  <br></br>
>
>and
>
>  <br/>
>
>as equivalent, but document authors might want to make the distinction
>so that pre-WebSGML SGML parsers can handle their documents.

Some of us think this was a significant reduction in the power of XML to
represent useful information (the difference between an element that marks
a point phenomenon, and one that is empty because it just doesn't have any
content).

>That begs the question of the processor's information set, however --
>a processor designed for use with repositories or with editors, for
>example, needs to preserve lexical as well as structural information
>about the XML document, such as comments, general entity references
>(even within attribute values), specified vs. defaulted attribute
>values, CDATA sections, whitespace within tags, etc.

It's not possible to write _valid_ SGML document instances without this
information, something not true of comments, DTD, info, or the other
lexical information.

I think the EMPTY declaration status, and the lexical form of the element
occurrence are useful for that practical reason alone.

>SAX as it currently stands is not designed to preserve most lexical
>information; in the future, we may devise a SAX level-2 to return this
>information, but since most applications that need it will probably
>use a DOM anyway, the demand may not be strong enough.

This information is more than purely lexical, which is why it should be in
there...

  -- David

_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://www.dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________




From lex at www.copsol.com  Fri Feb 13 23:07:56 1998
From: lex at www.copsol.com (Alex Milowski)
Date: Mon Jun  7 17:00:08 2004
Subject: ANNOUNCE: DAE SDK Beta 2 Released
Message-ID: <199802132304.RAA25660@copsol.com>


PRESS RELEASE:

Copernican Solutions Releases the beta 2 of the DAE SDK and DAE Server 
Software.

NEW IN THIS RELEASE

* A new XML 1.0 Well-formed processor.

* Faster style application once the style is loaded.

* Faster SDQL execution.

* An update to the Scheme environment supporting JIT compilers and
  Java 1.1 readers.

* Updates for using the DAE SDK with a JIT compiler.

* Element ID attribute support for SGML.

* All demos are now XML-based (see the JSPI for SGML demos).

* A new package (COM.copsol.tools.html) was added for writing HTML groves.

* Updates for XMLWriter to support writing valid XML.


PRODUCT DESCRIPTION:

DAE (Document Application Environment) is a Java-based SDK for processing
XML documents.  The foundation of this SDK is the DSSSL Developer's Toolkit
developed at Copernican Solutions.  This toolkit is based on a componentized 
design allowing different technology components to be substituted in the DAE 
environment without affecting the other components.

The DAE currently supports: DSSSL SDQL, DSSSL Style Language, DSSSL Groves,
XML processing, and Scheme or Java Programming.

In addition, the DAE has been integrated into a Java-based web server product
called the DAE Server.  This product allows development of web-based DAE
applications.

An add-on component called the JSPI (Java SGML Parsing Interface) provides
parsing and grove generation for SGML documents using a native component.


LICENSING 

We strongly believe that DSSSL, SGML, and XML technology in Java is fundamental
technology.  In light of this, we have restructured our licensing policy
to ensure that the right kind of technology--especially Java-based
technology--is available for use and experimentation as well as for developing
commercial products.

This policy also allows us to work with "Development Partners" ensuring our
technology or what results from working with these Development Partners is
available in a majority of web application environments.  Development Partners
benefit from a close development relationship, support, and immediate access to 
technology updates.


PRICE
  Non-Commercial Use         - Free
  Internal Commercial Use    - Free
  Commercial Re-distribution - Requires a Development Partner Agreement.


DOWNLOAD

The DAE SDK and DAE Server software are now available for download at:

   http://www.copsol.com/


CONTACT
  Copernican Solutions Incorporated
  http://www.copsol.com
  sales@copsol.com

==============================================================================
R. Alexander Milowski     http://www.copsol.com/   alex@copsol.com
Copernican Solutions Incorporated                  (612) 379 - 3608



From peter at ursus.demon.co.uk  Sat Feb 14 14:06:21 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:08 2004
Subject: JUMBO-PLAY et. al.
Message-ID: <3.0.1.16.19980214135027.63dfb77c@pop3.demon.co.uk>

I have prepared an *alpha* version of JUMBO-PLAY (for browsing, navigating
and transforming Shakespeare PLAYs conforming to Jon Bosak's PLAY.dtd) at:

http://www.nottingham.ac.uk/~pazpmr/jpl9802a.zip
and the latest version of JUMBO-CORE at:
http://www.nottingham.ac.uk/~pazpmr/jum9802.zip

Installation instructions are inside the can.

This is *not* a permanent URL; the release is alpha and I'd be grateful for
installation feedback in the first instance. [I am assuming that most
people who are interested have been able to *load* the latest version of
JUMBO-CORE since I have had little negative feedback.] 

JUMBO-PLAY is the first of a series of JUMBO-* extensions to JUMBO-CORE.
JUMBO-* extensions allow people to customise their own application round
JUMBO by subclassing JUMBO elements/classes. A more detailed account will
follow in the final release but in general:
	- JUMBO-FOO allows you to write per-element classes. An element FOO:BAR in
XML can have a class jumbo.foo.BAR.java
	- JUMBO-FOO maps elements onto Java using namespaces and schemas.
(JUMBO-PLAY transforms the original documents into a PLAY: namespace). Each
element *may*, but need not, be mapped to a Java class.
	- unmapped elements inherit 'reasonable' behaviour from JUMBO-CORE. Thus
an element with a single PCDATA element as content will display this as a
name-value pair. An element with element content will display this as a
tree. An element with mixed content will display this as a tagged or
untagged event stream. An element which contains little chunks of
whitespace will do interesting things.
	- mapped elements can be customised for display, data entry, and real-time
interaction limited only by your programming ability and imagination
	- in its simplest form the namespace schema maps elements onto classes,
but it may also customise the semantics of those elements through
additional information in the XML-based schema. Each element can have an XML
file customising its semantics. (JUMBO-PLAY does not use this facility).
	- JUMBO-FOO allows a document to be broken up (through a SAX-based parser)
into entities. JUMBO-PLAY shows two examples of this.

** PLEASE NOTE THAT JON BOSAK ASKS THAT THE SHAKESPEARE DISTRIBUTION BE
KEPT INTACT, SO NO PLAY FILES ARE INCLUDED IN THE DISTRIBUTION. YOU WILL
NEED TO DOWNLOAD THE DISTRIBUTION YOURSELF AND RUN jumbo.play.SAXSplit ON IT
TO PRODUCE INPUT FOR JUMBO-PLAY. FILES TRANSFORMED BY JUMBO-PLAY SHOULD NOT
BE REDISTRIBUTED. **

Like everyone else I thank Jon for this resource. It's worth noting that
the markup in PLAY is so useful as it stands that there is little point in
using XML tools simply to re-render it :-). JUMBO-PLAY adds the ability to
run TEI-like queries and to write indexing and analysis code - I shall add
some amateurish attempts at the latter in later versions.

	P.

[Please note that you should be able to run JUMBO-CORE before moving to
JUMBO-PLAY. I intend to use this modular form of distribution since large
files often break on download].
	
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From jjc at jclark.com  Sun Feb 15 04:09:20 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:08 2004
Subject: New SP/Jade test release
Message-ID: <34E6699E.F79CEAF@jclark.com>

A new test release of SP and Jade is now available from:

  ftp://ftp.jclark.com/pub/test/jade.zip

Win32 binaries are available from:

  ftp://ftp.jclark.com/pub/test/jadew.zip

This is SP version 1.2.92 and Jade version 1.0.93.

In SP the main change since 1.2.91 is better support for XML based on
the final WebSGML Adaptations Annex.  There's documentation on this in
xml.htm. Also the SX application has been merged in.

In Jade the main change since 1.0.92 is in the FOT backend. The FOT file
is now well-formed XML. It has also been changed to make it closer to
the action part of an XSL style-sheet.  The hyperlinking information is
also represented in a more straightforward way.  The idea is to make it
practical both to have new backends that work from the FOT file and to
have other programs that generate an FOT file. (Eventually I would like
to make the existing backends be able to take input from an FOT file as
well as directly from Jade.)

Note that I'm no longer distributing non-Unicode Win32 binaries, so
the SP Win32 executables now have Unicode support but no longer carry a
"u" suffix.

James



From trevort at za.ibm.com  Mon Feb 16 12:23:21 1998
From: trevort at za.ibm.com (Trevor Turton)
Date: Mon Jun  7 17:00:09 2004
Subject: DTD meta data for XML viewers
Message-ID: <5060200011317159000002L092*@MHS>

(this is a retransmit - the previous version was truncated)

In an earlier note I spoke about the need to associate compose-time
meta data with DTDs to allow XML editors to assist with the process of
document composition.  Another major class of meta data that needs to
be associated with DTDs is information on how the associated XML may be
rendered - most usefully, by identifying programs which can perform the
required rendering.  The classic browser renders HTML on computer
screens, and also on the printed page.  The same will be required of
XML browsers, and some will also render XML documents to voice for the
visually impaired, to braille for the more profoundly impaired, and to
other media as new needs and technologies arise.  Hence we may need to
associate a list of rendering programs with any given DTD, covering the
various media types supported.

Current browsers attempt to render "all" HTML tags, but HTML is a
moving target.  Browsers are already very large, and need to be
supplemented by plug-ins to handle various MIME types.  Once XML is
generally available, we must expect a proliferation of DTDs by various
parties for varied purposes.  While a single generic XML editor may do
a good job of checking the syntax of documents developed against these
DTDs, it cannot render them.  Realistically, creators of novel DTDs
will have to create code that renders them.  And once a useful body of
DTDs has been developed, authors of XML documents will want to use DTDs
from many different independent sources in a single document.

No browser manufacturer will be able to bundle rendering code for all
DTDs into a single product.  We need to define a standard way in which
browsers can obtain and use code to render DTDs that were not even
invented when the browser was created.  This code distribution
mechanism may be something between a plug-in, which requires manual
intervention to install and persists after use, and a Java applet,
which installs automatically and is discarded after use.  For the
purpose of this note I will call them "renderlets".  We need to arrive
at a standard way of associating renderlets with DTDs as meta data, so
that browsers that encounter a DTD for the first time can find and
obtain the code required to render the XML defined by the DTD.

Some vendors may wish to create platform-specific and even
browser-specific renderlets, giving rise to the need to associate a
list of different renderlets with a given DTD.  Most authors of DTDs
will want to implement only a single renderlet to save effort.  This
would have to be platform and browser independent, and Java is the
obvious choice.  We need a standard way for Java renderlets to
interface with the browser that invokes them.  And since XML entities
may be imbedded in other independently created XML entities, renderlets
must also implement the same standard interface when they invoke
imbedded renderlets.  This interface will have to be richer than the
current spartan <APPLET> tag, which makes an unconditional demand for
display space.  Any given XML document may in the future be rendered on
anything from an IMAX screen to a Dick Tracy style watchtop display.
Renderlets will have to share the space available.  The browser will
have to sum the space demanded by the renderlets it hosts and compute
an overall compression factor.  It will have to communicate the
compression factor back to the various renderlets so they can make an
informed decision about the level of detail that they display - if any.
The renderlet interface will need to include a specialised Java
LayoutManager to facilitate layout and space negotiation.
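The space negotiation described above could look something like the following. This is a minimal sketch in Python rather than Java, under stated assumptions: space is a single number per renderlet, the browser scales all demands by one common factor, and the function names and detail thresholds are invented for illustration. None of this is an existing browser or applet interface.

```python
def compression_factor(demands, available):
    """Ratio by which every renderlet's demand must shrink to fit.

    demands: dict of renderlet name -> space requested
    available: total space the display can offer
    Returns 1.0 when everything already fits.
    """
    total = sum(demands.values())
    if total <= available:
        return 1.0
    return available / total


def allocate(demands, available):
    """Browser side: compute the factor and each renderlet's share."""
    factor = compression_factor(demands, available)
    return factor, {name: d * factor for name, d in demands.items()}


def detail_level(factor):
    """Renderlet side: choose how much to show for a given factor.

    The thresholds are arbitrary illustrative values.
    """
    if factor >= 1.0:
        return "full"
    if factor >= 0.5:
        return "summary"
    if factor > 0.0:
        return "icon"
    return "hidden"
```

A renderlet that hosts imbedded renderlets could call the same allocate function on its own share, which is one way to read the requirement that renderlets implement the same interface they are invoked through.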

Given this approach, the browser itself can become a fairly small and
simple shell, with all XML elements implemented by downloadable
renderlets.  A cottage industry in renderlets may emerge, paralleling
the VBX and OCX industry that Visual Basic spawned.  Competing versions
of renderlets for commonly used DTDs will arise, and browser owners
will be able to shop around for the renderlets that best meet their
needs.

If there are concerns that fetching renderlets will generate excessive
network activity and make XML browsers too slow to use, browsers (and
http proxy servers) could be enhanced to allow priming of their cache
directories.  Pointers to local copies of often-used renderlets (and
DTDs for that matter) could be loaded into the browser's (proxy
server's) cache directory upon initialisation.

Trevor Turton



From richard at cogsci.ed.ac.uk  Mon Feb 16 16:20:58 1998
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:00:09 2004
Subject: New version of RXP
Message-ID: <29876.199802161620@cockburn.cogsci.ed.ac.uk>

There is a new (still not for public consumption) version of RXP at

   ftp://ftp.cogsci.ed.ac.uk/pub/richard/rxp.tar.gz

RXP is a non-validating XML parser in C, with support for UTF-8, UTF-16,
and ISO-8859-1 character encodings.

It now (when run in strict XML-checking mode "rxp -x") finds all the
well-formedness errors in James Clark's test suite, except those where
the error is in the content model of an element declaration (this is a
bug, and will eventually be fixed).

Please report bugs to me.

-- Richard



From Jon.Bosak at eng.Sun.COM  Mon Feb 16 19:32:11 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun  7 17:00:09 2004
Subject: Fwd: ANN: New XSL Mailing List
Message-ID: <199802161930.LAA25676@boethius.eng.sun.com>

 From: tgraham@mulberrytech.com (Tony Graham)
 Date: Mon, 16 Feb 1998 06:09:17 GMT
 Newsgroups: comp.text.sgml
 Subject: ANN: New XSL Mailing List

 Mulberry Technologies announces the availability of XSL-List, the open
 forum for discussion of XSL (Extensible Style Language).

 To subscribe to XSL-List, send mail to majordomo@mulberrytech.com with
 "subscribe xsl-list" as the body of your message.  For more
 information, see http://www.mulberrytech.com/xsl/xsl-list.

 XSL-List will host discussion of XSL itself, XSL applications and
 implementation, and XSL user questions.  XSL-List is open to everyone,
 users and developers, experts and novices alike.  There is no
 restriction on what may be posted on the XSL-List provided it is
 related to XSL.

 XSL-List is not a W3C mailing list nor is it affiliated with W3C or
 any other organization.  XSL-List has no official standing with any
 organization and XSL-List subscribers do not constitute a Special
 Interest Group. However, XSL-List was established with the
 encouragement of members of the W3C XSL Working Group, and members of
 the Working Group will be among the subscribers to the list.

 XSL-List is provided by Mulberry Technologies as a service to the
 XSL user community and the XSL standardization effort.

 Only subscribers can post to XSL-List, but since the goal is to
 increase the level of XSL knowledge, XSL-List is being archived on
 Mulberry's web site for everybody to view.  The topics being discussed
 on the XSL-List change as new ideas arise or existing problems are
 dealt with, but the archive contains all of the ideas and solutions
 that have been discussed on the list.

 Regards,


 Tony Graham

 =======================================================================
 Tony Graham
 Mulberry Technologies, Inc.                         Phone: 301-315-9632
 17 West Jefferson Street, Suite 207                 Fax:   301-315-8285
 Rockville, MD USA 20850                 email: tgraham@mulberrytech.com
 =======================================================================



From crism at ora.com  Tue Feb 17 15:20:00 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun  7 17:00:09 2004
Subject: DTD meta data for XML viewers
In-Reply-To: <5060200011317159000002L092*@MHS> (message from Trevor Turton on
	Mon, 16 Feb 1998 12:21:55 +0000)
Message-ID: <199802171524.KAA03175@geode.ora.com>

[Trevor Turton]
> Another major class of meta data that needs to be associated with
> DTDs is information on how the associated XML may be rendered - most
> usefully, by identifying programs which can perform the required
> rendering.

One word: Stylesheets.

One URL: <URL:http://www.w3.org/Style/XSL/>.

HTH; HAND.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From tony.stewart at rivcom.com  Tue Feb 17 15:28:58 1998
From: tony.stewart at rivcom.com (Tony Stewart)
Date: Mon Jun  7 17:00:09 2004
Subject: DTD meta data for XML viewers
Message-ID: <4955E202FE46D11195C500609712EB6B06AA2F@FLPS-NTSERVER1>

Trevor Turton wrote:

> Another major class of meta data that needs to be associated with
> DTDs is information on how the associated XML may be rendered - most
> usefully, by identifying programs which can perform the required
> rendering.  The classic browser renders HTML on computer screens, and
> also on the printed page.  The same will be required of XML browsers,
> and some will also render XML documents to voice for the visually
> impaired, to braille for the more profoundly impaired, and to other
> media as new needs and technologies arise.  Hence we may need to
> associate a list of rendering programs with any given DTD, covering
> the various media types supported.
>
> Once XML is generally available, we must expect a proliferation of
> DTDs by various parties for varied purposes.  While a single generic
> XML editor may do a good job of checking the syntax of documents
> developed against these DTDs, it cannot render them.  Realistically,
> creators of novel DTDs will have to create code that renders them.
> And once a useful body of DTDs has been developed, authors of XML
> documents will want to use DTDs from many different independent
> sources in a single document.

I think these issues are real, but belong in the domain of the XSL
discussion. Style, presentation and behavior are all aspects of the same
thing: the transformation of the data/information into another, usually
human-understandable form. (I say "usually" because you could use a
style sheet to transform the information into a different
computer-interpretable form without ever presenting it to a human
being.) XSL can and should provide the tools that allow us to specify
what transformation should be applied to our XML data, including taking
into account issues like user impairment, new media, etc. It is the
means by which we associate multiple possible presentations--and the
mechanisms for choosing between them in a given presentational
instance--with data encoded according to a single DTD or schema.

Having said that, yes, we will need to implement robust presentational
mechanisms in (preferably) thin clients, so all of these technical
issues need to be addressed. But let's make sure that XSL stays in the
loop.

Tony

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Tony Stewart
Director of Consulting, RivCom
"Publishing Structured Information"
New York, NY, USA and Swindon, UK
Direct:	+1 (212) 222-4332
Office:	+1 (212) 662-6800	
Fax:	+1 (212) 662-6900	
UK Tel:	+44 1793 790 802
UK Fax: 	+44 1793 790 812
Email:	tony.stewart@rivcom.com
Web:	www.rivcom.com
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



From peter at ursus.demon.co.uk  Tue Feb 17 23:06:37 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces 
Message-ID: <3.0.1.16.19980217215940.0c6f0018@pop3.demon.co.uk>

Posted on behalf of John Petit

>
>------------------------------------------
>This is a question about how the search scenario will play out on the
>web once XML becomes widely implemented. I have not seen this
>articulated in any of the specifications or articles on the web thus
>far. In lieu of that, I have imagined how it might work. I would like
>some feedback. Am I way off base?  Naturally the answer will have a big
>impact on the design of search engines and other services that I am
>creating.
>
>As particular industries and special interests standardize on their
>respective DTDs, Internet search engines will have to allow users to
>search by specific elements contained in those documents. In the typical
>search scenario, a user would use one of the major search services such
>as AltaVista or Yahoo. Let's say the user wanted to search across real
>estate listings, and these listings all used the same DTD. It seems that
>independent search engines need to interpret the DTD for a class of
>documents and present a query interface based on that DTD. The question
>is: how is the search engine to interpret the DTD and build an
>intelligent interface based on that DTD? Simply listing every element in
>the DTD is one approach, but an ugly one. Many DTDs will contain
>numerous elements which would only clutter and confuse a search
>interface.
>
>One solution may be to use DTD attributes to cue the search engines.
>Perhaps a "LEVEL" attribute could cue the searchers to display
>interfaces to predefined levels. The example below shows that the
>"LEVEL" attribute means that the "numbeds" element should always appear
>in a search query, or at the top level of searches. Any elements that
>did not have this level 1 attribute would not be shown in the search
>interface. If the "LEVEL" attribute was not found in the DTD, the
>default would show all of the elements with search fields next to them.
>
><!ELEMENT numbeds (#PCDATA)>
><!ATTLIST numbeds
>    XML-SQLTYPE CDATA #FIXED "INTEGER"
>    SNAME CDATA #FIXED "Number of beds"
>    LEVEL CDATA #FIXED "1">
>
>Search engines, upon seeing the "LEVEL" attribute, would configure their
>interface to have an "Additional Elements" button that would show the
>next level of elements. This would have the effect of shielding the user
>from an overwhelming mass of searchable elements.  Perhaps these
>mechanisms are in place, but I just do not see them.
>
>Another useful attribute would describe the "shown name" for a
>particular element. Element tags may not have as descriptive a name as
>they should in the DTD itself. For example, having "numbeds" appear in
>the user search interface would not be very user friendly. A much more
>descriptive string would be "Number of beds."
>
>The "XML-SQLTYPE" attribute indicates that "numbeds" is an integer. This
>is a form of strong typing that was described at one time by Tim Bray. I
>also do not know the status of strong typing in XML, but strong typing
>would sure be useful in this situation. If a search engine knows that a
>field is going to be a number, then the engine can provide optional
>number manipulations. Such useful operations may be determining price
>ranges, or in this case, a range for the number of bedrooms. Otherwise,
>how will an independent search engine or agent know that a particular
>field can be ranged and mathematically manipulated?
>
>I certainly do not think that these attributes should be mandatory, but
>it seems that there should be an agreed upon method of DTD construction
>that would give clues to search engines. I am clearly not an expert in
>this area, but I have not seen a solution to this in the XML proposals
>published thus far. Does anyone have an answer for this?
>
>
>
>
>
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From ak117 at freenet.carleton.ca  Wed Feb 18 01:02:06 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces 
In-Reply-To: <3.0.1.16.19980217215940.0c6f0018@pop3.demon.co.uk>
References: <3.0.1.16.19980217215940.0c6f0018@pop3.demon.co.uk>
Message-ID: <199802180100.UAA00316@unready.microstar.com>

Peter Murray-Rust writes:

 > Posted on behalf of John Petit

 > >One solution may be to use DTD attributes to cue the search engines.
 > >Perhaps a "LEVEL" attribute could cue the searchers to display
 > >interfaces to predefined levels. The example below shows that the
 > >"LEVEL" attribute means that the "numbeds" element should always appear
 > >in a search query, or at the top level of searches. Any elements that
 > >did not have this level 1 attribute would not be shown in the search
 > >interface. If the "LEVEL" attribute was not found in the DTD, the
 > >default would show all of the elements with search fields next to them.
 > >
 > ><!ELEMENT numbeds (#PCDATA)>
 > ><!ATTLIST numbeds
 > >    XML-SQLTYPE CDATA #FIXED "INTEGER"
 > >    SNAME CDATA #FIXED "Number of beds"
 > >    LEVEL CDATA #FIXED "1">

You could generalise this idea so that, instead of giving the level,
you gave the element type name in a different (real or hypothetical)
DTD.  For example,

  <!ELEMENT numbeds (#PCDATA)>
  <!ATTLIST numbeds
    general-doc NMTOKEN #FIXED "div1">

In other words, you're saying that the 'numbeds' element corresponds
to 'div1' (first-level division) in the other DTD.  This is more
useful, because you can express richer relationships than simply the
level.  For example, you could specify that 'expletive-deleted' is a
type of emphasised phrase and that 'city' is a type of name:

  <!ELEMENT expletive-deleted (#PCDATA)>
  <!ATTLIST expletive-deleted
    general-doc NMTOKEN #FIXED "emphasis">

  <!ELEMENT city (#PCDATA)>
  <!ATTLIST city
    general-doc NMTOKEN #FIXED "name">

That way, a user could search for any type of emphasised phrase or
proper noun, no matter what the element type was named. 
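A search along these lines might be sketched as below. This is only an illustration: the GENERAL_DOC table is hand-copied from the #FIXED values in the ATTLIST declarations above (Python's ElementTree does not read attribute defaults from a DTD, so the mapping is supplied directly), and the function name and sample document are invented.

```python
import xml.etree.ElementTree as ET

# Element type -> corresponding type in the general DTD, copied by hand
# from the general-doc #FIXED attributes declared above.
GENERAL_DOC = {
    "numbeds": "div1",
    "expletive-deleted": "emphasis",
    "city": "name",
}


def find_by_general_type(root, general_type):
    """Return (element name, text) for every element whose general-doc
    mapping equals general_type, whatever the element is called."""
    return [
        (el.tag, (el.text or "").strip())
        for el in root.iter()
        if GENERAL_DOC.get(el.tag) == general_type
    ]


# Illustrative document using the element types from the examples.
SAMPLE = (
    "<listing><city>Ottawa</city><numbeds>3</numbeds>"
    "<note>quiet <expletive-deleted>very</expletive-deleted> street</note>"
    "</listing>"
)
```

Calling find_by_general_type(ET.fromstring(SAMPLE), "name") would pick out the city element even though the caller never mentions 'city' by name.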


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From sudar at pspl.co.in  Wed Feb 18 04:43:27 1998
From: sudar at pspl.co.in (Sudarshan Purohit)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces
Message-ID: <34EA6699.38FF@pspl.co.in>

Peter Murray-Rust wrote :
> >.....
> >One solution may be to use DTD attributes to cue the search engines.
> >Perhaps a "LEVEL" attribute could cue the searchers to display
> >interfaces to predefined levels. The example below shows that the
> >"LEVEL" attribute means that the "numbeds" element should always .....


I'm rather new to this, so it's possible that I'm thinking wrongly...
Anyhow, I'd like to add one more point to this idea:

When we say that the hotel is making its data available as XML on the
web, what it will actually be doing is translating the data in its
hotel management database into XML, almost certainly through some
Database-XML interface. This will have to be done at very frequent
intervals, in both directions, in order to show the latest status of,
say, bookings.

	But doing this requires a standardised mechanism to delineate
the XML elements as tables/columns/keys/other entities according to
their DBMS. I feel that this mechanism (say, having an attribute
listing this 'level') should be worked out so as to facilitate web
search engines as well.

	XML-Data gives the basic features in this respect, by allowing
keys, etc. This could be built upon.
	Any such XML should also be readable by other similar programs
(say, by the software used by a travel agent who adds this into his own
database) besides casual browsers.
	Does this sound reasonable?


Sudarshan Purohit		18th feb 98
PSPL,Pune, India		1010hrs.



From donpark at quake.net  Wed Feb 18 07:29:53 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces 
Message-ID: <002b01bd3c3e$4ca4ce70$2ee044c6@donpark>

I have a tendency to talk about things yet to happen as if I had seen them
happen, so I must first beg the reader to understand that what follows is
just one man's opinion.

>>As particular industries and special interests standardize on their
>>respective DTDs, Internet search engines will have to allow users to
>>search by specific elements contained in those documents. In the typical
>>search scenario, a user would use one of the major search services such
>>as AltaVista or Yahoo. Let's say the user wanted to search across real
>>estate listings, and these listings all used the same DTD. It seems that
>>independent search engines need to interpret the DTD for a class of
>>documents and present a query interface based on that DTD. The question
>>is: how is the search engine to interpret the DTD and build an
>>intelligent interface based on that DTD? Simply listing every element in
>>the DTD is one approach, but an ugly one. Many DTDs will contain
>>numerous elements which would only clutter and confuse a search
>>interface.

Standardized schemas will not be there for some time.  The effects of XML will
be felt by all major industries in the near future, and while there will be
sincere efforts to standardize DTDs in most of the markets, fiercely
competitive markets like the search service market will be slow in
standardizing schemas.  I expect another round of tag wars waged this time
by Yahoo, Excite, AltaVista, MS, etc.  The result will be different this
time in that everyone will agree to disagree in the end and move on to
building tools to bridge the differences in structures of contents which
would have accumulated beyond the point of standardizing.

A schema-based universal search interface will be dead on arrival.  While it
is possible to build such clients, search services that use them will lose
every time to services offering hand-crafted search interfaces designed to be
easy to use, relevantly flexible, and visually appealing.

Improved accuracy of search results, brought on by wide availability of
XML-based contents, will be lost on most users.  Consumers simply do not
care as long as they can find what they want among the first 100 items
returned by a search.  Search services are free, after all, so consumers do
not place high expectations on them.

What consumers will care mostly about is the 'freshness' of search results.
All of the widely used search services are currently selling stale
information, a lot of it damaged goods.  There is not much demand for
freshness now but the need will rise dramatically along with the growth of
e-commerce.  XML will bring on new search services which broadcast search
requests to hundreds or thousands of 'datasites' to get the freshest goods.
It will take tools to build datasites and applications to create contents
for the datasites.  It is not hard to guess who will be the major player in
the next generation of search services.

What I see happening is a proliferation of custom DTDs designed around the
contents.  Amazon will not want to throw out some information just so they
can use some standard DTD.  That would be like chopping your arms off just
so you fit the standard-size coffin.  Amazon will use a custom DTD designed
to hold all of their valuable contents including book reviews.  They will
offer some, and definitely not all, layers of the contents to search
services by dynamically mapping their DTD to the search service's DTD.  In
other words, the DTD used to store content will not necessarily be the same
as the DTD used to transfer it.
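To make the storage-versus-transfer distinction concrete, here is a small sketch of such a mapping. Everything in it is hypothetical: the element names (book, title, review, item, creator) and the FIELD_MAP table are invented for illustration and do not describe any real site's DTD.

```python
import xml.etree.ElementTree as ET

# Storage element -> transfer element. Storage elements missing from
# this table (here, reviews) are deliberately not exposed.
FIELD_MAP = {"title": "name", "author": "creator", "price": "price"}


def to_transfer(record):
    """Project a storage record onto the (hypothetical) transfer DTD."""
    out = ET.Element("item")
    for child in record:
        target = FIELD_MAP.get(child.tag)
        if target is not None:
            ET.SubElement(out, target).text = child.text
    return out


# A record as it might be held internally, including a layer that
# stays private.
STORED = ET.fromstring(
    "<book><title>Dune</title><author>Herbert</author>"
    "<review>Great</review><price>9.99</price></book>"
)
```

Serialising to_transfer(STORED) yields an item with name, creator and price children only; the review never leaves the site.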

It is sad to think so but we will also see more and more contents moving
behind protection.  XML makes 'data-spies', 'data-pirates', and
'data-chop-shops' possible.  You will see 'hot-data' detective robots
roaming the net to check whether any piece of a site's data was lifted from
their clients' data, detected through intentional mangling of words and
images with hidden signatures.

I hope I did not upset everyone with my 'it sure is obvious to me' attitude.
My sole intention is to help the XML community.  If I make some money along
the way, I can live with it.  I think <g>.

Sincerely,

Don Park
http://www.quake.net/~donpark/index.html




From richard at light.demon.co.uk  Wed Feb 18 09:23:48 1998
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces
In-Reply-To: <34E871B8.CF85BB5F@4thworldtele.com>
Message-ID: <XcUxaGACpp60EwLA@light.demon.co.uk>

In message <34E871B8.CF85BB5F@4thworldtele.com>, John Petit
<jpetit@4thworldtele.com> writes
>By the way, I read your book and found it very informative.

Thank you!

>This is a question about how the search scenario will play out on the
>web once XML becomes widely implemented. I have not seen this
>articulated in any of the specifications or articles on the web thus
>far. In lieu of that, I have imagined how it might work. I would like
>some feedback. Am I way off base?  Naturally the answer will have a big
>impact on the design of search engines and other services that I am
>creating.

John,

You have already had replies that:

- comment on the potential use of 'architecture'-type techniques for
harmonising the semantics of element types in different DTDs, and 
- point out that a suitably designed representation of relational data
in XML will allow SQL-type queries on data that is really a relational
wolf (?!) in XML sheep's clothing

The only thing I would add is that neither approach gives us a query
language for searching information sources that are genuinely XML, not
'relational-in-disguise'.  Peter M-R mentioned on XML-dev a couple of
months ago that he uses XLL expressions as a query language - this is
the only approach that is currently possible within the 'official' XML
world-view.  The SGML world has invented a very exhaustive query
language (SDQL, which lurks within the DSSSL standard) for full SGML
documents, but this is probably inappropriate for the XML world.  (One
weakness of SDQL is that it has a 'read-only' model of the document,
whereas SQL supports table creation and updating.  Depends what you want
from a query language.)

Richard Light.

Richard Light
SGML/XML and Museum Information Consultancy
richard@light.demon.co.uk


From M.H.Kay at eng.icl.co.uk  Wed Feb 18 14:17:18 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces 
Message-ID: <01bd3c78$12313a00$1e09e391@mhklaptop.bra01.icl.co.uk>

>>This is a question about how the search scenario will play out on the
>>web once XML becomes widely implemented

Some suggestions & predictions:

1. The "whole web" search services are not keeping pace with the growth of
the web; they are having to index more selectively and less often. There is
therefore increasing room for more specialised search services. There will
certainly be some that concentrate on a particular domain (say sports
results) and that get to understand the DTDs that are widespread in that
domain. This may in turn act as an incentive to the standardisation of
domain DTDs.

2. Search engines will probably start applying heuristics to the XML
structure even if they don't know the semantics of the DTD. This comes
naturally to software trying to extract information from raw text. For
example, tags with recognised names such as <TITLE> may raise the weighting
of the text contained therein; tags that contain small amounts of text may
be ranked more highly than tags containing most of the document.

3. Some conventional tags such as <META> may emerge and be used in a wide
range of DTDs if the search engines are known to apply special heuristics to
them. Other conventional tags, e.g. for personal names or places, may also
emerge.

4. The general public is only interested in doing simple searches. In more
specialist communities, query languages that allow the tagging to be
exploited will become available. Many search engines already have languages
that support "field-sensitive" searching and I think these can largely be
applied to XML without extension. Such queries only make sense within the
context of a single DTD or a family of closely-related DTDs. The
"navigational" query languages such as the XLL syntax or SDQL are too
precise and too complex for free text searching.

5. XML may start to become a vehicle for a site to publish an abstract of
itself. Search services, rather than indexing all the content of a site
(which is becoming unviable) will start to index the published abstracts of
sites, and having directed the enquirer towards a site, will then delegate
the within-site searching to a search engine at the site itself.
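The heuristics in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the tag names, the boost values, and the function name `term_weights` are hypothetical and are not taken from any real search engine.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical tuning values; nothing in the message above prescribes them.
RECOGNISED_TAGS = {"title": 3.0, "meta": 2.0}

def term_weights(xml_text):
    """Return a Counter mapping lower-cased words to heuristic weights.

    Only the text directly inside each element (before its first child)
    is considered; tail text is ignored to keep the sketch short.
    """
    root = ET.fromstring(xml_text)
    total_chars = sum(len(t) for t in root.itertext()) or 1
    weights = Counter()
    for elem in root.iter():
        text = (elem.text or "").strip()
        if not text:
            continue
        # Heuristic 1: tags with well-known names get a fixed boost.
        boost = RECOGNISED_TAGS.get(elem.tag.lower(), 1.0)
        # Heuristic 2: elements holding a small share of the document's
        # text count for more than elements holding most of it.
        share = len(text) / total_chars
        boost *= 2.0 - share
        for word in text.lower().split():
            weights[word] += boost
    return weights
```

No knowledge of the DTD is needed here: the weighting depends only on element names and on how much text each element holds.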

========================================================
By the way, does anyone know of a search engine (I mean software, not a web
service) that understands XML? I have been looking at writing an IFilter
interface for Microsoft's Index Server and it's rather daunting, especially
as MS will presumably produce one themselves within a year.
========================================================

Regards,

Mike Kay, ICL



From ak117 at freenet.carleton.ca  Wed Feb 18 15:40:53 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces 
In-Reply-To: <01bd3c78$12313a00$1e09e391@mhklaptop.bra01.icl.co.uk>
References: <01bd3c78$12313a00$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <199802181539.KAA00290@unready.microstar.com>

Michael Kay writes:

 > By the way, does anyone know of a search engine (I mean software,
 > not a web service) that understands XML? I have been looking at
 > writing an IFilter interface for Microsoft's Index Server and it's
 > rather daunting, especially as MS will presumably produce one
 > themselves within a year.

You could customise OpenText's LiveLink search to handle XML using
ranges -- the level of effort would depend on the skill and experience
of your programmers (anywhere from a couple of days to a couple of
months).


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/


From murata at apsdc.ksp.fujixerox.co.jp  Thu Feb 19 01:39:08 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:00:09 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <3.0.1.16.19980206082831.1157496c@pop3.demon.co.uk>
Message-ID: <199802190139.AA00252@murata.apsdc.ksp.fujixerox.co.jp>


In message "Re: Namespaces, Architectural Forms, and Sub-Documents", Peter Murray-Rust 
wrote...
> I hope that the "disgusting" refers to the use of 'img' and 'src' and the
> implied semantics rather than the mechanism :-).  I am an advocate of the
> *mechanism* (e.g
> http://www.vsms.nottingham.ac.uk/vsms/talks/chemwebvei/020.html) where I
> use XML-LINK explicitly to combine chemistry, maths and text. This has the
> advantage that it avoids namespace problems. It also allows me to process
> foreign files if certain assumptions are made.

I  think that your approach works.  Do you think that this is the way 
to go?  I.e., no namespace mechanisms but links only?  Or, do you think 
that it should be possible to convert the link-based representation to 
the namespace-based representation and vice versa?

Cheers,

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp


From eliot at isogen.com  Thu Feb 19 02:04:33 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun  7 17:00:09 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
Message-ID: <3.0.32.19980218195752.00be8650@swbell.net>

At 10:39 AM 2/19/98 +0900, MURATA Makoto wrote:
>
>In message "Re: Namespaces, Architectural Forms, and Sub-Documents", Peter
Murray-Rust 
>wrote...
>> I hope that the "disgusting" refers to the use of 'img' and 'src' and the
>> implied semantics rather than the mechanism :-).  I am an advocate of the
>> *mechanism* (e.g
>> http://www.vsms.nottingham.ac.uk/vsms/talks/chemwebvei/020.html) where I
>> use XML-LINK explicitly to combine chemistry, maths and text. This has the
>> advantage that it avoids namespace problems. It also allows me to process
>> foreign files if certain assumptions are made.
>
>I  think that your approach works.  Do you think that this is the way 
>to go?  I.e., no namespace mechanisms but links only?  Or, do you think 
>that it should be possible to convert the link-based representation to 
>the namespace-based representation and vice versa?

My vote is for the link-based approach (which in HyTime is provided by the
value reference facility, which lets you distinguish simple
use-by-reference from true hyperlinks).  A processor can always generate
new combined instances using whatever approach it cares to use to
disambiguate name clashes, including using name spaces.

Syntactic combination is ultimately limiting and largely unnecessary if you
can do your combining at the semantic level.  However, semantic-level
combination does have a cost because you can't necessarily depend on the
limitations of syntactic constraints to keep things simple.

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 95202.  214.953.0004
www.isogen.com
</Address>


From jeremie at netins.net  Thu Feb 19 06:44:13 1998
From: jeremie at netins.net (Jeremie Miller)
Date: Mon Jun  7 17:00:09 2004
Subject: Update: Xparse(JavaScript XML Parser)
Message-ID: <008a01bd3d01$9d82d260$2801a8c0@jeremie.dbqglass.com>

I've updated my JavaScript based XML parser at:
http://www.jeremie.com/Dev/XML/

I added lots of little support things and it now correctly supports all of
the goofy formatting things like:
<tag
  name
 =
"value" id       =
         "abc123"
>
the tags contents
</tag

   >
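As a cross-check that the formatting above really is well-formed XML, here is a minimal sketch that feeds the same text to a different parser. It assumes Python's standard `xml.etree.ElementTree` module, which is not Xparse; the variable name `GOOFY` is made up for the example.

```python
import xml.etree.ElementTree as ET

# The same oddly formatted element as in the message above: whitespace
# and line breaks around '=', and before the closing '>' of both tags.
GOOFY = '''<tag
  name
 =
"value" id       =
         "abc123"
>
the tags contents
</tag

   >'''

root = ET.fromstring(GOOFY)
print(root.tag, root.attrib, root.text.strip())
```

A conforming parser should report one element with two attributes, since XML allows whitespace around the `=` in an attribute and before the `>` of a start or end tag.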

It's basically done with the exception of full error reporting and
processing any DTD related information.  I'm waiting to add DOM support
before I attempt to tackle either of those, and the DOM support is waiting
for the release of the DOM ECMAScript Core API definitions (Appendix C in the
current WD).  So in the meantime I'm going to concentrate on my JavaScript
based XSL parser :)

If you're willing to wrap your XML data in a <TEXTAREA> or escape it into a
JavaScript string variable, Xparse is a slick and fast way to manipulate it
on the client end.

Comments/ideas welcome!

Jeremie Miller
jer@jeremie.com
http://www.jeremie.com/



From terje at in-progress.com  Thu Feb 19 08:14:56 1998
From: terje at in-progress.com (terje@in-progress.com)
Date: Mon Jun  7 17:00:09 2004
Subject: XPublish 1.0 candidate (XML website publishing system on Mac)
Message-ID: <b111961b02021004e8cc@[199.106.6.97]>

XPublish (read: "CrossPublish") is a complete Macintosh XML based website
publishing system that automatically generates HTML websites from XML
documents.

The application is currently being fine-tuned to verify that it follows the
final XML 1.0 specification. We now solicit feedback from XML-savvy users,
and offer a solid pre-release discount (in addition to our appreciation) to
those who take the time to check out the application and report any
inconsistencies.

You are invited to download the candidate of XPublish 1.0 from:

   http://interaction.in-progress.com/xpublish

XPublish supports efficient development and maintenance of websites with
XML. The built-in Cascading StyleSheets designer fosters a consistent look
& feel of the sites. The application's capabilities include rendering XML
into HTML with markup-emulated style sheets for older browsers that don't
support CSS, facilitating faster deployment of XML among webmasters and
demonstrating the processing power of XPublish. The distribution comes with
a tutorial that gives HTML authors a gentle introduction to XML markup.

Subscribe to the XPublish mailing list to receive updated information about
XML, XPublish and website publishing with XML. To join the mailing list,
send a subscription request that includes your name and email address to
<xpublish-request@in-progress.com>.

-- Terje <Terje@in-progress.com> | Media Design in*Progress

   C a s c a d e... a comprehensive Cascading Style Sheets editor for Mac
   XPublish - for efficient website publishing with XML
   Make your Web Site a Social Place with Interaction!

   Check out our web tools at <http://interaction.in-progress.com>



From ak117 at freenet.carleton.ca  Thu Feb 19 20:26:20 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:09 2004
Subject: Announcement: New PSGML-XML Additions
Message-ID: <199802192025.PAA00328@unready.microstar.com>

I've updated my XML patches for PSGML.  I've had very little time to
devote to this, but I've managed to make two important changes, at
least (the others are still in the queue):

1) Fixed a highlight-related PSGML bug that caused errors when there
   was a processing instruction before the DOCTYPE declaration (this
   is a big problem in XML, for obvious reasons).

2) Fixed PSGML's support for the `sgml-system-path' variable, and set
   the initial value of the variable automatically from the
   environment variable SGML_SEARCH_PATH (as used by NSGMLS), if
   present.

The second one turns out to be a very useful change.  If you do
something like

  (setq sgml-system-path '("." "/usr/local/lib/sgml/global"))

or (for NSGMLS support as well)

  export SGML_SEARCH_PATH
  SGML_SEARCH_PATH=".:/usr/local/lib/sgml/global"

and then put the file `spec.dtd' in /usr/local/lib/sgml/global, then
you can always reference that DTD with a relative URL as if it were in
the current directory (NSGMLS has always allowed this, but it wasn't
fully implemented in PSGML).  That means that

  <!DOCTYPE spec SYSTEM "spec.dtd">

works, and you no longer have to copy the DTD file into every
directory that uses it.  I've also fixed the parsing of environment
variables so that ';' can be the separator in DOS/Windows, though I
haven't tested that part yet.
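The lookup described above can be sketched as follows. This is not the PSGML or NSGMLS code; `resolve_system_id` is a hypothetical helper, and it assumes the search path is a plain list of directories joined by the platform's separator (':' on Unix, ';' on DOS/Windows).

```python
import os

def resolve_system_id(system_id, search_path=None, sep=os.pathsep):
    """Return the first existing file for a relative system identifier.

    Directories are tried in order; None is returned if no directory
    on the path contains the file.
    """
    if search_path is None:
        # Fall back to the environment variable named in the message.
        search_path = os.environ.get("SGML_SEARCH_PATH", ".")
    for directory in search_path.split(sep):
        candidate = os.path.join(directory or ".", system_id)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With `SGML_SEARCH_PATH=".:/usr/local/lib/sgml/global"`, a call such as `resolve_system_id("spec.dtd")` would try the current directory first and then the global one.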

You can download the patches from my home page,


  http://home.sprynet.com/sprynet/dmeggins/


Have fun!


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/


From Jon.Bosak at eng.Sun.COM  Fri Feb 20 00:33:50 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun  7 17:00:09 2004
Subject: Final XML conference schedule
Message-ID: <199802200031.QAA27394@boethius.eng.sun.com>

I've been asked to note that the final agenda for the XML Conference
is now available on the GCA Web site:

   www.gca.org/conf/xmlcon98/

Jon


From peter at ursus.demon.co.uk  Fri Feb 20 01:11:06 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:09 2004
Subject: Automating Search Interfaces"
Message-ID: <3.0.1.16.19980220010616.44af540a@pop3.demon.co.uk>

Forwarded from John Petit...

>Cheers, John Petit
>
>Title  "Re:Automating Search Interfaces""
>---------------------------------------------------------------------------
>
> Don Park writes:
>
>>>Standardized schemas will not be there for some time. Effects of XML
>>>will be felt by all major industries in the near future, and while
>>>there will be sincere efforts to standardize DTDs in most of the
>>>markets, fiercely competitive markets like the search service market
>>>will be slow in standardizing schemas. I expect another round of tag
>>>wars waged this time by Yahoo, Excite, AltaVista, MS, etc. The result
>>>will be different this time in that everyone will agree to disagree in
>>>the end and move on to building tools to bridge the differences in
>>>structures of contents which would have accumulated beyond the point
>>>of standardizing.
>
>I agree that this disheartening scenario is quite possible. But what a
>shame! It seems that one of XML's major strengths is its ability to
>search heterogeneous databases. Independent sellers large and small
>would benefit from heterogeneous searches for it would allow super
>accurate marketing. Mom and pop producers should be able to sell their
>boutique goods to the special set of consumers that would be interested.
>A real estate agent in Backwater USA with a unique property should be
>able to sell that product in an industry standard search engine.
>Without accurate, industry specific search interfaces, consumers will
>not easily find these sites. Otherwise we are no better off search-wise
>than we are today - wallowing in inaccurate searches. It would be a real
>shame if the ultimate promises of XML were hindered by lack of
>planning.  Laissez-faire is not always the best way.
>
>Perhaps what would help is to create a central repository for major
>industry DTDs. Such a repository may reduce the effects of splintering,
>and accelerate development. DTD authors could see what has come before
>them and either borrow from it or at least learn from it. I have always
>felt that such a site would be useful in DTD development. There are
>probably dozens of nascent DTD efforts going on in various industries.
>Each one is reinventing the wheel. In many cases these authors are describing
>the same element with different names when they could just as easily use
>the same name.
>
>Taking biological evolution as an analogy, putting the DTDs in one small
>pool will encourage faster and more sympathetic development. Otherwise,
>isolated cyber ecosystems will encourage divergent DTD evolution and
>this will lead to a long and vicious "survival of the fittest" scenario
>that will not benefit anyone.
>
>I cannot speak for Robin Cover but the SGML/XML Web Page seems like a
>good candidate for such a DTD repository.
>
>>>Schema-based universal search interface will be dead upon arrival.
>>>While it is possible to build such clients, search services that use
>>>them will lose every time to services offering hand-crafted search
>>>interfaces designed to be easy to use, relevantly flexible, and
>>>visually appealing.
>
>It is true that hand crafted search interfaces would be more polished,
>but who should be responsible for their creation? Is there some
>designated Java developer in the hotel industry that will make a search
>engine selflessly for the entire industry? No. If such work is relegated
>to the private companies then such search engines will not represent the
>entire industry in an unbiased way. This leaves nice, but proprietary
>search engines, and we are right back to where we started from; searches
>of privately selected databases rather than searches of heterogeneous,
>industry representative databases.
>
>>>Improved accuracy of search results, brought on by wide availability
>>>of XML-based contents, will be lost to most users. Consumers simply do
>>>not care as long as they can find what they want among first 100 items
>>>returned by a search. Search services are free after all and therefore
>>>do not place high expectations.
>
>I do not feel that consumers will not care about search accuracy. When a
>customer is looking for variations of Ginkgo Biloba (an over-the-counter
>drug) they want to see all the sites that sell it and for what price.
>The same is true for travelers looking for room availability at their
>travel destinations. No one wants to wade through a hundred tangentially
>related sites. Without accurate search interfaces, consumers will not
>get this sort of accurate response. The RDF is an important part of
>describing the web, but I have not seen how it would be the right way to
>address automating search interfaces.
>
>
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg


From peter at ursus.demon.co.uk  Fri Feb 20 01:15:12 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:09 2004
Subject: Namespaces, Architectural Forms, and Sub-Documents
In-Reply-To: <199802190139.AA00252@murata.apsdc.ksp.fujixerox.co.jp>
References: <3.0.1.16.19980206082831.1157496c@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980220010622.29d7d178@pop3.demon.co.uk>

At 10:39 19/02/98 +0900, MURATA Makoto wrote:
>
>In message "Re: Namespaces, Architectural Forms, and Sub-Documents", Peter
Murray-Rust 
>wrote...
>> I hope that the "disgusting" refers to the use of 'img' and 'src' and the
>> implied semantics rather than the mechanism :-).  I am an advocate of the
>> *mechanism* (e.g
>> http://www.vsms.nottingham.ac.uk/vsms/talks/chemwebvei/020.html) where I
>> use XML-LINK explicitly to combine chemistry, maths and text. This has the
>> advantage that it avoids namespace problems. It also allows me to process
>> foreign files if certain assumptions are made.
>
>I  think that your approach works.  Do you think that this is the way 

Thank you. I should perhaps make it clear that the diagram was slightly
hypothetical (i.e. not a screenshot from JUMBO; I did at one stage manage
to EMBED a molecule in an event stream but it wasn't stable). At present
JUMBO will manually deal with linked resources and treat them as separate
trees. NEW and REPLACE are easily catered for. EMBED is a problem, since it
has little meaning in a tree, and for a text event stream I am still deciding
on the best way to arrange flow objects for non-conventional objects (e.g.
maths, molecules, name-value pairs, etc.). Also the 'hypertext' support that
Java gives is hardly exciting.


>to go?  I.e., no namespace mechanisms but links only?  Or, do you think 
>that it should be possible to convert the link-based representation to 
>the namespace-based representation and vice versa?

[There is a current SIG/WG discussion on namespaces which I cannot publicly
comment on. My private view is that I shall wait-and-see what comes out;
from my point of view it's not trivial.]

I suspect that namespaces and links will co-exist.  I am certainly gently
tooling up for each of them. My little experiment with JUMBO-PLAY shows
both approaches. (Although only a single namespace is involved, I have
prefixed the output of my play.SAXSplit with PLAY:) 

The advantage of a single monolithic document is it's easier to traverse
(e.g. searches). Its disadvantage (for JUMBO) is that it can overflow the
JVM. The namespaces are explicitly expanded (i.e. every element name has a
namespace prefix). I would find scoping quite difficult until the rules are
VERY clear. It is very difficult to build a prototype system if one is not
sure what one *should* be doing. (This is distinct from not knowing what
one is doing, which is permanent :-). Certainty in the goal makes
programming about half an order of magnitude easier. Thus, for example, I
don't know 100% whether we shall have prefixes on attributes.

Note that one advantage of links is that what is hung on the end need not
originally be an XML document. I frequently parse legacy documents into
trees on demand. Maybe this could be managed by notation and embedded
'binary', but I don't understand that yet :-)

>
>Cheers,

Cheers

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg


From parikr at pointcast.com  Fri Feb 20 04:22:08 1998
From: parikr at pointcast.com (Parik Rao)
Date: Mon Jun  7 17:00:09 2004
Subject: Is anyone using CDATA?
Message-ID: <93DA154E07D3D0119C7E006097743AA0F5B40E@hq-exs1.pointcast.com>

Anyone have experiences with CDATA ?  We're interested in inserting
non-XML markup and BLOBs into XML files, and the best way seems to be
CDATA.  However, some of the parsers I've been playing around with
(Microsoft, XMLint) don't support the CDATA element.  Is CDATA handling
required for a validating parser?

For non-XML markup (HTML markup), I could escape the markup and insert
it under my own elements, but that requires extra processing and makes
documents larger.  For BLOBs, obviously pointers to the data, rather than
embedding it, could be used.  But it can be useful to package all required
data into a single file sometimes.

Interested in how others are dealing with the situation...

--
Parik Rao
parikr@pointcast.com
PointCast, Inc.
http://www.pointcast.com



From mrc at allette.com.au  Fri Feb 20 05:11:59 1998
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:00:10 2004
Subject: Automating Search Interfaces"
References: <3.0.1.16.19980220010616.44af540a@pop3.demon.co.uk>
Message-ID: <34ED106E.C3CFA9EF@allette.com.au>

John Petit wrote:

> It is true that hand crafted search interfaces would be more polished,
> but who should be responsible for their creation? Is there some
> designated Java developer in the hotel industry that will make a search
> engine selflessly for the entire industry? No. If such work is relegated
> to the private companies then such search engines will not represent the
> entire industry in an unbiased way. This leaves nice, but proprietary
> search engines, and we are right back to where we started from; searches
> of privately selected databases rather than searches of heterogeneous,
> industry representative databases.

I wonder what would happen if Alta Vista, Yahoo et al started supporting 'index
DTDs' of their own making, written for particular industries and designed as an
interface layer to the search engine. The data owners would be responsible for
the creation/generation of these very skinny documents and the embedded links to
the richer versions. If these DTDs were regarded as being a subset of the data
strictly for the purpose of searching (rather than for more general information
storage), the DTD would primarily suit the search engine and need show no bias
toward any particular industry group. The hits could be ranked more highly than
those found by standard means and would probably be more valuable to users. Then
the search engine builders could start supporting each other's DTDs in search of
commercial advantage, etc...

This would leave the technical responsibility and potential financial gain to a
group who have no interest other than making data findable. This sounds too
good to be true, so almost certainly is.

> I do not feel that consumers will not care about search accuracy. When a
> customer is looking for variations of Ginkgo Biloba (an over-the-counter
> drug) they want to see all the sites that sell it and for what price.
> The same is true for travelers looking for room availability at their
> travel destinations. No one wants to wade through a hundred tangentially
> related sites.

I agree. I think users want to feel that buzz that you get from finding the right
site on the first try, despite the use of somewhat dubious search criteria.


--
Regards

Marcus Carr                  email:  mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia)  email:  info@allette.com.au
Level 10, 91 York Street     www:    http://www.allette.com.au
Sydney 2000 NSW Australia    phone:  +61 2 9262 4777
                             fax:    +61 2 9262 4774
_______________________________________________________________



From peter at ursus.demon.co.uk  Fri Feb 20 07:40:10 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:10 2004
Subject: Is anyone using CDATA?
In-Reply-To: <93DA154E07D3D0119C7E006097743AA0F5B40E@hq-exs1.pointcast.c
 om>
Message-ID: <3.0.1.16.19980220064032.1b2f441c@pop3.demon.co.uk>

At 20:21 19/02/98 -0800, Parik Rao wrote:
>Anyone have experiences with CDATA ?  We're interested in inserting
>non-XML markup and BLOBs into XML files, and the best way seems to be
>CDATA.  However, some of the parsers I've been playing around with
>(Microsoft, XMLint) don't support the CDATA element.  Is CDATA handling
>required for a validating parser?

An "XML parser" must conform to the XML spec and must therefore *read
correctly* a document which includes <![CDATA[ ... ]]> 

There is no requirement for a parser to DO anything with this other than to
report violations of well-formedness (or validity) as appropriate. The SAX
API does not support <![CDATA[; i.e. the output has lost any knowledge of
what bits were originally CDATA and which were escaped with &amp;, etc.
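
A minimal sketch of that last point, using Python's standard xml.sax module
as a stand-in for the Java SAX drivers discussed on this list (the class and
function names are illustrative assumptions). A plain content handler
receives the same characters whether the text was written as a CDATA section
or with escapes:

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collects every chunk of character data the parser reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # CDATA content and escaped content both arrive through this callback.
        self.chunks.append(content)


def sax_text(document: str) -> str:
    """Return all character data in the document as one string."""
    handler = TextCollector()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.chunks)


with_cdata = "<listing><![CDATA[<a></a>]]></listing>"
with_escapes = "<listing>&lt;a&gt;&lt;/a&gt;</listing>"
print(sax_text(with_cdata) == sax_text(with_escapes))
```

An application that has to reproduce the original CDATA boundaries would
need something beyond this basic callback.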

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From donpark at quake.net  Fri Feb 20 08:33:18 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:10 2004
Subject: Is anyone using CDATA?
Message-ID: <006c01bd3dd9$8048bc90$2ee044c6@donpark>

Parik,

>Anyone have experiences with CDATA ?  We're interested in inserting
>non-XML markup and BLOBs into XML files, and the best way seems to be
>CDATA.  However, some of the parsers I've been playing around with
>(Microsoft, XMLint) don't support the CDATA element.  Is CDATA handling
>required for a validating parser?

As far as I know, yes.  Version 1.8 of MSXML does handle CDATA sections.  I
don't know about XMLint.  AElfred also supports CDATA.  With SAX, you can
get CDATA section contents but it will appear as characters.  This causes
extra processing burden on some conversion applications (i.e. XML to XML)
but it is not a serious problem, just a boon for Intel.

>For non-XML markup (HTML markup), I could escape the markup and insert
>it under my own elements, but that requires extra processing and makes
>documents larger.  For BLOBs, obviously pointers to the data rather than
>embedded could be done.  But it can be useful to package all required
>data into a single file sometimes.

You can compress the HTML markup and write it out with BASE64 encoding.  I
am in the process of putting together a proposal for embedding binary data
in XML documents.  It is tentatively named XML-Binary proposal.  I will be
posting a draft on this mailing list for comments before submitting it to
W3C.
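
As a rough sketch of the compress-then-BASE64 idea, assuming zlib as the
compressor (the message does not name one) and with hypothetical function
names:

```python
import base64
import zlib


def pack_markup(markup: str) -> str:
    """Compress markup and return it as Base64 text safe for element content."""
    compressed = zlib.compress(markup.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def unpack_markup(payload: str) -> str:
    """Reverse pack_markup: Base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(payload)).decode("utf-8")


html = "<p>Fish &amp; chips <b>today</b></p>" * 20
payload = pack_markup(html)
print(len(html), "characters of markup ->", len(payload), "characters of Base64")
```

The Base64 alphabet contains no '<', '&' or ']', so the payload needs no
further escaping and no CDATA section.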

As far as packaging goes, there is at least one person working on that
although I can not go into details due to his request for confidentiality.
Perhaps he can elaborate some more.

I hope this helps.

Don Park
http://www.quake.net/~donpark/index.html




From donpark at quake.net  Fri Feb 20 09:48:32 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:10 2004
Subject: Automating Search Interfaces"
Message-ID: <008401bd3de3$f8c93b90$2ee044c6@donpark>

>I agree that this disheartening scenario is quite possible. But what a
>shame! It seems that one of XML's major strengths is its ability to
>search heterogeneous databases. Independent sellers large and small
>would benefit from heterogeneous searches for it would allow super
>accurate marketing. Mom and pop producers should be able to sell their
>boutique goods to the special set of consumers that would be interested.
>A real estate agent in Backwater USA with a unique property should be
>able to sell that product in an industry standard search engine.
>Without accurate, industry specific search interfaces, consumers will
>not easily find these sites. Otherwise we are no better off search wise
>than we are today - wallowing in inaccurate searches. It would be a real
>shame if the ultimate promises of XML were hindered by lack of
>planning.  Laissez-faire is not always the best way.

As far as I am concerned, the scenario is not only possible, it is
absolutely the only way history will unfold, because content developers
will find it hard to convert non-XML content into XML using standard DTDs.
Commercial content is typically composite data which cannot be easily
described with a set of standard schemas.  Search services make it even
worse because their schema requirements will be far smaller than those of
content providers.

Search across heterogeneous databases can still be achieved without asking
everyone to put on a straitjacket and wiggle ahead at manageable speed for
the benefit of mankind.  The key lies in dynamic schema conflict resolution
technologies.  If a search service wants the price in pesos and the database
stores prices in US dollars, the price can be converted by an adapter at the
time of demand using a currency market data feed.  Currency conversion cannot
be done beforehand or cached because its shelf life is basically counted in
minutes.  It is also quite unfriendly to return search results with prices
in ten different currencies.
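
A minimal sketch of such an adapter, assuming the search service hands over
records as dictionaries and that a rate lookup function stands in for the
currency market data feed. The record layout, the function names and the
exchange rate are all made up for illustration:

```python
from decimal import Decimal
from typing import Callable, Dict

# A rate lookup takes (source currency, target currency) and returns a rate.
RateLookup = Callable[[str, str], Decimal]


def make_price_adapter(rate_lookup: RateLookup, target: str):
    """Build an adapter that converts a record's price at the time of demand."""

    def adapt(record: Dict[str, object]) -> Dict[str, object]:
        # The rate is fetched per call, never cached, because it goes stale fast.
        rate = rate_lookup(str(record["currency"]), target)
        price = (Decimal(str(record["price"])) * rate).quantize(Decimal("0.01"))
        return {**record, "price": price, "currency": target}

    return adapt


def demo_rates(source: str, target: str) -> Decimal:
    """Hypothetical stand-in for a live data feed; the rate is invented."""
    table = {("USD", "MXN"): Decimal("8.50"), ("MXN", "MXN"): Decimal("1")}
    return table[(source, target)]


to_pesos = make_price_adapter(demo_rates, "MXN")
record = {"item": "CD player", "price": "99.00", "currency": "USD"}
print(to_pesos(record))
```

The database keeps its own schema and currency; only the adapter knows what
the search service expects.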

Also, standard DTDs cannot adapt to change.  What do you do when the
standard DTD for electronic devices must be changed to include performance
data (e.g. WinMark for Intel machines)?  The problems are simply
mind-boggling (well, my mind is easy to boggle).


> It is true that hand crafted search interfaces would be more polished,
> but who should be responsible for their creation? Is there some
> designated Java developer in the hotel industry that will make a search
> engine selflessly for the entire industry? No. If such work is relegated
> to the private companies then such search engines will not represent the
> entire industry in an unbiased way. This leaves nice, but proprietary
> search engines, and we are right back to where we started from; searches
> of privately selected databases rather than searches of heterogeneous,
> industry representative databases.

Search companies will attack one industry at a time, with the search company
providing the custom user interface and dictating what the DTD should be.
Each attack will be turned into a press event with announcements of support
from major players in that particular industry.  These companies will
announce that they will provide data using the search company's
industry-specific search DTD.  Small companies with fewer resources will
provide data using a push model, since they do not have the resources to take
part in a distributed search network.  Large companies will place more value
on their data and will provide information on demand, thus taking part in
the search network.

>I do not feel that consumers will not care about search accuracy. When a
>customer is looking for variations of Ginkgo Biloba (an over-the-counter
>drug) they want to see all the sites that sell it and for what price.
>The same is true for travelers looking for room availability at their
>travel destinations. No one wants to wade through a hundred tangentially
>related sites. Without accurate search interfaces, consumers will not
>get this sort of accurate response. The RDF is an important part of
>describing the web, but I have not seen how it would be the right way to
>address automating search interfaces.

This was an exaggeration on my part.  I apologize.

Having said all this, I still feel that efforts to standardize DTDs must be
made and must be maintained for the sake of balance and stability.  There
wouldn't be much of a market if everyone used their own currency as their
picture ID.

Prophet for Profit,

Don Park
http://www.quake.net/~donpark/index.html




From peter at ursus.demon.co.uk  Fri Feb 20 10:28:49 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:10 2004
Subject: rec.xml
Message-ID: <3.0.1.16.19980220095415.2247b80e@pop3.demon.co.uk>

Here is something that bit me unexpectedly, and I'd be interested in comments
on it. I *think* I know the answer. I'll leave it to you to think about before
you rush for your parsers to check.

Using SAX (alone) to parse the XML version of the XML recommendation
(rec.xml), is it possible to create a well-formed version? The first time
I tried this the result surprised me.

	P.

BTW there may be problems parsing rec.xml as the official version contains
a (single) character #160 (&nbsp;). This has actually been 'commented out'
but parsers such as AElfred don't accept it and throw an error. DavidM
assures me that this is the correct thing to do - I take this on trust. So,
if you wish to use AElfred on this you'll have to find the #160 (it appears
as an aacute; in my editor - yours may vary and even show a 'space'). This
has nothing to do with the little amusement above...

BTW I have asked if there is a file spec.dtd, and no doubt this will be
announced here if/when.


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From murata at apsdc.ksp.fujixerox.co.jp  Fri Feb 20 11:49:47 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:00:10 2004
Subject: Announcement: New PSGML-XML Additions
In-Reply-To: <199802192025.PAA00328@unready.microstar.com>
Message-ID: <199802201150.AA00268@murata.apsdc.ksp.fujixerox.co.jp>

PSGML-XML works on Meadow (Multilingual enhancement to gnu Emacs 
with ADvantages Over Windows).   Meadow is Emacs20 on MS Windows.  It 
is fully internationalized (but no UTF-16 yet).  It was recently 
released by Miyashita Hisashi <himi@bird.scphys.kyoto-u.ac.jp>.  
(Miyashita is his family name.) 

Meadow is available from

	ftp://ftp.etl.go.jp/pub/mule/Windows/Meadow-1.00-i386.tar.gz

INSTALL.Meadow and README.Meadow are written in English.

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp



From ak117 at freenet.carleton.ca  Fri Feb 20 12:16:39 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:10 2004
Subject: Is anyone using CDATA?
In-Reply-To: <93DA154E07D3D0119C7E006097743AA0F5B40E@hq-exs1.pointcast.com>
References: <93DA154E07D3D0119C7E006097743AA0F5B40E@hq-exs1.pointcast.com>
Message-ID: <199802201215.HAA00686@unready.microstar.com>

Parik Rao writes:

 > Anyone have experiences with CDATA ?  We're interested in inserting
 > non-XML markup and BLOBs into XML files, and the best way seems to be
 > CDATA.  However, some of the parsers I've been playing around with
 > (Microsoft, XMLint) don't support the CDATA element.  Is CDATA handling
 > required for a validating parser?

I, at least, cannot reproduce your bug -- with MSXML, the following
document parses exactly as expected:

<?xml version="1.0"?>
<listing>
<![CDATA[<a></a>]]>
</listing>

That said, CDATA marked sections won't always work for you -- BLOBs
are likely to contain non-SGML characters, and any arbitrary non-XML
markup containing ']]>' will kill the marked section.  The best way to
include arbitrary non-XML information in a document is to include it
as an unparsed entity or an HREF link (just as you would include a GIF
in an HTML page).
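
A short sketch of how a stray ']]>' kills the marked section, using Python's
standard ElementTree parser rather than MSXML (the helper names are
hypothetical). The payload is wrapped naively, the way a program embedding
foreign markup might do it:

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(payload: str) -> str:
    """Naively wrap arbitrary text in a CDATA section inside <listing>."""
    return "<listing><![CDATA[" + payload + "]]></listing>"


def parses(document: str) -> bool:
    """Report whether the document is accepted by the parser."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


harmless = wrap_in_cdata("<a></a>")
# This payload happens to contain "]]>", which ends the section early.
fatal = wrap_in_cdata("if (a[b[0]]> 1) ...")
print(parses(harmless), parses(fatal))
```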


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Fri Feb 20 12:29:54 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:10 2004
Subject: rec.xml
In-Reply-To: <3.0.1.16.19980220095415.2247b80e@pop3.demon.co.uk>
References: <3.0.1.16.19980220095415.2247b80e@pop3.demon.co.uk>
Message-ID: <199802201228.HAA00784@unready.microstar.com>

Peter Murray-Rust writes:

 > Using SAX (alone) to parse the XML version of the XML
 > recommendation (rec.xml), is it possible to create a well-formed
 > version? The first time I tried this the result surprised me.

James Clark has created the Java application XMLTest to do exactly
this:

  http://www.jclark.com/xml/XMLTest.java

I just normalised the REC with the following command line:

  java XMLTest com.microstar.sax.AElfredDriver /tmp
  REC-xml-19980210.xml

It seems to have come out fine (though without XML declaration,
comments, DOCTYPE, etc.).  The purpose of James's application is to
allow easy comparisons of different SAX drivers and parsers.
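
For readers without XMLTest to hand, here is a rough Python analogue of the
same normalisation, built on the standard xml.sax module and its
XMLGenerator. It is not James's application, and the function name and the
sample document are assumptions. Note that XMLGenerator writes a fresh XML
declaration of its own; the comment and DOCTYPE of the input are dropped,
because a plain content handler never sees them:

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


def normalise(document: bytes) -> str:
    """Re-emit a document from its SAX content events alone."""
    out = io.StringIO()
    xml.sax.parseString(document, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue()


source = b"""<?xml version="1.0"?>
<!DOCTYPE spec [<!ELEMENT spec (title)> <!ELEMENT title (#PCDATA)>]>
<spec><!-- editorial note --><title>XML</title></spec>"""

result = normalise(source)
print(result)
```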

 > BTW there may be problems parsing rec.xml as the official version
 > contains a (single) character #160 (&nbsp;).

The problem has been fixed in the REC.

Parsing the REC no longer causes problems for AElfred because the
REC's XML declaration declares the encoding as "ISO-8859-1", where
#160 is a legal character.  The problem is that not all XML parsers
allow the declared encoding ISO-8859-1 (though that's what most of
them really support).

 > This has actually been 'commented out' but parsers such as AElfred
 > don't accept it and throw an error. DavidM assures me that this is
 > the correct thing to do - I take this on trust.

This is _a_ correct thing to do.  This is an error but not a fatal
error, so it is up to the parser whether or not to report it.  That
said, any parser with actual UTF-8 support will somehow choke on #160
if it thinks it's parsing UTF-8.  Right now, most parsers claim to be
parsing UTF-8 when they're really parsing ISO-8859-1, hence they don't
choke on #160.
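
The encoding point can be tried out with any parser that really decodes
UTF-8. A minimal sketch with Python's standard ElementTree parser (not
AElfred; the helper name is hypothetical), feeding it the single byte #160
with and without an ISO-8859-1 declaration:

```python
import xml.etree.ElementTree as ET

NBSP = b"\xa0"  # byte #160: a no-break space in ISO-8859-1, invalid alone in UTF-8

undeclared = b"<p>a" + NBSP + b"b</p>"
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + undeclared


def try_parse(document: bytes) -> str:
    """Return the element text, or a note saying the parser rejected the input."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError as error:
        return "rejected: " + str(error)


print(repr(try_parse(declared)))
print(try_parse(undeclared))
```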


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From matthewg at poet.de  Fri Feb 20 13:12:54 1998
From: matthewg at poet.de (Matthew Gertner)
Date: Mon Jun  7 17:00:10 2004
Subject: Automating Search Interfaces"
Message-ID: <01bd3e01$168b28b0$a00b0ac0@pharcyde.poetsoftware.xo.com>

Don,

<snip>
> Also standard DTDs can not adapt to change.  What do you do when the
> standard DTD for electronic devices must be changed to include performance
> data (e.g. WinMark for Intel machines)?  The problems are simply
> mindboggling (well, my mind is easy to boggle).

One approach that really appeals to me is based on a two-pronged effort to
create standard tags *and* standard DTDs, and relies on the fact that there
is really a working mechanism for extending DTDs through inheritance (which
I guess is still not entirely the case).

Standard tags would be a bit of a hack, but probably very useful in a
pragmatic sense. For example, you might be able to say certain things about
a TITLE tag, or a PRICE tag, or whatever, just on the basis of the name,
regardless of the actual DTD being used. If these conventions were
well-known, this could be of great use when defining a new DTD (i.e. "Let's
call the tag PARAGRAPH and not PARA because this is what will be recognized
by search engines").

Inheritance is *not* a hack and really seems like the way to go for more
ambitious implementations. To take your example, the DTD for electronic
devices might contain tags for VENDOR, PRODUCTNAME, PRICE, CATEGORY, etc. If
I want to find all CD player devices from Sony that cost less than $99 then
I can query based on this standard DTD. Vendors who want to include more
information just derive a new DTD with all the standard tags, as well as
vendor-specific ones (for benchmark figures, for example). The non-standard
tags may not be available for querying, but the information in the
standardized base DTD would be.

This becomes even more powerful with multiple inheritance. I can whip up a
DTD for my new portable XML viewer/espresso brewer, imported from
Kazakhstan, just by grabbing the standard DTDs for hand-held electronic
devices (derived from general electronic devices but adding tags for SIZE,
WEIGHT and BATTERYLIFE), for food processing equipment (also derived from
electronic devices but adding a tag for FOODTYPE) and for imported goods (with
tags for COUNTRYOFORIGIN, EXPORTTARIF, etc.). This would let users find my
product by querying for all portable devices weighing under 200 grams which
can process coffee and which are produced in Central Asia.
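
Since no such inheritance mechanism exists for DTDs yet, the following is
only a toy model of the idea: each "DTD" is reduced to a set of tag names
plus the DTDs it derives from. The class, its methods and the DTD names are
hypothetical; the tag names are the ones used above.

```python
class TagSet:
    """Stand-in for a DTD: its own tags plus the DTDs it derives from."""

    def __init__(self, name, own_tags, bases=()):
        self.name = name
        self.own_tags = frozenset(own_tags)
        self.bases = tuple(bases)

    def all_tags(self):
        """Own tags together with everything inherited from the bases."""
        tags = set(self.own_tags)
        for base in self.bases:
            tags |= base.all_tags()
        return frozenset(tags)

    def derives_from(self, other):
        """True if this DTD is, or inherits from, the other DTD."""
        return self is other or any(b.derives_from(other) for b in self.bases)


def queryable(dtd, wanted_tags):
    """A query can be answered if every tag it uses is known to the DTD."""
    return set(wanted_tags) <= dtd.all_tags()


electronic = TagSet("electronic", ["VENDOR", "PRODUCTNAME", "PRICE", "CATEGORY"])
handheld = TagSet("handheld", ["SIZE", "WEIGHT", "BATTERYLIFE"], [electronic])
food = TagSet("food-processing", ["FOODTYPE"], [electronic])
imported = TagSet("imported", ["COUNTRYOFORIGIN", "EXPORTTARIF"])
brewer = TagSet("viewer-brewer", [], [handheld, food, imported])

print(queryable(brewer, ["WEIGHT", "FOODTYPE", "COUNTRYOFORIGIN"]))
print(brewer.derives_from(electronic))
```

A search engine that only knows the base electronic-devices DTD could still
query PRICE and VENDOR on the derived one.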

I really believe the world needs XML to get a grip on the information
explosion. The approach suggested by the original poster is great, and with
plug-and-play DTDs I don't see any real technical reason why it shouldn't
work. As an initial implementation, the approach based on GI only would no
doubt be a good workaround.

Matthew




From M.H.Kay at eng.icl.co.uk  Fri Feb 20 16:03:37 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:10 2004
Subject: Is anyone using CDATA?
Message-ID: <01bd3e19$4288ad80$1e09e391@mhklaptop.bra01.icl.co.uk>

>Anyone have experiences with CDATA ?  We're interested in inserting
>non-XML markup and BLOBs into XML files, and the best way seems to be
>CDATA.

I don't think CDATA is useful for inserting binary data into XML files,
because there is no way of escaping the terminating "]]>". I think the best
way to do it, if you want to do it inline, is to use Base64 encoding, and
then you don't need CDATA.
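
A minimal sketch of the inline Base64 approach, using Python's standard
library. The element name `blob` and its `encoding` attribute are
assumptions made for the example, not part of any proposal:

```python
import base64
import xml.etree.ElementTree as ET


def embed(blob: bytes) -> str:
    """Serialise binary data as Base64 text inside a <blob> element."""
    element = ET.Element("blob", {"encoding": "base64"})
    element.text = base64.b64encode(blob).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def extract(document: str) -> bytes:
    """Parse the document and decode the element content back to bytes."""
    return base64.b64decode(ET.fromstring(document).text)


# Every byte value, plus the sequence that would terminate a CDATA section.
data = bytes(range(256)) + b"]]>"
document = embed(data)
print(extract(document) == data)
```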

Mike Kay




From ak117 at freenet.carleton.ca  Fri Feb 20 19:44:44 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:10 2004
Subject: Automating Search Interfaces"
In-Reply-To: <01bd3e01$168b28b0$a00b0ac0@pharcyde.poetsoftware.xo.com>
References: <01bd3e01$168b28b0$a00b0ac0@pharcyde.poetsoftware.xo.com>
Message-ID: <199802201535.KAA00874@unready.microstar.com>

Matthew Gertner writes:

 > One approach that really appeals to me is based on a two-pronged effort to
 > create standard tags *and* standard DTDs, and relies on the fact that there
 > is really a working mechanism for extending DTDs through inheritance (which
 > I guess is still not entirely the case).
 > 
 > Standard tags would be a bit of a hack, but probably very useful in a
 > pragmatic sense. For example, you might be able to say certain things about
 > a TITLE tag, or a PRICE tag, or whatever, just on the basis of the name,
 > regardless of the actual DTD being used. If these conventions were
 > well-known, this could be of great use when defining a new DTD (i.e. "Let's
 > call the tag PARAGRAPH and not PARA because this is what will be recognized
 > by search engines").

The idea is actually quite sound, but the implementation could be a
little cleaner.  Instead of relying on the element type name (which
may vary for different domains of information), why not have a
standard attribute (such as 'standard-doc') that gives the equivalent
standard name in the architecture.  That way, just as you write

  public class Cost implements Price {
  }

in Java, you can write

  <!ELEMENT cost (#PCDATA)>
  <!ATTLIST cost
    standard-doc CDATA #FIXED "price">

in XML, or even

  <cost standard-doc="price">xxx</cost>

This makes multiple inheritance easy:

  <!ELEMENT cost (#PCDATA)>
  <!ATTLIST cost
    standard-doc CDATA #FIXED "price"
    alt-doc CDATA #FIXED "value">

Now `cost' inherits from `price' in the standard-doc
architecture and from `value' in the alt-doc architecture.
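
A minimal sketch of how a search engine might read such attributes, using
Python's standard ElementTree. The function name and the sample catalogue
are hypothetical, and the attributes are written out in the instance so the
sketch does not depend on the parser applying #FIXED defaults from a DTD:

```python
import xml.etree.ElementTree as ET


def standard_view(document: str, arch_attribute: str = "standard-doc"):
    """List (standard name, text) pairs for elements mapped to the architecture."""
    pairs = []
    for element in ET.fromstring(document).iter():
        standard_name = element.get(arch_attribute)
        if standard_name is not None:
            pairs.append((standard_name, (element.text or "").strip()))
    return pairs


catalogue = """<catalogue>
  <item><cost standard-doc="price" alt-doc="value">99</cost></item>
  <item><tarif standard-doc="price">120</tarif></item>
  <item><note>no architectural mapping</note></item>
</catalogue>"""

print(standard_view(catalogue))
print(standard_view(catalogue, "alt-doc"))
```

Both `cost' and `tarif' are found as `price' without the search engine
knowing either element type name in advance.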


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From mwagner at ets.org  Fri Feb 20 20:33:47 1998
From: mwagner at ets.org (Mike Wagner)
Date: Mon Jun  7 17:00:10 2004
Subject: MS XML Parser on the Server
Message-ID: <v04003a04b11397b5aa53@[144.81.30.117]>

Has anybody managed to get the Microsoft Java XML Parser running as a
component accessible by ASP under IIS? I tried what seemed to me to be the
obvious approach and that didn't work. I copied the java classes to the
TrustLib directory, then registered them with javareg. (An excerpt of the
BAT file I used is at the end of this message.) However, when I try a
simple Server.CreateObject("com.ms.xml.om.Document") call in an ASP page,
it dies with the following error:

Microsoft JScript runtime error '800a01ad'

Automation server can't create object

/xmltest.asp, line 14

Any insights? Thanks.

Mike Wagner
Educational Testing Service
mwagner@ets.org

-----------------Javareg BAT file--------------------
cd \winnt\java\trustlib\com\ms\xml\dso
javareg /register /class:SchemaNode /progid:com.ms.xml.dso.SchemaNode
cd \winnt\java\trustlib\com\ms\xml\dso
javareg /register /class:XMLDSO /progid:com.ms.xml.dso.XMLDSO
cd \winnt\java\trustlib\com\ms\xml\dso
javareg /register /class:XMLParserThread /progid:com.ms.xml.dso.XMLParserThread
cd \winnt\java\trustlib\com\ms\xml\dso
javareg /register /class:XMLRowsetProvider /progid:com.ms.xml.dso.XMLRowsetProvider
cd \winnt\java\trustlib\com\ms\xml\om




From pierlou at CAM.ORG  Fri Feb 20 21:04:15 1998
From: pierlou at CAM.ORG (Pierre Morel)
Date: Mon Jun  7 17:00:10 2004
Subject: Automating Search Interfaces
Message-ID: <01bd3e41$861d1090$02dcdcdc@pierre>

Hello,

I would like to talk about the location of the person making the search versus the location of the product or service provider. If I search for a product and I want it now, I only want a list of providers within a distance appropriate to my request. And if I go to Europe this summer and want to make a reservation or search for activities occurring at that time, the 'where I am' specification changes. If I have a second house and make a request on the weekend, I want the restaurants in that region and not the ones near my primary house. An identity profile should be included in the query to give the search engine the chance to make a better choice with regard to my age, sex, etc.

Another part of the problem is unique number identification, and I am not sure whether EAN or SIC is good for that purpose. How can a search engine parse a site or make a request for a product or service without a unique product number? A hotel room is a 'chambre' in French. If I search for a hotel room in Italy, I don't know the word for room in Italian, but if a room is a number, I can search for a room anywhere in the world. The query interface will be in my language and the service provider will build his database in his own language. The query page should change for every product. I have worked on this idea for some time and have come to the conclusion that lightweight page creation and manipulation is needed. The small tutorial that shows how the parts fit together is related to a very early-stage search engine. The left pane shows the products in a store but could be a list of products at a search engine site.
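
A small sketch of the "a room is a number" idea, assuming a shared table that maps each language's word to one product code. The code 1001, the table and the listing layout are invented for the example and do not come from EAN or SIC:

```python
# Hypothetical product code for "hotel room"; not taken from any real scheme.
ROOM_CODE = 1001

# Each language maps its own word to the same language-neutral code.
LABELS = {
    "en": {"hotel room": ROOM_CODE},
    "fr": {"chambre": ROOM_CODE},
    "it": {"camera": ROOM_CODE},
}


def to_code(term: str, language: str) -> int:
    """Translate a search term in the user's language to the product code."""
    return LABELS[language][term.lower()]


def search(listings, term: str, language: str):
    """Match listings by code, whatever language the provider wrote them in."""
    code = to_code(term, language)
    return [listing for listing in listings if listing["code"] == code]


listings = [
    {"code": ROOM_CODE, "label": "camera doppia", "city": "Roma"},
    {"code": 2002, "label": "tavolo per due", "city": "Roma"},
]
print(search(listings, "chambre", "fr"))
```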

What is XML-Data versus DTD? Maybe the solution is there and I don't see it.
I would like to know if every product on earth can have a number, the same way that every book can be codified?

Best regards to all

Pierre Morel 
pierlou@cam.org 
http://www.cam.org/~pierlou/prototype


From donpark at quake.net  Fri Feb 20 22:39:36 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:10 2004
Subject: TagNet (was Automating Search Interfaces)
Message-ID: <009301bd3e4f$b4e336d0$2ee044c6@donpark>

Matthew,

>One approach that really appeals to me is based on a two-pronged effort to
>create standard tags *and* standard DTDs, and relies on the fact that there
>is really a working mechanism for extending DTDs through inheritance (which
>I guess is still not entirely the case).

I think the effort would be best spent building a sort of WordNet-like
service which allows automatic registration and association of tag and
attribute names.  For example, a book vendor could register TITLE as a tag
name and associate it with NAME as a synonym constrained by the book
industry code (if there is such a thing).  A search service can then see that
the contents offered by the book vendor can be searched by mapping its NAME
field to the TITLE tag.  Inheritance relationships can also be registered and
taken advantage of by search services.

It probably won't have to be a full semantic network but it will require a
standard API.  I wish it could capture whole/part relationships as well, like
(NAME == FIRST + MIDDLE + LAST), but I could be going overboard here.  Some
of the entries can be marked as the 'norm' by some standardization
organizations.  A DTD writer could just build what he wants and then pass it
through the service to change all names to the 'norm'.
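
To make the registration and lookup steps concrete, here is a toy in-memory
sketch. No such service or API exists; the class, its method names and the
industry codes are all assumptions, and whole/part relationships are left
out:

```python
class TagRegistry:
    """Toy registry of tag names and synonyms, scoped by industry code."""

    def __init__(self):
        # Maps (industry, registered name) to the 'norm' name.
        self._norms = {}

    def register_norm(self, industry: str, name: str) -> None:
        """Register a tag name that is the norm for this industry."""
        self._norms[(industry, name.upper())] = name.upper()

    def register_synonym(self, industry: str, name: str, norm: str) -> None:
        """Associate a name with an already registered norm."""
        if (industry, norm.upper()) not in self._norms:
            raise KeyError("unknown norm: " + norm)
        self._norms[(industry, name.upper())] = norm.upper()

    def resolve(self, industry: str, name: str) -> str:
        """Return the norm for a name, or the name itself if unregistered."""
        return self._norms.get((industry, name.upper()), name.upper())

    def normalise(self, industry: str, names):
        """Rename a DTD writer's tag names to the norm, where one exists."""
        return [self.resolve(industry, name) for name in names]


registry = TagRegistry()
registry.register_norm("books", "TITLE")
registry.register_synonym("books", "NAME", "TITLE")
print(registry.normalise("books", ["NAME", "AUTHOR"]))
print(registry.resolve("music", "NAME"))
```

The synonym applies only within the book industry; the same name in another
industry is left alone.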

For the benefit of those replying to this message, let me call the service
TagNet.

"What do you want to tag today?;-)"

Feeling great today,

Don Park
http://www.quake.net/~donpark/index.html




From donpark at quake.net  Fri Feb 20 22:39:40 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:10 2004
Subject: Automating Search Interfaces
Message-ID: <009401bd3e4f$b5d7a8f0$2ee044c6@donpark>

Pierre,

>I would like to talk about the location of the person making the search versus the location of the product or service provider. If I search for a product and I want it now, I only want a list of providers within a distance appropriate to my request. And if I go to Europe this summer and want to make a reservation or search for activities occurring at that time, the 'where I am' specification changes. If I have a second house and make a request on the weekend, I want the restaurants in that region and not the ones near my primary house. An identity profile should be included in the query to give the search engine the chance to make a better choice with regard to my age, sex, etc.

Interesting.  Some of the issues with product location are:

1. How to indicate location?

Address or map coordinates?  How does one find map coordinates?  What happens when he moves?

2. How to associate location with products?

If a vendor has all inventory at a single location then the location can be #FIXED in his DTD.  If inventory is distributed around the globe, each product or inventory group will have to be marked.  The problem is that it then makes no sense to indicate a physical location.  It will have to be a store code, which causes problems for search services since store codes will have to be converted into the location format used by the search service.

As far as time constraints go, each product will probably be marked with time.  The problem is that some time constraints are relative in nature.

*Ouch*  I just thought of another painful problem with prices.  What happens when a store wants to put on a sale?  His database of products will have to map to different pricing schemes constrained by time, location, or association.

All this hurts my head a bit but it is very interesting indeed...

Regards,

Don Park
http://www.quake.net/~donpark/index.html

From mike at jmaca.com  Fri Feb 20 23:01:53 1998
From: mike at jmaca.com (Michael Emmel)
Date: Mon Jun  7 17:00:10 2004
Subject: Binary Data
Message-ID: <34EE0D80.BBA53DC9@jmaca.com>

Is it possible to include binary data in an XML document and still follow the
spec?

<![CDATA[ ascii data ]]>

allows the inclusion of arbitrary ASCII data, except that I do not think
uuencode or other binary -> ASCII/UTF-8
encoders will work without modification to eliminate the ]]> sequence.

Would this be possible?

<![BDATA length=1024[ binary data ]]>   where the parser would ignore
1024 bytes and expect
to see a ]]> at the end.

The spec seems to imply only character data but does not disallow
binary data.

I assume a character encoding that did not use the ]]> sequence is okay.

I think the <![BDATA  length=x[     ]]> tag is not.
You need to let the parser ignore and redirect x bytes from
the token stream.  This would be equivalent to a "Java production" in
JavaCC.
But I'm not sure if it is legal ???

So do I need to alter uuencode or some other encoding format to fit the
CDATA format,
or is it legal to include a binary section? And if not, why not : )

Mike

mike@jmaca.com


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From peter at ursus.demon.co.uk  Fri Feb 20 23:03:16 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:10 2004
Subject: LISTRIVIA
In-Reply-To: <009401bd3e4f$b5d7a8f0$2ee044c6@donpark>
Message-ID: <3.0.1.16.19980220224802.35d7dc7c@pop3.demon.co.uk>

At 14:33 20/02/98 -0800, Don Park wrote:
[...]
>
>Attachment Converted: "c:\eudora\attach\ReAutoma.htm"
^^^^^^^^^^^^

This is the sort of problem with attachments...

	P.
>
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From peter at ursus.demon.co.uk  Fri Feb 20 23:27:09 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:10 2004
Subject: LISTRIVIA
In-Reply-To: <01bd3e41$861d1090$02dcdcdc@pierre>
Message-ID: <3.0.1.16.19980220224826.35d7f436@pop3.demon.co.uk>

Hi Pierre, thanks for the posting...

At 15:52 20/02/98 -0500, Pierre Morel wrote:
>
>Attachment Converted: "c:\eudora\attach\Automati.htm"
^^^^^^^^^^
We ask people not to post attachments to xml-dev, because they don't get
hypermailed and they take up space on readers' machines :-)

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From peter at ursus.demon.co.uk  Fri Feb 20 23:39:55 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:10 2004
Subject: xml:space
Message-ID: <3.0.1.16.19980220223432.3ea70dee@pop3.demon.co.uk>

I am considering how to treat xml:space in JUMBO and ask for help and
comments. <NOTE>I am NOT re-opening the whitespace debate; I am asking
those who understand xml:space if what I do/intend to do is reasonable.
xml:space is a formal part of the language and I feel I have to address
it.</NOTE>

1. Are there any documents which actually use xml:space?  rec.xml does not.

2. Is there anyone on this list intending to use it? If so, what do they
expect "applications' default white-space processing modes" to be?

[Quotations are from rec.xml]

>An XML processor must always pass all characters in a document that are
>not markup through to the application. A validating XML
>processor must also inform the application which of these characters
>constitute white space appearing in element content.

My philosophy in JUMBO (which is a generic application) is to accept all
whitespace from the parser/SAX, whether labelled 'ignorable' or not. All
PCDATA is stored in child nodes of elements. Those with ignorable
whitespace can be specially labelled.  IOW I do not discard any character
data on input.

>
>A special attribute named xml:space may be attached to an element to
>signal an intention that in that element, white space should be
>preserved by applications. In valid documents, this attribute, like any
>other, must be declared if it is used. When declared, it must be
>given as an enumerated type whose only possible values are "default" and
>"preserve". For example:
>
>      <!ATTLIST poem   xml:space (default|preserve) 'preserve'>
>

OK. If xml:space="preserve" I have no problems.
If xml:space="default" I am asking for help. Note that xml:space="default"
could apply either to ignorable whitespace or non-ignorable w/s.
If xml:space is absent, I suggest options below...

>
>The value "default" signals that applications' default white-space
>processing modes are acceptable for this element; the value
>"preserve" indicates the intent that applications preserve all the white
>space. This declared intent is considered to apply to all
>elements within the content of the element where it is specified, unless

This causes me slight concern. It means I have to write code that
automatically tracks what elements have an xml:space attribute. This is
possible, but yet another thing that has to be done. I might be motivated
to do it if I am shown some use for it...

>overriden with another instance of the xml:space attribute. 

This means effectively that every node in a document has to have an
xml:space flag. [Unless this is dynamically worked out every time the
document is to be rendered.]

--------

Without xml:space, and without a DTD, I can see the following *generic*
possibilities:
	- element is empty. [BTW the spec (and SAX) discards all knowledge of
whether this was created by <FOO></FOO> or <FOO/>. I approve of this.].
Children are not displayed because there aren't any
	- element contains non-w/s characters. This is displayed either as a
string or as a title-value pair (at user option). The title is determined
by simple heuristics.
	- element contains element content. This is displayed as a tree. I am
considering also allowing the user to display this as a tagged/untagged
event stream, but the tree is the default.
	- element contains element content and (some) non-w/s PCDATA children.
This is displayed as an untagged (or, at user option, tagged) event stream.
Unless the semantics of the tags are known or a stylesheet is provided, no
other rendering is possible.

Now the two w/s options...
	- element contains element content and (only) w/s children. This is
displayed by default as ignoring the w/s. Note that this is *display*, not
processing. Since the default is a tree, the w/s nodes aren't much use.
	- element contains a single w/s child. This does not display anything by
default.
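
The generic cases listed above can be sketched as a small classifier. This is
only an illustration in Python using xml.etree.ElementTree, not JUMBO's code;
the function name classify and the label strings are made up for this
example.

```python
import xml.etree.ElementTree as ET


def classify(elem):
    """Return a label for the generic content cases (labels are hypothetical)."""
    children = list(elem)
    # Character data directly inside elem: its own text plus each child's tail.
    chunks = [elem.text or ""] + [child.tail or "" for child in children]
    chunks = [c for c in chunks if c]            # drop absent text nodes
    has_text = any(c.strip() for c in chunks)    # some non-whitespace characters
    has_ws = any(not c.strip() for c in chunks)  # some whitespace-only chunk

    if not children and not chunks:
        return "empty"
    if not children:
        return "text" if has_text else "single whitespace child"
    if has_text:
        return "mixed content"
    return "element content with whitespace" if has_ws else "element content"


for src in ["<a/>", "<a>hi</a>", "<a> </a>", "<a><b/></a>",
            "<a> <b/> </a>", "<a>x<b/></a>"]:
    print(src, "->", classify(ET.fromstring(src)))
```

Nothing is discarded during classification, so the decision to show or hide
the whitespace-only nodes stays a display choice.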

The user can switch to display/hide PCDATA children in the tree display.

For *outputting* it is possible to delete the w/s nodes if required. Once
deleted they are gone ...

I would be interested in comments as to whether this is reasonable default
behaviour or whether there are other things that should be considered.

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From tbray at textuality.com  Sat Feb 21 00:24:39 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:00:10 2004
Subject: xml:space
Message-ID: <3.0.32.19980220162318.00acf700@pop.intergate.bc.ca>

At 10:34 PM 2/20/98, Peter Murray-Rust wrote:

A short answer: yes, if you want to respect xml:space, you have really
no choice but to keep a stack or suchlike to see if it's been overridden
in a child element.  JUMBO, since it's an application, has no obligation
to respect xml:space; it's just a request, after all.  If you are 
respecting xml:space, whenever you are in an element for which 
xml:space='preserve' does not apply, you should do whatever best suits the 
needs of your application and its users.  I very much doubt there is a 
universal answer for all classes of application.  I think HTML gets it 
pretty much right for display type applications.

As for your question "will it be used?": yes, of course. -Tim
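
A minimal sketch of the stack approach, assuming an event-driven parser. It
uses Python's xml.sax purely for illustration; the class name SpaceTracker
and the choice of "default" as the value in force outside the root element
are assumptions of this example.

```python
import xml.sax


class SpaceTracker(xml.sax.ContentHandler):
    """Tracks the effective xml:space value, one stack entry per open element."""

    def __init__(self):
        super().__init__()
        self.stack = ["default"]   # assumed value in force outside the root
        self.seen = []             # (element name, effective xml:space) pairs

    def startElement(self, name, attrs):
        # An explicit xml:space overrides the inherited value; otherwise inherit.
        effective = attrs.get("xml:space", self.stack[-1])
        self.stack.append(effective)
        self.seen.append((name, effective))

    def endElement(self, name):
        # Leaving the element restores whatever applied in the parent.
        self.stack.pop()


def effective_space(xml_text):
    handler = SpaceTracker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.seen


doc = '<a><b xml:space="preserve"><c/><d xml:space="default"/></b><e/></a>'
print(effective_space(doc))
```

Only one value per open element is kept, so the cost is proportional to the
nesting depth rather than to the number of nodes in the document.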




From ricko at allette.com.au  Sat Feb 21 02:44:14 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:00:11 2004
Subject: Binary Data
Message-ID: <002201bd3e72$c6880a00$9d0b4ccb@NT.JELLIFFE.COM.AU>


From: Michael Emmel <mike@jmaca.com>


>Is it possible to include binary data in a XML document  and follow the
>spec.


It is possible to have binary data in an XML *document* but it is not
possible to have (unencoded) binary data in an XML text *entity*.  A
document is constructed from entities. An entity is usually a file. An
entity is either text or binary (NDATA) but not both.

You can use Base64 encoding to stick non-text data inside elements:

<!DOCTYPE foo [
<!NOTATION base64 SYSTEM "put URL of base 64 code here, or omit this string">
...
]>
<foo>
...
<binary-data notation="base64">...</binary-data>
...
</foo>


CDATA marked sections are only a shorthand mechanism for data which has a
lot of "&" or "<" characters which you might find tedious to escape as
entity references.  They are not a mechanism for embedding raw binary,
per se.
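
A rough sketch of the Base64 round trip, written in Python for illustration
only. The element and attribute names follow the example above; everything
else (variable names, the sample payload) is an assumption of this sketch.

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes(range(256))  # arbitrary binary data, including non-text bytes

# Encode to Base64 text and place it inside an element.
root = ET.Element("foo")
node = ET.SubElement(root, "binary-data", notation="base64")
node.text = base64.b64encode(payload).decode("ascii")
document = ET.tostring(root, encoding="unicode")

# A receiver parses the document and decodes the element content back to bytes.
parsed = ET.fromstring(document)
recovered = base64.b64decode(parsed.find("binary-data").text)
print(recovered == payload)
```

The encoded text is about a third larger than the original bytes, which is
the price of keeping the entity pure text.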

Rick Jelliffe




From JimL at Alphag.net  Sat Feb 21 23:39:03 1998
From: JimL at Alphag.net (Jim Lears)
Date: Mon Jun  7 17:00:11 2004
Subject: MS XML Parser on the Server
Message-ID: <D00028C46B33D111BABF00A0249C714B09DD82@ROMULUS>

Server.CreateObject in VBScript is used for creating instances of COM
objects. The Java XML Parser doesn't expose any COM interfaces...notably
IClassFactory which is used to instantiate COM objects. The C++ version
is what you need...it's an ActiveX control. The source code for both
parsers is available. If you insist on using the Java version, you could
mod it up to sport a COM interface.


Helping To Destroy The English Language

	-----Original Message-----
	From:	Mike Wagner [SMTP:mwagner@ets.org]
	Sent:	Friday, February 20, 1998 3:33 PM
	To:	xml-dev@ic.ac.uk
	Subject:	MS XML Parser on the Server

	Has anybody managed to get the Microsoft Java XML Parser running as a
	component accessible by ASP under IIS? I tried what seemed to me to be
	the obvious approach and that didn't work. I copied the java classes to
	the TrustLib directory, then registered them with javareg. (An excerpt
	of the BAT file I used is at the end of this message). However, when I
	try a simple Server.CreateObject("com.ms.xml.om.Document") call in an
	ASP page, it dies with the following error:

	Microsoft JScript runtime error '800a01ad'

	Automation server can't create object

	/xmltest.asp, line 14

	Any insights? Thanks.

	Mike Wagner
	Educational Testing Service
	mwagner@ets.org

	-----------------Javareg BAT file--------------------
	cd \winnt\java\trustlib\com\ms\xml\dso
	javareg /register /class:SchemaNode /progid:com.ms.xml.dso.SchemaNode
	cd \winnt\java\trustlib\com\ms\xml\dso
	javareg /register /class:XMLDSO /progid:com.ms.xml.dso.XMLDSO
	cd \winnt\java\trustlib\com\ms\xml\dso
	javareg /register /class:XMLParserThread /progid:com.ms.xml.dso.XMLParserThread
	cd \winnt\java\trustlib\com\ms\xml\dso
	javareg /register /class:XMLRowsetProvider /progid:com.ms.xml.dso.XMLRowsetProvider
	cd \winnt\java\trustlib\com\ms\xml\om



From mike at datachannel.com  Sun Feb 22 00:05:46 1998
From: mike at datachannel.com (Mike Dierken)
Date: Mon Jun  7 17:00:11 2004
Subject: MS XML Parser on the Server
Message-ID: <01BD3EE2.0FB98770@NEMO>

On the MS platform, you can expose all your Java classes and interfaces as COM interfaces if you use the ActiveX Wizard for Java (JAVAIDL.EXE). It'll create an .IDL file (and  .C and .H files if you want to call the interfaces from C/C++).
All Java classes are exposed as dual interfaces, derived from IDispatch, which allows them to be called from all COM aware scripting languages (JavaScript, VB for Automation, etc).

If the Java classes are registered with Javareg (using the CLSIDs from the generated .IDL file) on the server, you can use the package name rather than a CLSID.
To create a Java object, you might try prepending 'java:' on the package name.
	Server.CreateObject("java:com.ms.xml.om.Document") 

Hope this helps...

Mike D
DataChannel




From k_coffin at conknet.com  Sun Feb 22 02:58:22 1998
From: k_coffin at conknet.com (Kerry Coffin)
Date: Mon Jun  7 17:00:11 2004
Subject: Binary Data
Message-ID: <01bd3f3d$977564d0$ed0620ce@lbynum.esri.com>

What is Base64?

Regards,
Kerry Coffin
Environmental Systems Research Institute (ESRI)





From peter at ursus.demon.co.uk  Sun Feb 22 10:57:37 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:11 2004
Subject: LISTRIVIA
In-Reply-To: <01BD3EE2.0FB98770@NEMO>
Message-ID: <3.0.1.16.19980222103123.1c3f3e98@pop3.demon.co.uk>

At 16:02 21/02/98 -0800, [a number of posters in combination] wrote:

[A message]

>
>-----Original Message-----

[which quoted another message in full]

>
>	-----Original Message-----
[which itself quoted another message in full]

[and finished with cascading xml-dev backmatter].

and in another message a simple question was asked followed by cascading
quoted messages which added no value.

----------------------------------------------------------------------

Since new members are continually joining the list - and we welcome them
:-) - , I'll reiterate our policy for minimising the amount of material
posted. Remember that:
	- many people pay personal money for mail (including me)
	- duplicated material is excessively tedious on the hypermail list and
takes up valuable space
	- duplication takes up space on readers' local storage.
	- automatic quoting is not a good approach towards managing information.
XML encourages people to normalise material as much as possible.

Please therefore excise all material that you don't directly refer to in
your message. Most people prefer to see the quoted material followed by the
annotation rather than the annotation followed by the original message.
Remember that the material is all hypermailed and publicly visible and
(optionally) available as a digest. Both of these should be attractive to
read. :-)

For more details and suggestions of other styles to adopt/avoid, you may
wish to follow the various LISTRIVIA threads. These also comment on
multiple copies of postings :-)

	P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From donpark at quake.net  Sun Feb 22 21:05:58 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:11 2004
Subject: Binary Data
Message-ID: <003101bd3fd4$f5c83fc0$2ee044c6@donpark>

BASE64 is a MIME content transfer encoding defined in RFC 2045.  It
is used to map binary data into a restricted range of characters.

Don Park
http://www.quake.net/~donpark/index.html




From tbray at textuality.com  Mon Feb 23 00:50:34 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:00:11 2004
Subject: Binary Data
Message-ID: <3.0.32.19980222164918.00b68370@pop.intergate.bc.ca>

At 12:59 PM 2/22/98 -0800, Don Park wrote:
>BASE64 is a MIME content transfer encoding defined in RFC 2045.  It
>is used to map binary data into a range of characters.

What's real important from the XML point of view is that (unless my
memory fails me) base64 has the nice property that it uses a very
restricted range of characters, which happens not to include < or &,
and thus can be tossed into an XML doc just about anywhere without
breaking anything.  I think a predefined base64 notation attribute
is a no-brainer good idea, so obvious that it can't be new.  -Tim
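
The property is easy to check. The sketch below is an illustration in Python;
the names ALPHABET and is_markup_safe are made up for this example, and the
alphabet string is the one given in RFC 2045 (A-Z, a-z, 0-9, "+", "/", with
"=" for padding).

```python
import base64

# RFC 2045 Base64 alphabet plus the '=' padding character.
ALPHABET = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
)


def is_markup_safe(encoded):
    """True if the text uses only Base64 characters (hypothetical helper)."""
    return set(encoded) <= ALPHABET


sample = base64.b64encode(bytes(range(256))).decode("ascii")
print(is_markup_safe(sample))          # encoder output stays inside the alphabet
print(ALPHABET.isdisjoint("<&]>"))     # none of the XML-significant characters
```

Since "]" and ">" are also absent, the encoded text cannot contain ]]>
either, so it is safe inside a CDATA section as well as in ordinary element
content.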




From b.laforge at opengroup.org  Mon Feb 23 00:59:49 1998
From: b.laforge at opengroup.org (Bill la Forge)
Date: Mon Jun  7 17:00:11 2004
Subject: xml-based protocol
Message-ID: <3.0.32.19980222200447.00a05330@postman.osf.org>

Finally, AXTP is using xml for the wire protocol.
(I've also created some documentation.)

AXTP: Application eXtensible Transactional Protocol (UDP based)
http://www.camb.opengroup.org/~laforge/axtp/



From ak117 at freenet.carleton.ca  Mon Feb 23 03:14:26 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:11 2004
Subject: SAX: finalising org.sax.xml.Parser
Message-ID: <199802230313.WAA00386@unready.microstar.com>

It's time to finalise SAX before there is such a big code base that we
can no longer make changes.  (Thanks, by the way, to James Clark,
DataChannel, and IBM for including native SAX support in their XML
parsers).  During this phase, I'd like to make the _minimum_ changes
necessary for SAX to define a consistent and simple common functionality
for XML parsers.

Let's start with the Parser interface.  I'll use Java syntax because,
while I can read IDL, I don't trust myself to write it:


[current interface]
------------------------------------------------------------------------
  package org.xml.sax;

  public interface Parser {

   public void setEntityHandler (EntityHandler handler);
   public void setDocumentHandler (DocumentHandler handler);
   public void setErrorHandler (ErrorHandler handler);

   public void parse (String publicID, String systemID)
     throws java.lang.Exception;

  }
------------------------------------------------------------------------


After considering the various discussions over the past few weeks, I
propose that we make the following changes:

1) Add a parse() method that accepts a stream.

2) Add a parse() method that accepts a character buffer.

3) Remove public ID from the current parse() method (I don't think
   public IDs are going anywhere fast in XML).

With these changes, the interface would look like this in Java:


[proposed changes]
------------------------------------------------------------------------
  package org.xml.sax;
  import java.io.InputStream;

  public interface Parser {

   public void setEntityHandler (EntityHandler handler);
   public void setDocumentHandler (DocumentHandler handler);
   public void setErrorHandler (ErrorHandler handler);

   public void parse (String uri)
     throws java.lang.Exception;
   public void parse (InputStream is, String baseURI)
     throws java.lang.Exception;
   public void parse (char ch[], int start, int length, String baseURI)
     throws java.lang.Exception;

  }
------------------------------------------------------------------------


NOTES:

a. The baseURI argument is necessary for streams and character buffers
   in case either contains a relative URI.  You can supply a null
   value if the document entity will not contain relative URIs.

b. All programming languages initially targeted by SAX (Java, C++, C,
   Perl) have some concept of input streams; if we come up against one
   that doesn't, it can simply omit the relevant method.

c. The start and length arguments are necessary with the character
   buffer in case the XML document is part of a larger array.


Does this give reasonable functionality without limiting the
architectural approaches of parser writers?  Remember that individual
implementations can extend this interface, but the interface
represents the minimum common functionality that every SAX-conformant
parser (eventually) provides.
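
To make the stream and buffer variants concrete, here is a rough sketch in
Python using the standard xml.sax module. That module is a later SAX-style
API and is not the interface proposed above; it only illustrates parsing from
a stream and from a slice of a character buffer with a base URI supplied
separately. The names NameCollector, parse_stream and parse_buffer, and the
example.org URI, are assumptions of this sketch.

```python
import io
import xml.sax
from xml.sax.xmlreader import InputSource


class NameCollector(xml.sax.ContentHandler):
    """Records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_stream(stream, base_uri):
    """Parse from a byte stream; base_uri plays the role of baseURI."""
    source = InputSource(base_uri)   # system ID, used to resolve relative URIs
    source.setByteStream(stream)
    handler = NameCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(source)
    return handler.names


def parse_buffer(chars, start, length, base_uri):
    """Parse a slice of a larger character buffer (start and length as in note c)."""
    data = chars[start:start + length].encode("utf-8")
    return parse_stream(io.BytesIO(data), base_uri)


buffer = "junk<doc><item/><item/></doc>junk"
print(parse_buffer(buffer, 4, 25, "http://example.org/base/"))
```

In this sketch the buffer variant is a thin wrapper over the stream variant,
which suggests that a parser writer would mainly need to implement stream
input natively.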


Thanks, and all the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From zmin at iti.gov.sg  Mon Feb 23 05:08:40 1998
From: zmin at iti.gov.sg (Dr. Zheng Min)
Date: Mon Jun  7 17:00:11 2004
Subject: Making COM componts from java MSXML (Was: MS XML Parser on the Server)
Message-ID: <01bd4019$733f24c0$96897ac0@zhengmin.iti.gov.sg>

A few questions about making java MSXML COM aware:

1. Mike suggested using ActiveX Wizard for Java to create an .IDL file. Has
anyone done it successfully? I tried it just now but a lot of methods were
skipped because of non-translatable types (why is that? Does it mean those
methods can't be used in a COM interface?).

2. Even worse, I can't re-compile MSXML in J++. I got stuck on the first
file -- com.ms.xml.dso.XMLDSO.java. The error messages are all of the same
type:
        Value for argument 'parent' cannot be converted from 'int' in call
to 'Element ElementFactory.createElement(Element parent, int type, Name tag,
String text)'

The statement in XMLDSO.java is:
    e = factory.createElement(Element.ELEMENT,
XMLRowsetProvider.nameROWSET);
It doesn't look right but I don't know how MS can make *.class from it (or
have I missed something?).

Has anyone tried to recompile it and succeeded?

Thanks,
Min


-----Original Message-----
From: Mike Dierken <mike@datachannel.com>
To: 'Jim Lears' <JimL@Alphag.net>; xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Sunday, February 22, 1998 8:03 AM
Subject: RE: MS XML Parser on the Server


>On the MS platform, you can expose all your Java classes and interfaces as
COM interfaces if you use the ActiveX Wizard for Java (JAVAIDL.EXE). It'll
create an .IDL file (and  .C and .H files if you want to call the interfaces
from C/C++).
>All Java classes are exposed asl dual interfaces, derived from IDispatch,
which allows them to be called from all COM aware scripting languages
(JavaScript, VB for Automation, etc).
>
>If the Java classes are registered with Javareg (using the CLSIDs from the
generated .IDL file) on the server, you can use the package name rather than
a CLSID.
>To create a Java object, you might try prepending 'java:' on the package
name.
> Server.CreateObject("java:com.ms.xml.om.Document")
>
>Hope this helps...
>
>Mike D
>DataChannel
>
>-----Original Message-----
>From: Jim Lears [SMTP:JimL@Alphag.net]
>Sent: Saturday, February 21, 1998 3:36 PM
>To: xml-dev@ic.ac.uk
>Subject: RE: MS XML Parser on the Server
>
>Server.CreateObject in VBScript is used for creating instances of COM
>objects. The Java XML Parser doesn't expose any COM interfaces...notably
>IClassFactory which is used to instantiate COM objects. The C++ version
>is what you need...its an ActiveX control. The source code for both
>parsers is available. If you insist on using the Java version, you could
>mod it up to sport a COM interface..
>
>
>Helping To Destroy The English Language
>
> -----Original Message-----
> From: Mike Wagner [SMTP:mwagner@ets.org]
> Sent: Friday, February 20, 1998 3:33 PM
> To: xml-dev@ic.ac.uk
> Subject: MS XML Parser on the Server
>
> Has anybody managed to get the Microsoft Java XML Parser running
>as a
> component accessible by ASP under IIS? I tried what seemed to me
>to be the
> obvious approach and that didn't work. I copied the java classes
>to the
> TrustLib directory, then registered them with javareg. (An
>excerpt of the
> BAT I used file is at the end of this message). However, when I
>try a
> simple Server.CreateObject("com.ms.xml.om.Document") call in an
>ASP page,
> it dies with the following error:
>
> Microsoft JScript runtime error '800a01ad'
>
> Automation server can't create object
>
> /xmltest.asp, line 14
>
> Any insights? Thanks.
>
> Mike Wagner
> Educational Testing Service
> mwagner@ets.org
>
> -----------------Javareg BAT file--------------------
> cd \winnt\java\trustlib\com\ms\xml\dso
> javareg /register /class:SchemaNode /progid:com.ms.xml.dso.SchemaNode
> cd \winnt\java\trustlib\com\ms\xml\dso
> javareg /register /class:XMLDSO /progid:com.ms.xml.dso.XMLDSO
> cd \winnt\java\trustlib\com\ms\xml\dso
> javareg /register /class:XMLParserThread /progid:com.ms.xml.dso.XMLParserThread
> cd \winnt\java\trustlib\com\ms\xml\dso
> javareg /register /class:XMLRowsetProvider /progid:com.ms.xml.dso.XMLRowsetProvider
> cd \winnt\java\trustlib\com\ms\xml\om
>
>
>


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From M.H.Kay at eng.icl.co.uk  Mon Feb 23 11:25:19 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:11 2004
Subject: MS XML Parser on the Server
Message-ID: <01bd404d$ad63cbe0$1e09e391@mhklaptop.bra01.icl.co.uk>

>Has anybody managed to get the Microsoft Java XML Parser running as a
>component accessible by ASP under IIS?

I tried and failed, probably because I was doing it wrong; then I rewrote
my app using SAX (over AElfred) and have this working fine under ASP.
I first tried Javasoft's ActiveX Bridge, which I couldn't get to work
except for the most trivial single-class JavaBeans; then I tried
javareg and got it working, at least once I had worked out how to ensure
that the class path setting for the Microsoft Java VM was right. I found it
useful to test the thing with a little VB app, as the environment is more
controllable. I found it necessary to pay some attention to exception
handling: if you don't catch the things, they have a habit of crashing the
ActiveX container, i.e. the web server.

To keep things simple, I wrote a simple wrapper class for my application
which exposed all the interfaces I needed in the ASP script and nothing
else, and it was this wrapper class that I registered using javareg. The
underlying Java classes, so long as they are on the classpath, do not need
to be registered.

My javareg call was

javareg /register /class:com.icl.saxon.showXML /progid:ShowXML.Java

and the CreateObject call (in VBScript) was:

Set app = CreateObject("ShowXML.Java")

I haven't tried calling back from the Java code to ActiveX objects (e.g.
calling Response.Write) but it should work in theory. Instead I put the
output in String variables which the ASP page retrieves explicitly using
methods on ShowXML. Not elegant, but I was deliberately minimising the
number of things that might go wrong. I also haven't tried anything
complicated with collections or enumerations.
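
The wrapper approach described above might look roughly like the following
Java sketch. It is only an illustration under stated assumptions: the class
name ShowXmlWrapper, the methods process/getOutput/getError and the render()
stand-in are all invented here and are not the actual ShowXML code. The point
is that every exception is caught inside the wrapper and the results are
handed back as plain Strings.

```java
// Hypothetical wrapper class: the only class that would be registered
// with javareg. All names here are invented for illustration.
class ShowXmlWrapper {
    private String output = "";
    private String error = "";

    // Hypothetical stand-in for the real SAX-driven rendering code.
    protected String render(String systemId) throws Exception {
        if (systemId == null || systemId.isEmpty()) {
            throw new IllegalArgumentException("no document given");
        }
        return "<p>rendered " + systemId + "</p>";
    }

    // Returns true on success; never lets an exception escape to the host.
    public boolean process(String systemId) {
        output = "";
        error = "";
        try {
            output = render(systemId);
            return true;
        } catch (Throwable t) {
            // Catch everything so the container never sees the failure.
            error = String.valueOf(t.getMessage());
            return false;
        }
    }

    // The calling script fetches the results explicitly as Strings.
    public String getOutput() { return output; }
    public String getError() { return error; }
}
```

The script would call process() and then read either getOutput() or
getError(), so nothing but Strings and a boolean crosses the boundary.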

Hope that helps,

Mike Kay, ICL




From M.H.Kay at eng.icl.co.uk  Mon Feb 23 11:48:07 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:11 2004
Subject: Automating Search Interfaces
Message-ID: <01bd4050$b9604f60$1e09e391@mhklaptop.bra01.icl.co.uk>

>I would like to talk about the location of the person making the search
>versus the location of the product or service provider

Geographic/spatial queries are a well-researched topic in the database
literature. Free text retrieval is definitely a weak approach, though people
attempt it by using thesaurus facilities to represent the structure of a
gazetteer. In most of the practical systems I have seen, spatial query is
done using postal codes: the system needs knowledge of which postal
districts are near each other. (We also use such techniques for scheduling
the itinerary of service engineers.)
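
As a rough illustration of the postal-code technique, the Java sketch below
keeps a table of which districts are adjacent and returns the providers in
the searcher's district or a neighbouring one. The class PostalSearch, its
methods and any district codes fed to it are made up for the example; a real
system would load the adjacency data from a gazetteer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of postal-district proximity search.
class PostalSearch {
    // Adjacency table: district -> neighbouring districts.
    private final Map<String, Set<String>> neighbours =
            new HashMap<String, Set<String>>();
    // Provider table: provider name -> district.
    private final Map<String, String> providers =
            new HashMap<String, String>();

    void addNeighbours(String district, String... near) {
        neighbours.put(district, new HashSet<String>(Arrays.asList(near)));
    }

    void addProvider(String name, String district) {
        providers.put(name, district);
    }

    // Providers in the given district or in an adjacent one, sorted by name.
    List<String> near(String district) {
        Set<String> wanted = new HashSet<String>();
        wanted.add(district);
        Set<String> adjacent = neighbours.get(district);
        if (adjacent != null) {
            wanted.addAll(adjacent);
        }
        List<String> hits = new ArrayList<String>();
        for (Map.Entry<String, String> e : providers.entrySet()) {
            if (wanted.contains(e.getValue())) {
                hits.add(e.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }
}
```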
     
>A hotel room is a 'chambre' in french. If I search for a hotel room in
>Italy, I don't know the word for room in italian...

Multilingual search is well researched and seems to work reasonably well.
The more difficult problem is to distinguish agencies that can book you a
hotel room from newsletter articles by people enthusing about what a
wonderful hotel room they were staying in: I think this is why there will
always be added value in manual categorization and indexing services.

Mike Kay
     
From hb at ix.heise.de  Mon Feb 23 13:01:46 1998
From: hb at ix.heise.de (Henning Behme)
Date: Mon Jun  7 17:00:11 2004
Subject: Ad: small app + article (XML/DSSSL).
References: <3.0.1.16.19980214135027.63dfb77c@pop3.demon.co.uk>
Message-ID: <34F172D9.15712E4A@ix.heise.de>

Hi,

we (iX Magazine in Germany) have put an article online (in German,
though - I'll try to provide an English version asap) that introduces a
small XML application and shows how its data is converted into
HTML using James Clark's Jade. The app is a tiny attempt to display
literary history in terms of authors (when born &c.), and the article
explains two DSSSL style sheets which a) show the toc and b) list details
of a (chosen) author. Those of you who read German may try (if interested :-)

http://www.heise.de/ix/artikel/1998/03/156/

The app itself is online, too (toc and single author for now; I am
working on century-oriented lists and the like)

http://www.heise.de/ix/raven/Web/xml/lit

toc is static, author is done on the fly using Jade. I thought it would
be better this way than to generate all the files for the authors,
although this, of course, means waiting for a short while :-)

Best regards,

hb

--
Henning Behme

iX - Magazin fuer professionelle Informationstechnik
Helstorfer Str. 7 * 30625 Hannover * Germany
http://www.heise.de/ix/ * +49 511 5352-374 * -361 (Fax)
------ White, adj. and n. Black  (Ambrose Bierce) ------




From gmckenzi at JetForm.com  Mon Feb 23 14:48:18 1998
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:00:11 2004
Subject: finalising org.sax.xml.Parser
Message-ID: <c=CA%a=_%p=JetForm%l=ROSSINI-980223144227Z-20917@rossini.jetform.com>


David,

While PUBLIC may not be going anywhere fast, I'd prefer that the parse()
call-level support for it be left in SAX.  I intend to make ad-hoc use
of it internally (rolling my own catalogs and such).
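
A home-grown catalog of the kind mentioned above could be as small as the
following Java sketch. The class PublicIdCatalog and its methods are invented
for illustration and are not part of SAX; the idea is simply to map a public
ID to a local system ID and to fall back to the system ID that was supplied
when there is no entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical home-grown catalog: maps public IDs to local system IDs.
class PublicIdCatalog {
    private final Map<String, String> entries = new HashMap<String, String>();

    // Collapse runs of white space to single spaces and trim the ends,
    // so that differently formatted public IDs compare equal.
    static String normalize(String publicId) {
        return publicId.trim().replaceAll("\\s+", " ");
    }

    void register(String publicId, String systemId) {
        entries.put(normalize(publicId), systemId);
    }

    // Prefer the catalog entry; fall back to the supplied system ID.
    String resolve(String publicId, String systemId) {
        if (publicId == null) {
            return systemId;
        }
        String mapped = entries.get(normalize(publicId));
        return mapped != null ? mapped : systemId;
    }
}
```

An application would call resolve() on the two identifiers before handing
the resulting system ID to the parser.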

I support your other proposed additions to the interface.

Gavin.

>-----Original Message-----
>From:	David Megginson [SMTP:ak117@freenet.carleton.ca]
>Sent:	Sunday, February 22, 1998 10:13 PM
>To:	xml-dev Mailing List
>Subject:	SAX: finalising org.sax.xml.Parser
>
>It's time to finalise SAX before there is such a big code base that we
>can no longer make changes.  (Thanks, by the way, to James Clark,
>DataChannel, and IBM for including native SAX support in their XML
>parsers).  During this phase, I'd like to make the _minimum_ changes
>necessary for SAX to define a consistent and simple common functionality
>for XML parsers.
>
>Let's start with the Parser interface.  I'll use Java syntax because,
>while I can read IDL, I don't trust myself to write it:
>
>
>[current interface]
>------------------------------------------------------------------------
>  package org.xml.sax;
>
>  public interface Parser {
>
>   public void setEntityHandler (EntityHandler handler);
>   public void setDocumentHandler (DocumentHandler handler);
>   public void setErrorHandler (ErrorHandler handler);
>
>   public void parse (String publicID, String systemID)
>     throws java.lang.Exception;
>
>  }
>------------------------------------------------------------------------
>
>
>After considering the various discussions over the past few weeks, I
>propose that we make the following changes:
>
>1) Add a parse() method that accepts a stream.
>
>2) Add a parse() method that accepts a character buffer.
>
>3) Remove public ID from the current parse() method (I don't think
>   public IDs are going anywhere fast in XML).
>
>With these changes, the interface would look like this in Java:
>
>
>[proposed changes]
>------------------------------------------------------------------------
>  package org.xml.sax;
>  import java.io.InputStream;
>
>  public interface Parser {
>
>   public void setEntityHandler (EntityHandler handler);
>   public void setDocumentHandler (DocumentHandler handler);
>   public void setErrorHandler (ErrorHandler handler);
>
>   public void parse (String uri)
>     throws java.lang.Exception;
>   public void parse (InputStream is, String baseURI)
>     throws java.lang.Exception;
>   public void parse (char ch[], int start, int length, String baseURI)
>     throws java.lang.Exception;
>
>  }
>------------------------------------------------------------------------
>
>
>NOTES:
>
>a. The baseURI argument is necessary for streams and character buffers
>   in case either contains a relative URI.  You can supply a null
>   value if the document entity will not contain relative URIs.
>
>b. All programming languages initially targeted by SAX (Java, C++, C,
>   Perl) have some concept of input streams; if we come up against one
>   that doesn't, it can simply omit the relevant method.
>
>c. The start and length arguments are necessary with the character
>   buffer in case the XML document is part of a larger array.
>
>
>Does this give reasonable functionality without limiting the
>architectural approaches of parser writers?  Remember that individual
>implementations can extend this interface, but the interface
>represents the minimum common functionality that every SAX-conformant
>parser (eventually) provides.
>
>
>Thanks, and all the best,
>
>
>David
>
>-- 
>David Megginson                 ak117@freenet.carleton.ca
>Microstar Software Ltd.         dmeggins@microstar.com
>      http://home.sprynet.com/sprynet/dmeggins/
>


From mecom-gmbh at mixx.de  Mon Feb 23 14:57:29 1998
From: mecom-gmbh at mixx.de (james anderson)
Date: Mon Jun  7 17:00:11 2004
Subject: xml-based protocol (axtp)
References: <3.0.32.19980222200447.00a05330@postman.osf.org>
Message-ID: <34F18E5E.AB73276E@mixx.de>

this (and the object stream <-> xml conversion) looks interesting. is there a
tar/zipped/...'d version anywhere?

Bill la Forge wrote:

> AXTP: Application eXtensible Transactional Protocol (UDP based)
> http://www.camb.opengroup.org/~laforge/axtp/




From tyler at infinet.com  Mon Feb 23 15:28:08 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:11 2004
Subject: xml-based protocol
References: <3.0.32.19980222200447.00a05330@postman.osf.org>
Message-ID: <34F19695.17905F99@infinet.com>

Bill la Forge wrote:

> Finally, AXTP is using xml for the wire protocol.
> (I've also created some documentation.)
>
> AXTP: Application eXtensible Transactional Protocol (UDP based)
> http://www.camb.opengroup.org/~laforge/axtp/

This looks interesting except that the TransactionFactory interface has some
ridiculous names for the methods like createA(), createN(), etc. etc.  For one
simple interface, I think that worrying about class file size is a waste of time
when compared to having methods and constants which are readable and
understandable.

Tyler




From donpark at quake.net  Mon Feb 23 15:29:47 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:11 2004
Subject: finalising org.sax.xml.Parser
Message-ID: <001801bd406f$18cbd140$2ee044c6@donpark>

David,

I agree with most of the changes, especially the KISS solution to the
multiple-input-type problem.

I have just two recommendations:

1. Keep Public ID.
2. Use System ID instead of URI.

The end result is that we just have two new methods in Parser and no change
to existing methods.

My reasons are:

1. Who knows where that rubber chicken will come in handy?
2. It is trivial for a SAX parser implementor to extract baseURI from URI.
3. It is not trivial and rather confusing for a SAX user to figure out what
the base URI is.

So the method signatures would be:

    public void parse (String pubID, String sysID)
      throws java.lang.Exception;

    public void parse (String pubID, String sysID, InputStream is)
      throws java.lang.Exception;

    public void parse (String pubID, String sysID, char ch[], int offset, int length)
      throws java.lang.Exception;

PS: The parameter order was changed because I prefer appending new arguments
to prepending them.

For the new methods, pubID and sysID are used to tell the parser that "data
from the given stream or character array should be treated as if it came
from given pubID and sysID".
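
To illustrate the claim that a parser can derive the base from the system ID
itself, here is a small Java sketch using java.net.URI. The class name
BaseUriDemo and the example.org addresses are made up; the sketch only shows
that a relative reference can be resolved directly against the full system
ID, so the caller never has to work out a separate base URI.

```java
import java.net.URI;

// Hypothetical demo class; not part of SAX.
class BaseUriDemo {
    // Resolve a (possibly relative) reference against the system ID of
    // the entity in which the reference appears.
    static String resolve(String systemId, String reference) {
        return URI.create(systemId).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String sysID = "http://example.org/docs/book.xml";
        // Relative references are resolved against the directory of sysID.
        System.out.println(resolve(sysID, "chapter1.xml"));
        System.out.println(resolve(sysID, "../grafix/fig.gif"));
        // An absolute reference is returned unchanged.
        System.out.println(resolve(sysID, "http://other.example/x.dtd"));
    }
}
```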

Regards,

Don




From jmj at thomtech.com  Mon Feb 23 16:02:42 1998
From: jmj at thomtech.com (jmj@thomtech.com)
Date: Mon Jun  7 17:00:11 2004
Subject: MS XML Parser on the Server
Message-ID: <9802238882.AA888249744@ccgate.thomtech.com>


     Greetings!
     
     So where would I find the source code for the C++ version?  I haven't 
     been able to find it at the Microsoft site.
     
     Thanks!
     
     --Jim Jordan
     jmj@thomtech.com -- Thomson Technologies Lab Group
     From the sublime to the ridiculous is but a step.
      Napoleon Bonaparte - on the retreat from Moscow


______________________________ Reply Separator _________________________________
Subject: RE: MS XML Parser on the Server
Author:  Jim Lears <JimL@Alphag.net> at internet
Date:    2/21/98 6:35 PM


Server.CreateObject in VBScript is used for creating instances of COM 
objects. The Java XML Parser doesn't expose any COM interfaces...notably 
IClassFactory which is used to instantiate COM objects. The C++ version 
is what you need...it's an ActiveX control. The source code for both 
parsers is available. If you insist on using the Java version, you could 
mod it up to sport a COM interface..
     
     
Helping To Destroy The English Language
     
        -----Original Message-----
        From:        Mike Wagner [SMTP:mwagner@ets.org] 
        Sent:        Friday, February 20, 1998 3:33 PM 
        To:        xml-dev@ic.ac.uk
        Subject:        MS XML Parser on the Server
     
        Has anybody managed to get the Microsoft Java XML Parser running
        as a component accessible by ASP under IIS? I tried what seemed to
        me to be the obvious approach and that didn't work. I copied the
        java classes to the TrustLib directory, then registered them with
        javareg. (An excerpt of the BAT file I used is at the end of this
        message). However, when I try a simple
        Server.CreateObject("com.ms.xml.om.Document") call in an ASP page,
        it dies with the following error:
     
        Microsoft JScript runtime error '800a01ad'
     
        Automation server can't create object
     
        /xmltest.asp, line 14
     
        Any insights? Thanks.
     
        Mike Wagner
        Educational Testing Service
        mwagner@ets.org
     
        -----------------Javareg BAT file--------------------
        cd \winnt\java\trustlib\com\ms\xml\dso
        javareg /register /class:SchemaNode /progid:com.ms.xml.dso.SchemaNode
        cd \winnt\java\trustlib\com\ms\xml\dso
        javareg /register /class:XMLDSO /progid:com.ms.xml.dso.XMLDSO
        cd \winnt\java\trustlib\com\ms\xml\dso
        javareg /register /class:XMLParserThread /progid:com.ms.xml.dso.XMLParserThread
        cd \winnt\java\trustlib\com\ms\xml\dso
        javareg /register /class:XMLRowsetProvider /progid:com.ms.xml.dso.XMLRowsetProvider
        cd \winnt\java\trustlib\com\ms\xml\om
     
     


From ak117 at freenet.carleton.ca  Mon Feb 23 16:10:40 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:11 2004
Subject: finalising org.sax.xml.Parser
In-Reply-To: <001801bd406f$18cbd140$2ee044c6@donpark>
References: <001801bd406f$18cbd140$2ee044c6@donpark>
Message-ID: <199802231609.LAA01939@unready.microstar.com>

Don Park writes:

 > I agree with most of the changes especially the KISS solution to multiple
 > input type problem.
 > 
 > I have just two recommendations:
 > 
 > 1. Keep Public ID.
 > 2. Use System ID instead of URI.

That's two votes for keeping Public ID (and one for sticking with the
standard terminology for system IDs, instead of using the
Web-hacker-friendly "URI").  I would have no problem going with Don's
proposal, especially since it is identical to my discarded first
draft -- would anyone prefer _not_ to see public IDs, then?


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From mike at jmaca.com  Mon Feb 23 16:16:21 1998
From: mike at jmaca.com (Michael Emmel)
Date: Mon Jun  7 17:00:11 2004
Subject: Binary Data
References: <01bd3f3d$977564d0$ed0620ce@lbynum.esri.com>
Message-ID: <34F1A319.49A499DE@jmaca.com>

Okay, I have read the spec more carefully now that someone mentioned NDATA,
and I understand how the unparsed entity works.
What I still do not understand, and what seems to be undefined (at least
to me), is how the parser is restarted once an application consumes
an unparsed entity.


    ExternalID ::= 'SYSTEM' S SystemLiteral
                 | 'PUBLIC' S PubidLiteral S SystemLiteral
    NDataDecl  ::= S 'NDATA' S Name    [ VC: Notation Declared ]

Here is the description of the VC:

Validity Constraint: Notation Declared
    The Name must match the declared name of a notation.

    The SystemLiteral is called the entity's system identifier. It is a
    URI, which may be used to retrieve the entity. Note that the hash mark
    (#) and fragment identifier frequently used with URIs are not, formally,
    part of the URI itself; an XML processor may signal an error if a
    fragment identifier is given as part of a system identifier. Unless
    otherwise provided by information outside the scope of this
    specification (e.g. a special XML element type defined by a particular
    DTD, or a processing instruction defined by a particular application
    specification), relative URIs are relative to the location of the
    resource within which the entity declaration occurs. A URI might thus be
    relative to the document entity, to the entity containing the external
    DTD subset, or to some other external parameter entity.

    An XML processor should handle a non-ASCII character in a URI by
    representing the character in UTF-8 as one or more bytes, and then
    escaping these bytes with the URI escaping mechanism (i.e., by converting
    each byte to %HH, where HH is the hexadecimal notation of the byte
    value).

    In addition to a system identifier, an external identifier may include a
    public identifier. An XML processor attempting to retrieve the entity's
    content may use the public identifier to try to generate an alternative
    URI. If the processor is unable to do so, it must use the URI specified
    in the system literal. Before a match is attempted, all strings of white
    space in the public identifier must be normalized to single space
    characters (#x20), and leading and trailing white space must be removed.

    Examples of external entity declarations:


and here are some examples:

<!ENTITY open-hatch
         SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY open-hatch
         PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
         "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic
         SYSTEM "../grafix/OpenHatch.gif"
         NDATA gif >
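
The URI escaping rule in the spec text quoted above (a non-ASCII character is
represented in UTF-8 and each resulting byte is written as %HH) can be
sketched in Java as follows. The class and method names are invented for the
example, and the sketch deliberately leaves every ASCII character untouched,
since that is all the quoted passage asks for.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are not from any XML API.
class UriEscaper {
    // Escape every non-ASCII character as the %HH form of its UTF-8 bytes;
    // ASCII characters are passed through unchanged.
    static String escapeNonAscii(String uri) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < uri.length(); ) {
            int cp = uri.codePointAt(i);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                byte[] utf8 = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    // b & 0xFF converts the signed byte to 0..255.
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```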


This says to me that binary data is required either to be encoded to ASCII
to be included, or to have MIME-type boundaries for the XML tags (with the
binary data not containing the MIME boundaries) included in the document,
or to be obtained from an ASCII-normalized external URI link.
There is no way to tell an XML parser to skip x number of arbitrary bytes of
embedded unparsed entity data, which is consumed by the "application", and
then restart the parser at the next valid section.

Am I wrong?

Mike




From M.H.Kay at eng.icl.co.uk  Mon Feb 23 16:48:47 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:11 2004
Subject: The XML spec in XML: missing tags
Message-ID: <01bd407b$115e7500$1e09e391@mhklaptop.bra01.icl.co.uk>

I have been playing with the BNF rules in the XML spec as an exercise in XML
tagging.

I noticed that in the XML version of the XML spec, the non-terminal symbol
"S" is incorrectly tagged in rules 60, 62, and 63, and in consequence it is
not hyperlinked in the HTML version.

Some comments on the XML tagging in the BNF rules:
- it is useful to have the non-terminals tagged, though the way in which it
is done is a little clumsy, since the internal identifier and the visible name
of the non-terminal are necessarily in a one-to-one correspondence. The way
it is done seems designed primarily to enable a particular translation to
HTML.
- it is a shame that there is no tagging to distinguish terminal symbols
from metasymbols, since this would enable nicer renditions of the rules,
e.g. exploiting colour, without having to parse the BNF
- it would seem more logical for each rule to have a single <rhs>, with any
<vc> and <wfc> constraints being embedded within the <rhs>, rather than
these being separate elements interspersed among multiple <rhs> elements.

Two comments on the definition of notation in section 6:
- the distinction between non-terminals with an initial upper case and those
with an initial lower case is not at all clear (to me).
- the precedence of the metalanguage operators (e.g. that "A B | C" means
"(A B) | C") is not stated.

Thanks to Peter M-R for prompting me to look at this XML exemplar, it has
been very stimulating!

Mike Kay




From msuzio at ford.com  Mon Feb 23 16:50:27 1998
From: msuzio at ford.com (Michael J. Suzio)
Date: Mon Jun  7 17:00:11 2004
Subject: xml:space
References: <3.0.1.16.19980220223432.3ea70dee@pop3.demon.co.uk>
Message-ID: <199802231650.AA06071@mailfw1.ford.com>

What I wonder is, how does SAX decide what is ignorable
whitespace and what is significant?  I'm not clear on how that
works, and the role xml:space plays in defining that.  
Ignoring whitespace is one of the most tedious things I keep doing
in my XML parsing apps; I'd prefer to have to explicitly *work* to
keep whitespace.
What I don't understand is, given something like this in a DTD:

<!ELEMENT QUOTE (SOURCE?|LINE+|KEY+)>

Why wouldn't *any* character data located within
<QUOTE></QUOTE> (and not inside one of its child
elements) be ignorable?  I'd expect a parser seeing this:
<QUOTE>
 <SOURCE href="http://www.quotesrus.com/"/>
 <LINE>This is line 1 of the quote</LINE>
</QUOTE>

To ignore those carriage returns and extraneous spaces within the
QUOTE element, and just give me the SOURCE and LINE elements and
their content.

Sorry if this is a stupid question, but it has been bugging me the
last couple weeks.

-- 
Michael J. Suzio
Web Technical Standards, WWW & Internet Applications
(313) 24-88120
msuzio@eccms1.dearborn.ford.com / msuzio@ford.com



From msuzio at ford.com  Mon Feb 23 16:59:20 1998
From: msuzio at ford.com (Michael J. Suzio)
Date: Mon Jun  7 17:00:11 2004
Subject: finalising org.sax.xml.Parser
References: <001801bd406f$18cbd140$2ee044c6@donpark> <199802231609.LAA01939@unready.microstar.com>
Message-ID: <199802231658.AA08077@mailfw1.ford.com>

I think keeping the method with Public ID is fine, but if in
many cases we're just passing NULL as the first arg, why don't
we have a method which just accepts the system ID/URI?  I
myself have no use for Public ID, so I essentially always
just pass in NULL, which to me makes the code look confusing...

(I hate NULL/ignored parameters, especially as the first arg, I
usually rank args in order of "importance" to the method/procedure).
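
The convenience method suggested above could sit alongside the two-argument
form, roughly as in this Java sketch. ParserBase is a made-up name used only
for illustration; it is not part of the SAX proposal. The one-argument
parse() simply delegates with a null public ID, so callers who never use
public IDs do not have to write the null themselves.

```java
// Hypothetical base class; the name is not part of SAX.
abstract class ParserBase {
    // The two-argument form under discussion, with the public ID kept.
    public abstract void parse(String publicID, String systemID)
        throws java.lang.Exception;

    // Convenience form for callers that never use public IDs.
    public void parse(String systemID) throws java.lang.Exception {
        parse(null, systemID);
    }
}
```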

-- 
Michael J. Suzio
Web Technical Standards, WWW & Internet Applications
(313) 24-88120
msuzio@eccms1.dearborn.ford.com / msuzio@ford.com



From donpark at quake.net  Mon Feb 23 17:05:27 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:11 2004
Subject: Binary Data
Message-ID: <000f01bd407c$3d9e9f90$2ee044c6@donpark>

Michael,

Check out the XML-Binary demo at
http://www.quake.net/~donpark/SaxDomDemo/SaxDomDemo.html

Binary.xml file contains an element with embedded binary data.

I do not like the notation-based solution to binary data because it requires
DTD processing.  IMHO, high-performance XML applications will opt to ignore
the DTD because processing it requires additional resources and causes
processing hiccups.  XML-Binary is being designed around a set of reserved
attributes which tell you how the data was encoded (base64) and what the data
is (image/gif).  All this can be done easily by checking for the attributes in
a single-pass processing system.  It also allows specification of
multi-layer encoding of binary data, so that your application can easily tell
that an XML element contains a PostScript image which was compressed using ZIP
and then encoded using BASE64.
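
The multi-layer idea (compress, then BASE64) can be sketched in Java as
below. This is not the XML-Binary code and does not reproduce its attribute
names; the class LayeredBinary is invented for the example, and DEFLATE from
java.util.zip stands in for "ZIP". The sketch only shows that the layers are
applied in one order when writing and undone in the reverse order when
reading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper illustrating two encoding layers.
class LayeredBinary {
    // Encode: compress first, then BASE64, so the result is plain text
    // that can sit inside an XML element.
    static String encode(byte[] raw) throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        DeflaterOutputStream deflater = new DeflaterOutputStream(compressed);
        deflater.write(raw);
        deflater.close();
        return Base64.getEncoder().encodeToString(compressed.toByteArray());
    }

    // Decode: undo the layers in reverse order (BASE64, then inflate).
    static byte[] decode(String text) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(text);
        InputStream in =
                new InflaterInputStream(new ByteArrayInputStream(compressed));
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            raw.write(buf, 0, n);
        }
        in.close();
        return raw.toByteArray();
    }
}
```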

Don Park
http://www.quake.net/~donpark/index.html




From tyler at infinet.com  Mon Feb 23 17:15:43 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:11 2004
Subject: xml:space
References: <3.0.1.16.19980220223432.3ea70dee@pop3.demon.co.uk> <199802231650.AA06071@mailfw1.ford.com>
Message-ID: <34F1AF9F.77CD5499@infinet.com>

Michael J. Suzio wrote:

> What I wonder is, how does SAX decide what is ignorable
> whitespace and what is significant?  I'm not clear on how that
> works, and the role xml:space plays in defining that.
> Ignoring whitespace is one of the most tedious things I keep doing
> in my XML parsing apps, I'd prefer to have to explicitly *work* to
> keep whitespace.
> What I don't understand is, given something like this in a DTD:

I think for problems like this, the application should just filter it all out
itself which is very simple.

Here is an inefficient implementation that will do just that for you in Java,
for instance:

import java.util.StringTokenizer;

String data = "Fee           Fi          Fo\n\n\n        Fum\t\t\t    ";

// By default StringTokenizer splits on space, tab, newline, CR and form feed.
StringTokenizer st = new StringTokenizer(data);
StringBuffer buffer = new StringBuffer();
while (st.hasMoreTokens()) {
  buffer.append(st.nextToken());
  // Single space between tokens, none after the last one
  // (this also keeps empty input from failing).
  if (st.hasMoreTokens()) {
    buffer.append(' ');
  }
}

String result = buffer.toString();

The result should be "Fee Fi Fo Fum".




From ak117 at freenet.carleton.ca  Mon Feb 23 17:15:51 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:11 2004
Subject: xml:space
In-Reply-To: <199802231650.AA06071@mailfw1.ford.com>
References: <3.0.1.16.19980220223432.3ea70dee@pop3.demon.co.uk>
	<199802231650.AA06071@mailfw1.ford.com>
Message-ID: <199802231713.MAA02467@unready.microstar.com>

Michael J. Suzio writes:

 > What I wonder is, how does SAX decide what is ignorable
 > whitespace and what is significant?  I'm not clear on how that
 > works, and the role xml:space plays in defining that.  
 > Ignoring whitespace is one of the most tedious things I keep doing
 > in my XML parsing apps, I'd prefer to have to explicitly *work* to
 > keep whitespace.

SAX itself is not a program, but its interface allows DTD-driven
parsers to make the distinction described in clause 2.10 (AElfred
takes advantage of the distinction):

  2.10 White Space Handling

   In editing XML documents, it is often convenient to use "white space"
   (spaces, tabs, and blank lines, denoted by the nonterminal S in this
   specification) to set apart the markup for greater readability. Such
   white space is typically not intended for inclusion in the delivered
   version of the document. On the other hand, "significant" white space
   that should be preserved in the delivered version is common, for
   example in poetry and source code.
           
   An XML processor must always pass all characters in a document that
   are not markup through to the application. A validating XML processor
   must also inform the application which of these characters constitute
   white space appearing in element content.

Note that this has nothing to do with the `xml:space' attribute -- it
is your application, rather than the XML parser, that is allowed to
act on that one.

 > What I don't understand is, given something like this in a DTD:
 > 
 > <!ELEMENT QUOTE (SOURCE?|LINE+|KEY+)>
 > 
 > Why wouldn't *any* character data located within
 > <QUOTE></QUOTE> (and not inside one of its child
 > elements) be ignorable?  I'd expect a parser seeing this:
 > <QUOTE>
 >  <SOURCE href="http://www.quotesrus.com/">
 >  <LINE>This is line 1 of the quote</LINE>
 > </QUOTE>
 > 
 > To ignore those carriage returns and extraneous spaces within the
 > QUOTE element, and just give me the SOURCE and LINE elements and
 > their content.

Absolutely correct.  If your XML parser is DTD-driven (as AElfred is),
it should somehow flag the carriage returns and leading spaces in your
example as ignorable.  It is a major pain having to deal with this
kind of thing yourself, if your parser is not DTD-aware.
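
A minimal sketch of that distinction, using the SAX API as it was later
published (org.xml.sax, where the callback is named ignorableWhitespace) and
the JDK's bundled parser with validation switched on. The document is a
well-formed variant of the QUOTE example, with SOURCE closed as an empty
element and a sequence content model so that the instance is valid; the class
and field names are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class IgnorableWhitespaceDemo {

    // Well-formed variant of the QUOTE example with an internal DTD subset.
    static final String DOC =
          "<!DOCTYPE QUOTE [\n"
        + "  <!ELEMENT QUOTE (SOURCE?, LINE+)>\n"
        + "  <!ELEMENT SOURCE EMPTY>\n"
        + "  <!ATTLIST SOURCE href CDATA #REQUIRED>\n"
        + "  <!ELEMENT LINE (#PCDATA)>\n"
        + "]>\n"
        + "<QUOTE>\n"
        + "  <SOURCE href=\"http://www.quotesrus.com/\"/>\n"
        + "  <LINE>This is line 1 of the quote</LINE>\n"
        + "</QUOTE>\n";

    /** What the handler received. */
    static final class Result {
        int ignorableChars;                         // whitespace in element content
        final StringBuilder text = new StringBuilder(); // ordinary character data
    }

    static Result parse(String xml) throws Exception {
        final Result result = new Result();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // DTD-driven parsing: the parser knows QUOTE has element content.
        factory.setValidating(true);
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void ignorableWhitespace(char[] ch, int start, int length) {
                    result.ignorableChars += length;
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    result.text.append(ch, start, length);
                }
            });
        return result;
    }
}
```

With this setup the line breaks and indentation inside QUOTE arrive through
ignorableWhitespace, so the application only has to handle the text of LINE.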


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From mike at jmaca.com  Mon Feb 23 17:22:33 1998
From: mike at jmaca.com (Michael Emmel)
Date: Mon Jun  7 17:00:11 2004
Subject: Binary Data Resolved
References: <000f01bd407c$3d9e9f90$2ee044c6@donpark>
Message-ID: <34F1B1ED.FA5F78B4@jmaca.com>

Don Park wrote:

> Michael,
>
> Check out the XML-Binary demo at
> http://www.quake.net/~donpark/SaxDomDemo/SaxDomDemo.html
>
> Binary.xml file contains an element with embedded binary data.

Thanks!!
Another poster also suggested that the packaging of the various entities that
make up an XML document is outside of the XML spec.
I agree, so I think I'll work on my idea of a jar-like file with an XML header
(very cool, IMHO) and save the Base64 encoding for special circumstances.
There does need to be a standard way to transmit all the "static"
data that makes up a complete XML document and other complex data sources.


And thanks to all who helped me resolve this; it was very important to me.

Mike

mike@jmaca.com


Private post:

Subject: Re: Binary Data
Date: Mon, 23 Feb 1998 12:04:04 -0500
From: David Megginson <dmeggins@microstar.com>
To: mike@jmaca.com


Michael Emmel writes:

 > Failing that, you're left with coming up with a standard way to
 > "package" all internal links.

I think that that is by far a better solution -- kludges (like
embedding all objects in a single XML file) are sometimes necessary to
get something working, but we don't want to codify them in a spec if
we can avoid doing so.  A good, general Internet packaging protocol
would solve many problems both inside and outside XML.

In the mean time, you can use base64 if you really need to.


All the best,




From ricko at allette.com.au  Mon Feb 23 17:22:52 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:00:12 2004
Subject: Binary Data
Message-ID: <003701bd407f$afc15830$7b0b4ccb@NT.JELLIFFE.COM.AU>


From: Michael Emmel <mike@jmaca.com>

>This says to me that binary data is required to either be encoded to ascii
>to be included, or have Mime type boundaries for XML tags with binary data
>not containing the mime boundaries included.
>In the document or be obtained from a ascii normalized external URI link.


Binary data can only be included in a parseable entity if it is first
encoded in some way which
1) does not contain delimiters which may cause false triggering
2) does not contain any characters which the XML "SGML declaration"
says are unused (or shunned).
Base64 is one such encoding. Other encodings may be more efficient
if you have a 16-bit data stream.
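
A small sketch of why Base64 meets condition 1: its output alphabet is
limited to letters, digits, '+', '/' and '=', so it cannot contain '<', '&'
or "]]>". The check below encodes every possible byte value once and inspects
the result; the class and method names are illustrative.

```java
import java.util.Base64;

final class Base64MarkupSafety {

    /** True if the text contains a sequence that XML treats as a delimiter in content. */
    static boolean containsMarkupDelimiter(String s) {
        return s.indexOf('<') >= 0 || s.indexOf('&') >= 0 || s.contains("]]>");
    }

    /** True if every character belongs to the Base64 alphabet (including padding). */
    static boolean isBase64Alphabet(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                      || (c >= '0' && c <= '9') || c == '+' || c == '/' || c == '=';
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    /** Encodes each of the 256 byte values once and returns the Base64 text. */
    static String encodeAllByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return Base64.getEncoder().encodeToString(all);
    }
}
```

The raw bytes themselves include the values for '<' and '&', which is exactly
what would cause false triggering if they were embedded unencoded.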

The way to signal you are using an encoding is to use an element
with a notation attribute.

If you embed binary data with MIME type boundaries, you no longer
have a parseable XML entity, you have a MIME multipart file which
can be processed to generate an XML entity.

>There is no way to tell a XML parser to skip x number of arbitrary bytes of
>embedded unparsed entity data which is consumed by the "application" and
>then restart the parser at the next valid section.

An XML parser is not interested in the contents of a non-XML-parseable
entity. Indexing into binary data is either done before the parser (i.e. by
embedding the appropriate instructions in the system identifier of the
entity) or by the application after the parser.

>Am I wrong ???


What do you mean "restart the parser"?  Parsing continues after an entity
reference.

Rick Jelliffe




From msuzio at ford.com  Mon Feb 23 17:30:52 1998
From: msuzio at ford.com (Michael J. Suzio)
Date: Mon Jun  7 17:00:12 2004
Subject: xml:space
References: <3.0.1.16.19980220223432.3ea70dee@pop3.demon.co.uk>
		<199802231650.AA06071@mailfw1.ford.com> <199802231713.MAA02467@unready.microstar.com>
Message-ID: <199802231730.AA15010@mailfw1.ford.com>

OK, to be more precise, the problem I think I'm seeing is that,
using an XML example, like this:

<QUOTE>
  <SOURCE href="http://www.quotesrus.com/">
  <LINE>This is line 1 of the quote</LINE>
</QUOTE>

I would expect (using SAX) to receive an ignorable() event when
the end of the opening QUOTE tag is reached, and the "\n " string
found.  I'm not seeing that, using the DXP implementation.  Should
I?  I'm not sure if I see what circumstances actually alert
a parser that, yes, this whitespace is *not* significant.  I
know it is supposed to pass the data to the application, but the
data is also supposed to be flagged, correct?

-- 
Michael J. Suzio
Web Technical Standards, WWW & Internet Applications
(313) 24-88120
msuzio@eccms1.dearborn.ford.com / msuzio@ford.com



From jjc at jclark.com  Tue Feb 24 04:02:06 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <199802230313.WAA00386@unready.microstar.com>
Message-ID: <34F23D1B.E6172400@jclark.com>

>    public void parse (InputStream is, String baseURI)
>      throws java.lang.Exception;
>    public void parse (char ch[], int start, int length, String baseURI)
>      throws java.lang.Exception;

I don't think this last one is a good idea.  If you want something that
operates on a stream of characters as opposed to bytes, it should be

  void parse(Reader r, String baseURI)

Using an array of chars is as bad an idea as it would be to replace the
InputStream method with a method that operates on an array of bytes.

I am not convinced this really buys you anything.  It's easy enough to
write an InputStream that takes an array of chars and presents them as a
sequence of UTF-16 encoded bytes.  It also raises some problems since the
XML spec doesn't define the operation of a processor on a sequence of
chars.  For example, what if anything should the processor do with an
encoding declaration in this case?
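
A minimal sketch of such an InputStream, assuming big-endian byte order with
a leading byte order mark so that a consumer can detect UTF-16. The class
name is hypothetical. Since a Java char is already a UTF-16 code unit, each
char is simply emitted as two bytes.

```java
import java.io.InputStream;

/** Presents a slice of a char array as UTF-16 bytes (big-endian, with BOM). */
final class CharArrayUtf16InputStream extends InputStream {
    private final char[] chars;
    private final int end;   // index one past the last char to emit
    private int pos;         // index of the next char to emit
    private int state = -2;  // -2, -1: BOM bytes pending; 0: high byte next; 1: low byte next

    CharArrayUtf16InputStream(char[] chars, int start, int length) {
        if (start < 0 || length < 0 || length > chars.length - start) {
            throw new IndexOutOfBoundsException();
        }
        this.chars = chars;
        this.pos = start;
        this.end = start + length;
    }

    @Override
    public int read() {
        if (state == -2) {       // first BOM byte
            state = -1;
            return 0xFE;
        }
        if (state == -1) {       // second BOM byte
            state = 0;
            return 0xFF;
        }
        if (pos >= end) {
            return -1;           // end of stream
        }
        if (state == 0) {        // high byte of the current char
            state = 1;
            return (chars[pos] >> 8) & 0xFF;
        }
        state = 0;               // low byte, then advance to the next char
        return chars[pos++] & 0xFF;
    }
}
```

A stream like this can be handed to any parse(InputStream, ...) method. It
does not settle what a processor should do if the character data carries an
encoding declaration that names something other than UTF-16.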

If you don't want to put Reader in, to avoid a dependency on JDK 1.1, I
would suggest simply leaving this out for now.

James




From Jon.Bosak at Eng.Sun.COM  Tue Feb 24 05:13:13 1998
From: Jon.Bosak at Eng.Sun.COM (Jon Bosak)
Date: Mon Jun  7 17:00:12 2004
Subject: Last call for submissions: XML Developers' Day
Message-ID: <199802240511.VAA29721@boethius.eng.sun.com>

Reminder: the deadline for submissions is this Friday, February 27.
See the original notice below for details.

Jon

========================================================================

CALL FOR PRESENTATIONS: XML DEVELOPERS' DAY 1998.03.27

A one-day technical conference for XML developers will be held Friday,
March 27, in Seattle, Washington.  The event constitutes the last day
of the GCA XML Conference (http://www.gca.org/conf/xmlcon98/).

XML Developers' Day is a single-track event devoted entirely to
technical reports on the latest developments in XML implementation.
If you are engaged in the construction of any software that works with
XML -- converters, parsers, servers, browsers, editors, or XML-based
vertical applications -- here is your chance to share your work with
an audience that can understand and appreciate it.

Since stylesheet-based rendering is part of XML publishing, developers
of tools that support XSL or DSSSL are invited to show their latest
offerings as well.  We're also open to presentations on XML-based
languages (CML, OFX, etc.)  and related efforts that might have a
significant impact on the future of XML (RDF, XML-Data, etc.) if they
are of particular interest to XML developers.

Vendors of commercial tools can participate, but they must confine
their presentations to the technical aspects of current XML products
in development.  Table space will be made available for the
distribution of product announcements and commercial literature.

REGISTRATION

The registration fee for XML Developers' Day is $275 for GCA members
and $390 for non-GCA members (see the registration page below for
conference and tutorial rates).  This is mighty inexpensive for an
inside update on the very latest activity in this field.  You can
register at

   http://www.gca.org/conf/xmlcon98/registra.htm

N.B.: Presenters get in free.

CALL FOR PRESENTATIONS

If you would like to give a report at this event, send a paragraph or
two describing your presentation, based on a conservative estimate of
the status of your project as it will stand on March 27, to Jon Bosak
(bosak@eng.sun.com).  Also include a description of the audio-visual
equipment you will need for your presentation and an estimate of its
duration.  Please include the phrase "XML Dev Day" somewhere in the
subject line of your message.

Since we want up-to-the-minute reports on activities in progress,
there will be no published proceedings, and therefore you need not
submit your entire presentation in advance.  But please try to make
your forecasted description as accurate as possible so that we can
choose the most interesting and relevant submissions.

The deadline for submissions is Friday, February 27.

Jon

----------------------------------------------------------------------
 Jon Bosak, Online Information Technology Architect, Sun Microsystems
    901 San Antonio Road, MPK17-101, Palo Alto, California 94043
----------------------------------------------------------------------
   If a man look sharply and attentively, he shall see Fortune; for
   though she be blind, yet she is not invisible.  -- Francis Bacon
----------------------------------------------------------------------



From donpark at quake.net  Tue Feb 24 06:09:02 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
Message-ID: <002401bd40e9$fde8c510$2ee044c6@donpark>

>I don't think this last one is a good idea.  If you want something that
>operates on a stream of characters as opposed to bytes, it should be
>
>  void parse(Reader r, String baseURI)
>
>Using an array of chars is as bad an idea as it would be to replace the
>InputStream method with a method that operates on an array of bytes.
>
>I am not convinced this really buys you anything.  It's easy enough to
>write an InputStream that takes an array of chars and presents them as a
>sequence of UTF-16 encoded bytes.  It also raises some problems since the
>XML spec doesn't define the operation of a processor on a sequence of
>chars.  For example, what if anything should the processor do with an
>encoding declaration in this case?

If I remember correctly, what David is trying to do is provide us with a means
to parse XML data from a byte stream as well as a character stream.  Since
Reader will actually hide the byte-based aspect of the data stream, it is
inappropriate for our purpose.

An XML character stream is also very useful when XML data is generated and
processed within a framework.  In such a system, converting a character
stream to a byte stream and then converting it back to a character stream is
unnecessary.

As far as what to do with encoding information when dealing with character
streams, will there be any problem if SAX just ignored it?

Regards,

Don Park
http://www.quake.net/~donpark/index.html




From M.H.Kay at eng.icl.co.uk  Tue Feb 24 10:59:34 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:12 2004
Subject: finalising org.sax.xml.Parser
Message-ID: <01bd4113$77fa0520$1e09e391@mhklaptop.bra01.icl.co.uk>

>>From: David Megginson [SMTP:ak117@freenet.carleton.ca]
[heavily cut]
>>After considering the various discussions over the past few weeks, I
>>propose that we make the following changes:
>>
>>1) Add a parse() method that accepts a stream.
>>2) Add a parse() method that accepts a character buffer.
>>With these changes, the interface would look like this in Java:
>>

>>   public void parse (InputStream is, String baseURI)
>>     throws java.lang.Exception;
>>   public void parse (char ch[], int start, int length, String baseURI)
>>     throws java.lang.Exception;
>>NOTES:
>>
>>a. The baseURI argument is necessary for streams and character buffers
>>   in case either contains a relative URI.  You can supply a null
>>   value if the document entity will not contain relative URIs.
>>
Comments:
1. Is the (ch, start, length) method really necessary, given that one can
supply a StringReader or whatever to the parse(InputStream) method?
2. If my "main" XML document is in a record in a database, then it is very
likely that any other entities referred to will be in the database as well.
Therefore, I think the logical approach in this situation is for the
application to resolve all URIs encountered: the parser should call the
application supplying a URI and the application should return an InputStream
to allow the parser to read it. This should presumably be done via the
EntityHandler interface.

And a question: is there a recommended way to abort a parse once the
application has got the information it needs (e.g extracting the contents of
the TITLE element)? Would an interface like parser.abort() be cleaner than
playing around with exceptions? I ask because in handling the results of a
free text search, I am parsing all the retrieved documents when I only need
a bit of text from the beginning of each, and this is obviously wasteful. I
thought perhaps of supplying a stream and generating a premature
end-of-file, and then trapping the exception that comes back.

Regards, Mike Kay




From M.H.Kay at eng.icl.co.uk  Tue Feb 24 11:55:28 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:12 2004
Subject: finalising org.sax.xml.Parser
Message-ID: <01bd411b$2f327400$1e09e391@mhklaptop.bra01.icl.co.uk>

>Would anyone prefer _not_ to see public IDs, then?

I'm not fundamentally opposed to them, but I can't see much point in them
either. The XML spec defines no semantics for a public identifier and we are
left to guess that it might have a similar meaning to a similar construct in
SGML. They are one of the bits of SGML legacy which should have been taken
out. As they're in XML it might make sense to support them in SAX: the
problem is that if you do so, you have to say what they mean.

(Actually, system identifiers aren't very well explained either: we are told
they are URIs and there's no definitive statement of what a URI is. The
difference is that most readers can guess.)

Mike Kay




From ak117 at freenet.carleton.ca  Tue Feb 24 13:48:16 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
In-Reply-To: <002401bd40e9$fde8c510$2ee044c6@donpark>
References: <002401bd40e9$fde8c510$2ee044c6@donpark>
Message-ID: <199802241346.IAA00395@unready.microstar.com>

Don Park writes:

 > If I remember correctly, what David is trying to do is provide us with means
 > to parse XML data from a byte stream as well as character stream.  Since
 > Reader will actually hide the byte-based aspect of the data stream, it is
 > inappropriate for our purpose.
 >
 > XML character stream is also very useful when XML data is generated and
 > processed within a framework.  In such a system, converting character
 > streams to byte stream and then converting it back to character stream is
 > unnecessary.

This is true, but I think that James's point is well taken.  The
character _buffer_ doesn't really buy us anything.  I am reluctant to
use a character reader for two reasons:

1) It is a concept that doesn't translate well to languages other than
   Java (or even to Java 1.0.2 for that matter).

2) It imposes another architectural requirement on SAX-conformant
   parsers (the ability to receive characters directly, bypassing the
   normal input mechanisms), and I'm trying to keep interference to a
   minimum.

It is slightly inefficient to go from characters to a byte stream to
characters, but it's not that bad (especially if we use ISO-8859-1 or
UCS-2 for the encoding), and it keeps SAX simple and general.  Given
the discussion so far, then, we are ending up with something like
this:

  package org.xml.sax;
  import java.io.InputStream;

  public interface Parser {

    public abstract void setEntityHandler (EntityHandler handler);
    public abstract void setDocumentHandler (DocumentHandler handler);
    public abstract void setErrorHandler (ErrorHandler handler);

    public abstract void parse (String publicId, String systemId)
      throws java.lang.Exception;
    public abstract void parse (String publicId, String systemId,
                                InputStream inputStream)
      throws java.lang.Exception;

  }

If you need more, you can always extend the interface:

  package com.acme.xml;
  import java.io.Reader;

  public interface SuperParser extends org.xml.sax.Parser {

    public abstract void parse (String publicId, String systemId,
                                Reader reader)
      throws java.lang.Exception;

  }

In an ideal world, we'd also have some kind of ability to ask the
parser to turn validation on or off, but I'm not certain that that's
practical: any thoughts?


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Tue Feb 24 13:59:52 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: finalising org.sax.xml.Parser
In-Reply-To: <01bd4113$77fa0520$1e09e391@mhklaptop.bra01.icl.co.uk>
References: <01bd4113$77fa0520$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <199802241358.IAA00435@unready.microstar.com>

Michael Kay writes:

 > Comments:
 > 1. Is the (ch, start, length) method really necessary, given that one can
 > supply a StringReader or whatever to the parse(InputStream) method?

James has convinced me that it's not -- I'm actually happy to drop it,
since I want to keep the interfaces as simple as possible both to
learn and to implement.

 > 2. If my "main" XML document is in a record in a database, then it is very
 > likely that any other entities referred to will be in the database as well.
 > Therefore, I think the logical approach in this situation is for the
 > application to resolve all URIs encountered: the parser should call the
 > application supplying a URI and the application should return an InputStream
 > to allow the parser to read it. This should presumably be done via the
 > EntityHandler interface.

I have considered this approach, but I can anticipate two problems:

1) It puts the burden of resolving URIs on the application rather than
   the parser.

2) It is possible that some programming languages or libraries do not
   represent network connections as input streams.

If (2) isn't a problem, we might find a way to work around (1).  I'll
be coming back to the EntityHandler interface in a future posting, and
we can take up the issue again then.
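
A sketch of one way around (1), written against the SAX API as it was later
published rather than the draft EntityHandler discussed here: the
resolveEntity callback lets the application supply an input source for the
entities it knows about and return null for everything else, which leaves
those to the parser. The in-memory map stands in for a database, and the
system identifier and class name are hypothetical.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class DatabaseEntityDemo {

    /** Stand-in for a database: system identifier -> entity text. */
    static final Map<String, String> STORE = new HashMap<>();
    static {
        STORE.put("http://db.example.invalid/chapter1", "<para>Chapter one</para>");
    }

    /** Parses the document and returns all character data, entities expanded. */
    static String parseAndCollectText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                String stored = STORE.get(systemId);
                if (stored == null) {
                    return null; // not ours: the parser resolves the URI itself
                }
                InputSource source = new InputSource(new StringReader(stored));
                source.setSystemId(systemId);
                return source;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }
}
```

Whether external entities are fetched at all depends on the parser's
configuration, so treat this as an illustration of the division of labour
rather than a portable recipe.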

 > And a question: is there a recommended way to abort a parse once the
 > application has got the information it needs (e.g extracting the contents of
 > the TITLE element)? Would an interface like parser.abort() be cleaner than
 > playing around with exceptions? I ask because in handling the results of a
 > free text search, I am parsing all the retrieved documents when I only need
 > a bit of text from the beginning of each, and this is obviously wasteful. I
 > thought perhaps of supplying a stream and generating a premature
 > end-of-file, and then trapping the exception that comes back.

In languages that support exceptions (Java, C++, Perl, and sort-of C),
an exception is probably the cleanest way to handle this.  It also
lets you pass application-specific information back to the top level
within your exception.
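
A minimal sketch of the exception approach, written against the SAX API as
it was later published. The handler throws a marker exception as soon as the
TITLE element ends, and the caller reads the title out of it. TitleFound and
TitleExtractor are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class TitleExtractor {

    /** Marker exception that carries the extracted title to the top level. */
    static final class TitleFound extends SAXException {
        private static final long serialVersionUID = 1L;
        final String title;

        TitleFound(String title) {
            super("title found, parse aborted on purpose");
            this.title = title;
        }
    }

    /** Returns the content of the first TITLE element, or null if there is none. */
    static String firstTitle(String xml) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside TITLE

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("TITLE".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                if ("TITLE".equals(qName)) {
                    throw new TitleFound(text.toString()); // stop parsing here
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (TitleFound found) {
            return found.title;
        }
        return null;
    }
}
```

Using a dedicated exception type keeps a deliberate abort distinguishable
from a genuine parse error, which still propagates as an ordinary exception.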


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Tue Feb 24 14:24:39 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: multiple handlers
Message-ID: <199802241423.JAA00516@unready.microstar.com>

In a private message, one SAX user raised the issue again of multiple
handlers.  The user suggested the situation where someone wants to
extract information from a document _and_ copy the document to an
OutputStream at the same time: for a clean implementation, each of
these should be in a different handler.

During the last round, most people vetoed this idea.  Here it is
again, though, for your consideration:

  package org.xml.sax;
  import java.io.InputStream;

  public interface Parser {

    public void addEntityHandler (EntityHandler handler);
    public void removeEntityHandler (EntityHandler handler);

    public void addDocumentHandler (DocumentHandler handler);
    public void removeDocumentHandler (DocumentHandler handler);

    public void addErrorHandler (ErrorHandler handler);
    public void removeErrorHandler (ErrorHandler handler);

    public void parse (String publicId, String systemId)
      throws java.lang.Exception;

    public void parse (String publicId, String systemId,
                       InputStream inputStream)
      throws java.lang.Exception;

  }
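
For comparison, a sketch of how the same use case can be served with a
single-handler parser: one handler that forwards events to several targets.
It is written against the SAX API as it was later published (ContentHandler,
DefaultHandler), forwards only three callbacks for brevity, and all class
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Forwards element and character events to every registered handler. */
final class FanOutHandler extends DefaultHandler {
    // Copy-on-write, so a target may add or remove handlers during a callback.
    private final List<ContentHandler> targets = new CopyOnWriteArrayList<>();

    void addHandler(ContentHandler handler) {
        targets.add(handler);
    }

    void removeHandler(ContentHandler handler) {
        targets.remove(handler);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        for (ContentHandler h : targets) {
            h.startElement(uri, local, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        for (ContentHandler h : targets) {
            h.endElement(uri, local, qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        for (ContentHandler h : targets) {
            h.characters(ch, start, length);
        }
    }
}

/** Example target: records the names of the elements it sees. */
final class ElementNameCollector extends DefaultHandler {
    final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        names.add(qName);
    }
}

/** Example target: copies character data, standing in for the OutputStream copy. */
final class TextCopier extends DefaultHandler {
    final StringBuilder copy = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        copy.append(ch, start, length);
    }
}
```

The trade-off is where the bookkeeping lives: add/remove methods on the
parser put it in every parser implementation, while a fan-out handler keeps
the parser interface small and leaves the dispatch to application code.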

Any further thoughts on this issue?


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From jmodre at edu.uni-klu.ac.at  Tue Feb 24 14:31:59 1998
From: jmodre at edu.uni-klu.ac.at (Juergen Modre)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <199802230313.WAA00386@unready.microstar.com>
Message-ID: <34F2E818.1FC4A30B@edu.uni-klu.ac.at>

David Megginson wrote:
> After considering the various discussions over the past few weeks, I
> propose that we make the following changes:
> 
> 1) Add a parse() method that accepts a stream.
Fully agree.

> 2) Add a parse() method that accepts a character buffer.
I have similar thoughts to James and therefore don't really see the need for it.
To parse parts of a larger document, the char[] can always be
converted to an InputStream to be used with 1).
But maybe your intention goes in another direction.

> 3) Remove public ID from the current parse() method (I don't think
>    public IDs are going anywhere fast in XML).
I propose to have a publicID.
E.g. the XML parser DXP supports public identifiers.

> With these changes, the interface would look like this in Java:
>    public void parse (String uri)
>      throws java.lang.Exception;
SGML/XML-friendly "systemId" vs. Web-hacker-friendly "URI" as parameter name:
 I personally don't care too much about the name; both are appropriate.
 Maybe in a method with publicId the name "systemId" is more readable.
 Both names are fine as long as they are well described/documented
 (e.g. in the javadoc header in Java) to explain the meaning to everybody.
         
> NOTES:
> 
> a. The baseURI argument is necessary for streams and character buffers
>    in case either contains a relative URI.  You can supply a null
>    value if the document entity will not contain relative URIs.
The baseURI gives you all the information needed to resolve every relative
entity reference correctly. What's still missing is the name of the
document where parsing started, so that name will be absent from any
error message raised in the starting entity.

So I propose to have:
 public abstract void parse (String publicId, String systemId, InputStream inputStream)
instead of 
 public void parse (InputStream is, String baseURI)


-----------------------------------------------
 JUERGEN MODRE
 Reisdorf 6
 A-9371 Brueckl
 Austria (Europe)

 Phone:   ++43 4214 2320
 Mobile:  ++43 664 233 22 22
 E-mail:  jmodre@edu.uni-klu.ac.at
 WWW:     http://www.edu.uni-klu.ac.at/~jmodre
-----------------------------------------------



From gmckenzi at JetForm.com  Tue Feb 24 14:57:01 1998
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
Message-ID: <c=CA%a=_%p=JetForm%l=ROSSINI-980224144959Z-25614@rossini.jetform.com>


I like the idea of add/remove versus set.  In the Java case it meshes
nicely with other Java event mechanisms.  From a non-Java biased
perspective it does offer considerable extra flexibility in a simple
manner.

Though I don't have a strict requirement for it today, I'd vote for it.

Gavin.

>-----Original Message-----
>From:	David Megginson [SMTP:ak117@freenet.carleton.ca]
>Sent:	Tuesday, February 24, 1998 9:24 AM
>To:	xml-dev Mailing List
>Subject:	SAX: multiple handlers
>
>In a private message, one SAX user raised the issue again of multiple
>handlers.  The user suggested the situation where someone wants to
>extract information from a document _and_ copy the document to an
>OutputStream at the same time: for a clean implementation, each of
>these should be in a different handler.
>
>During the last round, most people vetoed this idea.  Here it is
>again, though, for your consideration:
>
>  package org.xml.sax;
>  import java.io.InputStream;
>
>  public interface Parser {
>
>    public void addEntityHandler (EntityHandler handler);
>    public void removeEntityHandler (EntityHandler handler);
>
>    public void addDocumentHandler (DocumentHandler handler);
>    public void removeDocumentHandler (DocumentHandler handler);
>
>    public void addErrorHandler (ErrorHandler handler);
>    public void removeErrorHandler (ErrorHandler handler);
>
>    public void parse (String publicId, String systemId)
>      throws java.lang.Exception;
>
>    public void parse (String publicId, String systemId,
>                       InputStream inputStream)
>      throws java.lang.Exception;
>
>  }
>
>Any further thoughts on this issue?
>
>
>All the best,
>
>
>David
>
>-- 
>David Megginson                 ak117@freenet.carleton.ca
>Microstar Software Ltd.         dmeggins@microstar.com
>      http://home.sprynet.com/sprynet/dmeggins/
>



From jmodre at edu.uni-klu.ac.at  Tue Feb 24 15:01:50 1998
From: jmodre at edu.uni-klu.ac.at (Juergen Modre)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <002401bd40e9$fde8c510$2ee044c6@donpark> <199802241346.IAA00395@unready.microstar.com>
Message-ID: <34F2EF37.8979C8DE@edu.uni-klu.ac.at>

> In an ideal world, we'd also have some kind of ability to ask to
> parser to turn validation on or off, but I'm not certain that that's
> practical: any thoughts?
I think that is practical and necessary.

One solution would be to have methods like:
 void setValidation(boolean validation)
 boolean getValidation()

These methods can be called before starting to parse with
the parse() method.


I also think a parse method that takes only a systemId would be
convenient (aimed at users who are fairly new to XML
and not very used to public identifiers).

public abstract void parse (String systemId)

This would also avoid having to call
entityHandler.resolveEntity() every time to resolve the entity.


-----------------------------------------------
 JUERGEN MODRE
 Reisdorf 6
 A-9371 Brueckl
 Austria (Europe)

 Phone:   ++43 4214 2320
 Mobile:  ++43 664 233 22 22
 E-mail:  jmodre@edu.uni-klu.ac.at
 WWW:     http://www.edu.uni-klu.ac.at/~jmodre
-----------------------------------------------



From M.H.Kay at eng.icl.co.uk  Tue Feb 24 15:03:06 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
Message-ID: <01bd4135$5b2893e0$1e09e391@mhklaptop.bra01.icl.co.uk>


>In a private message, one SAX user raised the issue again of multiple
>handlers
>Any further thoughts on this issue?
>
I've implemented a layer on top of SAX that provides not only multiple
handlers, but also per-element-type handlers. Since it is trivial to
implement this on top of SAX, I suggest it shouldn't go into SAX itself.

(The way you do multiple handlers is to write a class MultiHandler that
implements the DocumentHandler interface and accepts in its constructor two
DocumentHandlers; the methods then call these two in turn. Of course either
of them can itself be a MultiHandler).
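
A minimal sketch of the MultiHandler idea described above. The
DocumentHandler interface here is a cut-down stand-in with assumed method
signatures (the real SAX interface has more methods and different
parameters), and ElementCounter is a hypothetical handler that exists only
to show the fan-out:

```java
/** Cut-down stand-in for the SAX DocumentHandler (assumed signatures). */
interface DocumentHandler {
    void startElement(String name);
    void characters(char[] ch, int start, int length);
    void endElement(String name);
}

/** Delivers every event to two handlers in turn. */
class MultiHandler implements DocumentHandler {
    private final DocumentHandler first;
    private final DocumentHandler second;

    MultiHandler(DocumentHandler first, DocumentHandler second) {
        this.first = first;
        this.second = second;
    }

    public void startElement(String name) {
        first.startElement(name);
        second.startElement(name);
    }

    public void characters(char[] ch, int start, int length) {
        first.characters(ch, start, length);
        second.characters(ch, start, length);
    }

    public void endElement(String name) {
        first.endElement(name);
        second.endElement(name);
    }
}

/** Hypothetical handler for illustration: counts start tags. */
class ElementCounter implements DocumentHandler {
    int count = 0;

    public void startElement(String name) { count++; }
    public void characters(char[] ch, int start, int length) { }
    public void endElement(String name) { }
}
```

Three handlers can then be combined by nesting, for example
new MultiHandler(a, new MultiHandler(b, c)).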

Mike Kay




From tyler at infinet.com  Tue Feb 24 15:05:49 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <199802230313.WAA00386@unready.microstar.com> <34F2E818.1FC4A30B@edu.uni-klu.ac.at>
Message-ID: <34F2E22C.D06BCA45@infinet.com>

Juergen Modre wrote:

> David Megginson wrote:
> > After considering the various discussions over the past few weeks, I
> > propose that we make the following changes:
> >
> > 1) Add a parse() method that accepts a stream.
> Fully agree.
>
> > 2) Add a parse() method that accepts a character buffer.
> I have similar thoughts like James and therefore don't really see the need for it.
> For the case to parse parts from an larger document the char[] can always be
> converted to an InputStream to be used with 1).
> But maybe your intention goes into another direction.

One way to get around the char[] array problem is to sort of have a feeder mechanism in
which you continually feed the parser a set of bytes like in the case of an input stream
except that you explicitly turn the parser on before feeding that parser the data and
explicitly turn the parser off when you are done feeding it.

For example you could have methods that looked like this:

Parser.start();
Parser.parseBuffer(char[] c);
Parser.end();

Then you could just loop, feeding in a character array that you populate with the
document data, until you are finished.  This would of course be much more straightforward
with an input stream, but it would get around the problem of languages that have no
concept of input streams.

The biggest problem I see with this suggestion is that it will make parsers a bit
more difficult to implement, since you essentially have to freeze your parser's state
after each call to parseBuffer() finishes.
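
To make the "frozen state" point concrete, here is a toy sketch of the
start()/parseBuffer()/end() pattern. It is not a real XML parser: the class
FeedingTagCounter is hypothetical, it only counts start tags, and it ignores
comments and CDATA sections. The extra length parameter on parseBuffer() is
an assumption, added so that a partly filled buffer can be passed in:

```java
/** Toy push-style scanner that counts start tags across buffer boundaries. */
class FeedingTagCounter {
    private boolean started;
    // State that must survive between parseBuffer() calls: whether the
    // previous buffer ended right after a '<'.
    private boolean pendingLessThan;
    private int startTags;

    void start() {
        started = true;
        pendingLessThan = false;
        startTags = 0;
    }

    void parseBuffer(char[] c, int length) {
        if (!started) {
            throw new IllegalStateException("call start() first");
        }
        for (int i = 0; i < length; i++) {
            char ch = c[i];
            if (pendingLessThan) {
                // The character after '<' decides what kind of markup this is.
                if (ch != '/' && ch != '!' && ch != '?') {
                    startTags++;
                }
                pendingLessThan = false;
            } else if (ch == '<') {
                pendingLessThan = true;
            }
        }
    }

    /** Stops the scan and returns the number of start tags seen. */
    int end() {
        if (!started) {
            throw new IllegalStateException("call start() first");
        }
        started = false;
        return startTags;
    }
}
```

Even in this toy the single pendingLessThan flag has to be carried from one
call to the next; a full parser would have to preserve its entire position
in the grammar in the same way.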

Just a suggestion,

Tyler





From drewn at icomm.co.uk  Tue Feb 24 15:08:01 1998
From: drewn at icomm.co.uk (Nick Drew)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
Message-ID: <01BD4136.48D7E5F0@krusty.icomm.co.uk>

<..stuff deleted...>
During the last round, most people vetoed this idea.  Here it is
again, though, for your consideration:

  package org.xml.sax;
  import java.io.InputStream;

  public interface Parser {

    public void addEntityHandler (EntityHandler handler);
    public void removeEntityHandler (EntityHandler handler);

    public void addDocumentHandler (DocumentHandler handler);
    public void removeDocumentHandler (DocumentHandler handler);

    public void addErrorHandler (ErrorHandler handler);
    public void removeErrorHandler (ErrorHandler handler);

    public void parse (String publicId, String systemId)
      throws java.lang.Exception;

    public void parse (String publicId, String systemId,
                       InputStream inputStream)
      throws java.lang.Exception;

  }

Any further thoughts on this issue?


Apologies in advance: I'm quite new to the list, so I missed this discussion the first time around.

It seems that the above suggestion isn't essential.  Perhaps there should be a standardised MulticastEntityHandler, MulticastDocumentHandler, and MulticastErrorHandler, which can be used instead, e.g.

{
	...
	MulticastDocumentHandler mdocHandler = new MyMulticastDocumentHandler();
	mdocHandler.addHandler( new ExistingDocumentHandler() );
	mdocHandler.addHandler( new AnotherExistingDocumentHandler() );

	...
	iParser.setDocumentHandler( mdocHandler );
	...
}

and the MulticastDocumentHandler just delegates to its members as needed.


Nick Drew
icomm technologies ltd.




From ak117 at freenet.carleton.ca  Tue Feb 24 18:52:50 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
In-Reply-To: <01bd4135$5b2893e0$1e09e391@mhklaptop.bra01.icl.co.uk>
References: <01bd4135$5b2893e0$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <199802241851.NAA00358@unready.microstar.com>

Michael Kay writes:

 > >In a private message, one SAX user raised the issue again of multiple
 > >handlers
 > >Any further thoughts on this issue?
 > >
 > I've implemented a layer on top of SAX that provides not only multiple
 > handlers, but also per-element-type handlers. Since it is trivial to
 > implement this on top of SAX, I suggest it shouldn't go into SAX itself.

I had this same thought when I was walking my girls to school after
lunch.  Unlike a GUI, which spends most of its time waiting for the
user to do something interesting, an XML parser has to deal with
hundreds or thousands of events each second, and perhaps millions of
events in a hefty XML document.  

Upon reflection, I am becoming more inclined to agree with the
arguments that people made in the first round, that the overhead of
walking through a vector of handlers and delivering each event to each
one can be excessive.  Besides, as Michael rightly points out,
implementing a multi-listener interface on top of SAX is trivial if
you really need it.


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Tue Feb 24 19:11:29 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
In-Reply-To: <34F2EF37.8979C8DE@edu.uni-klu.ac.at>
References: <002401bd40e9$fde8c510$2ee044c6@donpark>
	<199802241346.IAA00395@unready.microstar.com>
	<34F2EF37.8979C8DE@edu.uni-klu.ac.at>
Message-ID: <199802241910.OAA00445@unready.microstar.com>

Juergen Modre writes:

 > > In an ideal world, we'd also have some kind of ability to ask to
 > > parser to turn validation on or off, but I'm not certain that that's
 > > practical: any thoughts?
 > I thinks that is practical and necessary.
 > 
 > One solution would be to have methods like:
 >  void setValidation(boolean validation)
 >  boolean getValidation()
 > 
 > These methods can be called before starting to parse with
 > the parse() method.

It's trickier than this -- for example, we'd probably have to create
an exception that is thrown if the underlying parser does not support
validation; furthermore, none of the parsers that I've looked at
supports a toggle like this, and we will be forcing another design
decision on them if we require this toggle.
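
A sketch of what such a toggle might look like if it were added. Every name
here is hypothetical (ValidationToggle, ValidationNotSupportedException,
WellFormednessOnlyParser); nothing like this is in the current SAX proposal:

```java
/** Hypothetical exception for a parser that cannot validate. */
class ValidationNotSupportedException extends Exception {
    ValidationNotSupportedException(String message) {
        super(message);
    }
}

/** Hypothetical optional interface carrying the proposed toggle. */
interface ValidationToggle {
    void setValidation(boolean validation)
        throws ValidationNotSupportedException;

    boolean getValidation();
}

/** A non-validating parser can only ever honour setValidation(false). */
class WellFormednessOnlyParser implements ValidationToggle {

    public void setValidation(boolean validation)
            throws ValidationNotSupportedException {
        if (validation) {
            throw new ValidationNotSupportedException(
                "this parser checks well-formedness only");
        }
    }

    public boolean getValidation() {
        return false;
    }
}
```

This shows the extra design decision mentioned above: every driver would
have to either implement the switch or reject it explicitly.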


 > I also think a parse method with an systemId only as parameter would be
 > convenient. (With targeting to users rather new to XML
 > and not very used to the publicId's).
 > 
 > public abstract void parse (String systemId)
 > 
 > This would also avoid the need to call every time
 > entityHandler.resolveEntity() to resolve the Entity.

It might be simpler, though I'm trying to keep the number of methods
to a minimum.  It wouldn't affect EntityHandler.resolveEntity(),
though, since that does not exist solely for the sake of handling
public identifiers.


Thanks, and all the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From gmckenzi at JetForm.com  Tue Feb 24 19:28:09 1998
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
Message-ID: <c=CA%a=_%p=JetForm%l=ROSSINI-980224192251Z-27616@rossini.jetform.com>


David,

Something just occurred to me...and maybe it's too late, but I thought
I'd mention it...

With SAX there is an assumption that the whole file will be parsed.  I'm
stuck if I'm parsing a 1 gigabyte file that contains 50,000
<TRANSACTION> elements (representing transactions of data), and I only
want the first transaction.

Would it be possible to have a mechanism that could pause/resume/terminate a
parse?  Maybe a callback that returns either a 'continue', 'pause' or
'terminate' status value, and a resumeParse() method?  Or a method that
I can call from within the callback to pause the parsing.

I know that I could throw an exception from within one of my callbacks,
which will halt the parse...but it would be valuable to be able to 'pause'
and 'resume'.
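
For reference, the exception workaround might look roughly like this. It is
a sketch under assumptions: the DocumentHandler interface is a cut-down
stand-in, fakeParse() merely simulates a parser firing events, and
StopParsingException, FirstTransactionHandler and EarlyExitDemo are
hypothetical names:

```java
/** Cut-down stand-in for the SAX DocumentHandler (assumed signatures). */
interface DocumentHandler {
    void startElement(String name) throws Exception;
    void endElement(String name) throws Exception;
}

/** Hypothetical marker exception meaning "stop, I have what I need". */
class StopParsingException extends Exception {
    StopParsingException(String message) {
        super(message);
    }
}

/** Aborts the parse as soon as the first TRANSACTION element ends. */
class FirstTransactionHandler implements DocumentHandler {
    int transactionsSeen = 0;

    public void startElement(String name) { }

    public void endElement(String name) throws StopParsingException {
        if ("TRANSACTION".equals(name)) {
            transactionsSeen++;
            throw new StopParsingException("first transaction complete");
        }
    }
}

class EarlyExitDemo {

    /** Stand-in for Parser.parse(): fires one start/end pair per name. */
    static void fakeParse(String[] elementNames, DocumentHandler handler)
            throws Exception {
        for (String name : elementNames) {
            handler.startElement(name);
            handler.endElement(name);
        }
    }

    /** Returns how many transactions were seen before parsing stopped. */
    static int readFirstTransaction(String[] elementNames) throws Exception {
        FirstTransactionHandler handler = new FirstTransactionHandler();
        try {
            fakeParse(elementNames, handler);
        } catch (StopParsingException expected) {
            // Deliberate early exit, not an error.
        }
        return handler.transactionsSeen;
    }
}
```

This only terminates the parse. Resuming afterwards would still need parser
support, because the parser's position is lost once the exception unwinds.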

Gavin.



From tyler at infinet.com  Tue Feb 24 20:09:40 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
References: <01bd4135$5b2893e0$1e09e391@mhklaptop.bra01.icl.co.uk> <199802241851.NAA00358@unready.microstar.com>
Message-ID: <34F3295B.26C6F728@infinet.com>

David Megginson wrote:

> Michael Kay writes:
>
>  > >In a private message, one SAX user raised the issue again of multiple
>  > >handlers
>  > >Any further thoughts on this issue?
>  > >
>  > I've implemented a layer on top of SAX that provides not only multiple
>  > handlers, but also per-element-type handlers. Since it is trivial to
>  > implement this on top of SAX, I suggest it shouldn't go into SAX itself.
>
> I had this same thought when I was walking my girls to school after
> lunch.  Unlike a GUI, which spends most of its time waiting for the
> user to do something interesting, an XML parser has to deal with
> hundreds or thousands of events each second, and perhaps millions of
> events in a hefty XML document.
>
> Upon reflection, I am becoming more inclined to agree with the
> arguments that people made in the first round, that the overhead of
> walking through a vector of handlers and delivering each event to each
> one can be excessive.  Besides, as Michael rightly points out,
> implementing a multi-listener interface on top of SAX is trivial if
> you really need it.

You don't actually need to use a Vector; you could instead use an array, or
just a single object if the Vector has length one.  You may initially use a
Vector to store the handlers, but when you are about to parse you could just
turn this into an array of handlers or else just a single handler.  There are a
lot of ways to go about this, so any performance loss would be a function of how
many handlers you are using.  Nevertheless, SAX could just have a standard
MulticastHandler implementation that dispatches events to multiple handlers.  I
think it would be useful to include in the Java SAX distribution a generic class
to do this sort of thing.
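
One way this could be sketched, under assumptions: the DocumentHandler
interface is a cut-down stand-in, forParsing() and CountingHandler are
hypothetical names, and a modern ArrayList with generics stands in for the
Vector mentioned above:

```java
import java.util.ArrayList;
import java.util.List;

/** Cut-down stand-in for the SAX DocumentHandler (assumed signatures). */
interface DocumentHandler {
    void startElement(String name);
    void endElement(String name);
}

/** Keeps a growable list for registration and a plain array for dispatch. */
class MulticastDocumentHandler implements DocumentHandler {
    private final List<DocumentHandler> registered = new ArrayList<>();
    private DocumentHandler[] targets = new DocumentHandler[0];

    void addHandler(DocumentHandler handler) {
        registered.add(handler);
        targets = registered.toArray(new DocumentHandler[0]);
    }

    void removeHandler(DocumentHandler handler) {
        registered.remove(handler);
        targets = registered.toArray(new DocumentHandler[0]);
    }

    /** With exactly one handler, skip the multicast layer altogether. */
    DocumentHandler forParsing() {
        return targets.length == 1 ? targets[0] : this;
    }

    public void startElement(String name) {
        for (DocumentHandler target : targets) {
            target.startElement(name);
        }
    }

    public void endElement(String name) {
        for (DocumentHandler target : targets) {
            target.endElement(name);
        }
    }
}

/** Hypothetical handler for illustration: counts start tags. */
class CountingHandler implements DocumentHandler {
    int starts = 0;

    public void startElement(String name) { starts++; }
    public void endElement(String name) { }
}
```

The array is rebuilt only when handlers are added or removed, so the
per-event cost is a plain array walk, and a single registered handler is
handed to the parser directly.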

Tyler




From wilfr at mail.bc.rogers.wave.ca  Tue Feb 24 21:55:32 1998
From: wilfr at mail.bc.rogers.wave.ca (Wilf Reedijk)
Date: Mon Jun  7 17:00:12 2004
Subject: Modifying DTD using msxml
Message-ID: <34F34236.7FD472ED@rogers.wave.ca>

I would like to update the (internal) DTD for a document using msxml.

I am converting the DTD to a schema using the dtd.getSchema() method.

I then modify the elements within the schema using addChild etc.

My question is: how do I convert this schema back to the DOM so that it is
saved when the document is saved?


Thanks
Wilf Reedijk
wilfr@rogers.wave.ca




From clovett at microsoft.com  Tue Feb 24 21:58:55 1998
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun  7 17:00:12 2004
Subject: Modifying DTD using msxml
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C01906CAA@red-msg-56.dns.microsoft.com>

I assume you want to convert it back to the DTD syntax - you will have to do
this yourself.  MSXML doesn't have this feature yet.

> -----Original Message-----
> From:	Wilf Reedijk [SMTP:wilfr@mail.bc.rogers.wave.ca]
> Sent:	Tuesday, February 24, 1998 1:57 PM
> To:	xmldev
> Subject:	Modifying DTD using msxml
> 
> I would like to update the (internal) DTD for a document using msxml.
> 
> I am converting the DTD to a schema using the dtd.getSchema() method
> 
> I then modify the elements within the schema using addChild etc.
> 
> My question is: How do convert this schema back to the DOM so that it is
> saved when the document is saved.
> 
> 
> Thanks
> Wilf Reedijk
> wilfr@rogers.wave.ca
> 
> 



From jmodre at edu.uni-klu.ac.at  Tue Feb 24 22:36:01 1998
From: jmodre at edu.uni-klu.ac.at (Juergen Modre)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <002401bd40e9$fde8c510$2ee044c6@donpark>
		<199802241346.IAA00395@unready.microstar.com>
		<34F2EF37.8979C8DE@edu.uni-klu.ac.at> <199802241910.OAA00445@unready.microstar.com>
Message-ID: <34F359AB.99DB806D@edu.uni-klu.ac.at>

David Megginson wrote:
> 
> Juergen Modre writes:
> 
>  > > In an ideal world, we'd also have some kind of ability to ask to
>  > > parser to turn validation on or off, but I'm not certain that that's
>  > > practical: any thoughts?
>  > I thinks that is practical and necessary.
>  >
>  > One solution would be to have methods like:
>  >  void setValidation(boolean validation)
>  >  boolean getValidation()
>  >
>  > These methods can be called before starting to parse with
>  > the parse() method.
> 
> It's trickier than this -- for example, we'd probably have to create
> an exception that is thrown if the underlying parser does not support
> validation;
Correct. My example was just a first naive try.

> furthermore, none of the parsers that I've looked at
> supports a toggle like this, and we will be forcing another design
> decision on them if we require this toggle.
There are already XML parsers allowing this toggle.
 For instance DXP has this capability.

I think it would be good to have methods that allow setting
a parser to well-formedness or validation mode.

>  > I also think a parse method with an systemId only as parameter would be
>  > convenient. (With targeting to users rather new to XML
>  > and not very used to the publicId's).
>  >
>  > public abstract void parse (String systemId)
>  >
>  > This would also avoid the need to call every time
>  > entityHandler.resolveEntity() to resolve the Entity.
> 
> It might be simpler, though I'm trying to keep the number of methods
> to a minimum.
Okay.

All the best
 Juergen

-----------------------------------------------
 JUERGEN MODRE
 Reisdorf 6
 A-9371 Brueckl
 Austria (Europe)

 Phone:   ++43 4214 2320
 Mobile:  ++43 664 233 22 22
 E-mail:  jmodre@edu.uni-klu.ac.at
 WWW:     http://www.edu.uni-klu.ac.at/~jmodre
-----------------------------------------------



From b.laforge at opengroup.org  Tue Feb 24 23:01:43 1998
From: b.laforge at opengroup.org (Bill la Forge)
Date: Mon Jun  7 17:00:12 2004
Subject: axtp zip available
Message-ID: <3.0.32.19980224180619.00922bf0@postman.osf.org>

I've had several requests to create a zip file for axtp.
I've done so. See http://www.camb.opengroup.org/~laforge/axtp/#related_links

(I've also cleaned up the relationship between the parsed xml object tree
and the application peer objects.)

And yes, I'm only using a subset of xml. But I think packet size is a big
issue here.

This has become strictly a spare-time project, and I still need to develop
the client and server APIs before it can live up to the "easy to use" claim.
Perhaps this weekend...

Meanwhile, please keep those comments coming. 

b)



From peter at ursus.demon.co.uk  Tue Feb 24 23:30:41 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
In-Reply-To: <199802241851.NAA00358@unready.microstar.com>
References: <01bd4135$5b2893e0$1e09e391@mhklaptop.bra01.icl.co.uk>
 <01bd4135$5b2893e0$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <3.0.1.16.19980224215546.35877758@pop3.demon.co.uk>

At 13:51 24/02/98 -0500, David Megginson wrote:
>  Besides, as Michael rightly points out,
>implementing a multi-listener interface on top of SAX is trivial if
>you really need it.
>
As it's trivial, it would be a great help if a specimen were included in
SAX that those of us who are per-element people could use. Seriously, I'm
not quite sure what it would look like but I am sure I would recognise it
when I saw it :-)

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From donpark at quake.net  Tue Feb 24 23:51:01 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
Message-ID: <000d01bd417e$521bcbc0$2ee044c6@donpark>

>As it's trivial, it would be a great help if a specimen were included in
>SAX that those of us who are per-element people could use. Seriously, I'm
>not quite sure what it would look like but I am sure I would recognise it
>when I saw it :-)

This brings up an issue I have wanted to raise for a while:

"Should we add helper classes to SAX?"

HandlerBase sort of qualifies as a helper class, but I think SAX should have
a lot more helper classes to help out SAX programmers.  For example, a
'pass-through' DocumentHandler that filters out whitespace would be a great
help.  An abstract implementation of DocumentHandler that maintains a
stack of ancestor elements would also be nice.  So would a special
trigger-like DocumentHandler that fires on specified patterns (i.e.
XSL-rule-like patterns).
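
As an example of the second helper, an ancestor-tracking base class might be
sketched as follows. The DocumentHandler interface is a cut-down stand-in
with assumed signatures, and AncestorTrackingHandler, ParentRecorder,
onStart(), onEnd(), parent() and depth() are all hypothetical names:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Cut-down stand-in for the SAX DocumentHandler (assumed signatures). */
interface DocumentHandler {
    void startElement(String name);
    void endElement(String name);
}

/** Maintains the stack of currently open elements for its subclasses. */
abstract class AncestorTrackingHandler implements DocumentHandler {
    private final Deque<String> open = new ArrayDeque<>();

    public final void startElement(String name) {
        onStart(name);      // the stack holds only the ancestors here
        open.push(name);
    }

    public final void endElement(String name) {
        open.pop();         // back to the ancestors of 'name'
        onEnd(name);
    }

    /** Name of the enclosing element, or null at the document root. */
    protected String parent() {
        return open.peek();
    }

    /** Number of ancestors of the element being reported. */
    protected int depth() {
        return open.size();
    }

    protected abstract void onStart(String name);

    protected void onEnd(String name) { }
}

/** Hypothetical subclass: logs "depth:parent>name" for each start tag. */
class ParentRecorder extends AncestorTrackingHandler {
    final StringBuilder log = new StringBuilder();

    protected void onStart(String name) {
        log.append(depth()).append(':').append(parent())
           .append('>').append(name).append(' ');
    }
}
```

A subclass then sees its context through parent() and depth() without
keeping any bookkeeping of its own.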

I think we have four choices at this point:

1. Leave SAX alone!
2. Add some but as little as possible.
3. Go nuts and let SAX bloat as the months go by.
4. Start an EZ-SAX package (sorry, I couldn't help it; David picked a name
ready-made for puns) to complement SAX.

Personally, I am all for EZ-SAX ;-p.

Regards,

Don Park
http://www.quake.net/~donpark/index.html




From elm at arbortext.com  Wed Feb 25 00:30:26 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun  7 17:00:12 2004
Subject: The XML spec in XML: missing tags
In-Reply-To: <98Feb23.120356est.18826@thicket.arbortext.com>
Message-ID: <3.0.5.32.19980224192743.00a0d220@village.doctools.com>

As the maintainer of the specification DTD, let me say thanks for your
comments.

At 11:49 AM 2/23/98 -0500, Michael Kay wrote:
...
>Some comments on the XML tagging in the BNF rules:
>- it is useful to have the non-terminals tagged, though the way in which it
>done is a little clumsy, since the internal identifier and the visible name
>of the non-terminal are necessarily in a one-to-one correspondence. The way
>it is done seems designed primarily to enable a particular translation to
>HTML.

Are you saying that it's clumsy because the element content is duplicated
in the attribute value?  Since the XML is transformed into HTML, it would
actually have been easier to let the content serve as the address (and be
stuffed into both the final <a> element content and its href attribute,
with "#" and "-nt" tacked on).  Alternatively, the element could have been
empty, and its attribute value both used as an address and rendered (with
some transformation that probably isn't worth doing...).  Either way,
nothing would be duplicated in the source.  However, it would make me a
little uncomfortable treating the same string as having two functions.

>- it is a shame that there is no tagging to distinguish terminal symbols
>from metasymbols, since this would enable nicer renditions of the rules,
>e.g. exploiting colour, without having to parse the BNF

I'll take this up with the other editors using the DTD.

>- it would seem more logical for each rule to have a single <rhs>, with any
><vc> and <wfc> constraints being embedded within the <rhs>, rather than
>these being separate elements interspersed among multiple <rhs> elements.

We had a lengthy discussion of whether our production markup should be more
semantic and less presentational.  It's so much work to make the markup
simulate the EBNF and to make the filters handle this, that we decided not
to go further in that direction.  I do agree that the production markup is
less than "pure" in this area.

	Eve



From ak117 at freenet.carleton.ca  Wed Feb 25 00:56:20 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: multiple handlers
In-Reply-To: <000d01bd417e$521bcbc0$2ee044c6@donpark>
References: <000d01bd417e$521bcbc0$2ee044c6@donpark>
Message-ID: <199802250054.TAA00347@unready.microstar.com>

Don Park writes:

 > I think we have four choices at this point:
 > 
 > 1. Leave SAX alone!
 > 2. Add some but as little as possible.
 > 3. Go nuts and let SAX bloat as the months go by.
 > 4. Start EZ-SAX (sorry, I couldn't help it.  David picked a name ready-made
 > for puns) package to complement SAX.
 > 
 > Personally, I am all for EZ-SAX ;-p.

I think that it will be a wonderful idea for people to implement
higher-level, programmer-friendly stuff on top of SAX.  Exactly what
_is_ programmer friendly will depend on the programming language, so I
agree that the helper classes should stay out of the SAX core, but I
encourage any efforts to make SAX programmers' lives easier (as Don
has done with SAXDOM).


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Wed Feb 25 01:28:02 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: org.xml.sax.AttributeMap
Message-ID: <199802250126.UAA00473@unready.microstar.com>

We may as well take up the most difficult interface next, to get it
over with.  Here's what we have right now for attributes, which are by
far the most vexed problem in SAX:

  package org.xml.sax;

  import java.util.Enumeration;

  public interface AttributeMap {

    public Enumeration getAttributeNames ();
    public String getValue (String attributeName);

    public boolean isEntity (String attributeName);
    public boolean isNotation (String attributeName);
    public boolean isId (String attributeName);
    public boolean isIdref (String attributeName);

    public String getEntityPublicID (String attributeName);
    public String getEntitySystemID (String attributeName);
    public String getNotationName (String attributeName);
    public String getNotationPublicID (String attributeName);
    public String getNotationSystemID (String attributeName);

  }

BOY, DO I WANT TO CHANGE THIS ONE.  James has made some good
suggestions about how to make this simpler and more efficient by
working from list indexes (it also avoids the need to allocate an
Enumeration).  Here's what I want to change:

1. Rename the interface to org.xml.sax.AttributeList to reflect the
   new approach.

2. Add a method to return the length of the list.

3. Look up attribute information based on integer indices rather than
   string values.

4. Eliminate the is*() methods, and add a single method to return the
   attribute's type as a string instead.

5. Rename getNotationName() to getEntityNotationName() to make its
   role clearer.

With these changes, we end up with the following, somewhat simpler
interface:

  package org.xml.sax;

  public interface AttributeList {

    public abstract int getLength ();
    public abstract int getName (int index);
    public abstract int getValue (int index);
    public abstract String getType (int index);

    public abstract String getEntityNotationName (int index);
    public abstract String getEntityPublicId (int index);
    public abstract String getEntitySystemId (int index);
    public abstract String getNotationPublicId (int index);
    public abstract String getNotationSystemId (int index);

  }

The first four methods are actually very nice now (thanks, James, for
the suggestion).  As specified in the XML REC, getType() will return
"CDATA" if there is no explicit declaration, and it will return the
declared attribute type otherwise.  There's also no further dependency
on the Java-specific Enumeration class, so C++ programmers can sigh a
sigh of relief.

The last five methods are much more of a problem, and I'm still
agonizing over what to do.  Why do we have binary entities in XML at
all?  Is anyone going to use them, or will everything be done with
href's?

Attributes are the _only_ way to get at binary entities in XML, so if
I don't provide some way to get access to them here, then SAX parsers
and applications make it impossible to use binary (NDATA) entities at
all.  I am very reluctant to create a new class or interface just for
entities (and yet another for notations), when other types of objects
do not have their own classes, and I certainly don't want to re-invent
(or pre-invent) the DOM.  


HELP!!!


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Wed Feb 25 01:39:56 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:12 2004
Subject: SAX: org.xml.sax.AttributeMap
In-Reply-To: <199802250126.UAA00473@unready.microstar.com>
References: <199802250126.UAA00473@unready.microstar.com>
Message-ID: <199802250138.UAA00524@unready.microstar.com>

David Megginson writes:

 >   package org.xml.sax;
 > 
 >   public interface AttributeList {
 > 
 >     public abstract int getLength ();
 >     public abstract int getName (int index);
 >     public abstract int getValue (int index);
 >     public abstract String getType (int index);
 > 
 >     public abstract String getEntityNotationName (int index);
 >     public abstract String getEntityPublicId (int index);
 >     public abstract String getEntitySystemId (int index);
 >     public abstract String getNotationPublicId (int index);
 >     public abstract String getNotationSystemId (int index);
 > 
 >   }

For any of you who are wondering when attribute names and values
became integers, the above should have been

  package org.xml.sax;

  public interface AttributeList {

    public abstract int getLength ();
    public abstract String getName (int index);
    public abstract String getValue (int index);
    public abstract String getType (int index);

    public abstract String getEntityNotationName (int index);
    public abstract String getEntityPublicId (int index);
    public abstract String getEntitySystemId (int index);
    public abstract String getNotationPublicId (int index);
    public abstract String getNotationSystemId (int index);

  }
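
A minimal sketch of how an application might walk such a list by index.
It is not part of the proposal: MiniAttributeList is a hypothetical
cut-down interface holding only the first four methods, and
ArrayAttributeList is an invented array-backed implementation used just
to make the example runnable.

```java
// Hypothetical, cut-down version of the proposed interface
// (first four methods only).
interface MiniAttributeList {
    int getLength();
    String getName(int index);
    String getValue(int index);
    String getType(int index);
}

// Invented array-backed implementation, for illustration only.
class ArrayAttributeList implements MiniAttributeList {
    private final String[] names;
    private final String[] values;
    private final String[] types;

    ArrayAttributeList(String[] names, String[] values, String[] types) {
        this.names = names;
        this.values = values;
        this.types = types;
    }

    public int getLength() { return names.length; }
    public String getName(int index) { return names[index]; }
    public String getValue(int index) { return values[index]; }
    public String getType(int index) { return types[index]; }
}

class AttributeListDemo {
    // Walks the list by index; no Enumeration is allocated.
    static String describe(MiniAttributeList atts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < atts.getLength(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(atts.getName(i)).append("=\"").append(atts.getValue(i))
              .append("\" (").append(atts.getType(i)).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        MiniAttributeList atts = new ArrayAttributeList(
            new String[] {"id", "ref"},
            new String[] {"p1", "foo"},
            new String[] {"ID", "ENTITY"});
        System.out.println(describe(atts));
    }
}
```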


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From donpark at quake.net  Wed Feb 25 02:00:39 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:13 2004
Subject: org.xml.sax.AttributeMap
Message-ID: <001301bd4190$726aa650$2ee044c6@donpark>

David,

>The last five methods are much more of a problem, and I'm still
>agonizing over what to do.  Why do we have binary entities in XML at
>all?  Is anyone going to use them, or will everything be done with
>href's?
>
>Attributes are the _only_ way to get at binary entities in XML, so if
>I don't provide some way to get access to them here, then SAX parsers
>and applications make it impossible to use binary (NDATA) entities at
>all.  I am very reluctant to create a new class or interface just for
>entities (and yet another for notations), when other types of objects
>do not have their own classes, and I certainly don't want to re-invent
>(or pre-invent) the DOM.

How about replacing the five with the following method and three constants?

public static final int NAME = 0;
public static final int PUBLIC_ID = 1;
public static final int SYSTEM_ID = 2;

public abstract String[] getDataInfo (int index);

Since AttributeList is valid only within the startElement method, you can reuse
a single string array rather than allocate a new one per getDataInfo call.
If the method returns null, then the attribute has no info.

If you haven't guessed by now, the constants above are used to index into
the returned array.  Implementations should take steps to make sure the size
of the returned array is 3 and stuff null for NAME if it is not a notation.

Does this help?

Don Park
http://www.quake.net/~donpark/index.html

>
>
>HELP!!!
>
>
>David
>
>--
>David Megginson                 ak117@freenet.carleton.ca
>Microstar Software Ltd.         dmeggins@microstar.com
>      http://home.sprynet.com/sprynet/dmeggins/
>




From antony at n-space.com.au  Wed Feb 25 02:24:42 1998
From: antony at n-space.com.au (Antony Blakey)
Date: Mon Jun  7 17:00:13 2004
Subject: org.xml.sax.AttributeMap
References: <001301bd4190$726aa650$2ee044c6@donpark>
Message-ID: <34F37FDC.D945B484@n-space.com.au>

Don Park wrote:
> How about replacing the five with the following method and three constants?
> 
> public static final int NAME = 0;
> public static final int PUBLIC_ID = 1;
> public static final int SYSTEM_ID = 2;
> 
> public abstract String[] getDataInfo (int index);
> 
> Since AttributeList is valid only within the startElement method, you can reuse
> a single string array rather than allocate a new one per getDataInfo call.
> If the method returns null, then the attribute has no info.
> 
> If you haven't guessed by now, the constants above are used to index into
> the returned array.  Implementations should take steps to make sure the size
> of the returned array is 3 and stuff null for NAME if it is not a notation.

Why would you not simply return a strongly typed data item (ignoring the
names)?

public abstract DataInfo getDataInfo(int index);

public interface DataInfo {
  public String getName();
  public String getPublicID();
  public String getSystemID();
}

As far as reuse of values is concerned, however, I think this is a very
bad idea. startElement defines a new context, so reusing the parameters
to that call is workable, but reusing the result of the getDataInfo
call is a different kettle of fish. It would be better (if you are so
concerned) to keep a pool of result objects, so that none is reused
within the context of a single startElement call. This may seem like
more work on the part of the parser implementor, but you shouldn't push
this complexity onto the users of the parser when you can safely hide it
within the parser. The parser writer can make the effort for
efficiency's sake.
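
The aliasing problem with a reused result can be shown with a small
sketch. Everything here is hypothetical: ReusingDataInfoSource stands in
for a parser that hands back one shared String[] from every getDataInfo
call, and the entity data is made up.

```java
// Hypothetical parser-side holder that reuses one String[] for every call,
// as in the array-returning suggestion quoted above.
class ReusingDataInfoSource {
    static final int NAME = 0;
    static final int PUBLIC_ID = 1;
    static final int SYSTEM_ID = 2;

    private final String[][] table;                 // made-up per-attribute data
    private final String[] shared = new String[3];  // the single reused array

    ReusingDataInfoSource(String[][] table) {
        this.table = table;
    }

    String[] getDataInfo(int index) {
        if (table[index] == null) return null;      // attribute has no info
        System.arraycopy(table[index], 0, shared, 0, 3);
        return shared;
    }
}

class ReuseHazardDemo {
    public static void main(String[] args) {
        ReusingDataInfoSource src = new ReusingDataInfoSource(new String[][] {
            {"gif", null, "a.gif"},
            {"png", null, "b.png"}
        });
        String[] first = src.getDataInfo(0);
        String[] second = src.getDataInfo(1);
        // Both references point at the same array, so the data fetched for
        // attribute 0 has been overwritten by the call for attribute 1.
        System.out.println(first == second);
        System.out.println(first[ReusingDataInfoSource.SYSTEM_ID]);
    }
}
```

A caller that wants to compare two attributes inside one startElement
call would have to copy the first result before asking for the second,
which is the complexity the paragraph above argues should stay inside
the parser.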

+----------------------------------+
|          Antony Blakey           |
|         N-Space Pty Ltd          |
|    Java - CORBA - SGML - XML     |
|   mailto:antony@n-space.com.au   |
|     http://www.n-space.com.au    |
+----------------------------------+



From jjc at jclark.com  Wed Feb 25 02:46:47 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: org.xml.sax.AttributeMap
References: <199802250126.UAA00473@unready.microstar.com> <199802250138.UAA00524@unready.microstar.com>
Message-ID: <34F38506.3D19B68A@jclark.com>

>   package org.xml.sax;
> 
>   public interface AttributeList {
> 
>     public abstract int getLength ();
>     public abstract String getName (int index);
>     public abstract String getValue (int index);

I think it's also desirable to provide a method to access attribute
values by name.  Some applications only want to access attribute values
this way, and it's inconvenient and inefficient for the application to
have to iterate over all the names itself.
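
One possible shape for a by-name accessor, as a sketch only. The class
and method names are invented, and the attributes are held in parallel
arrays purely for illustration; a real parser could use a faster lookup
internally.

```java
class AttributeLookup {
    // Linear scan over parallel name/value arrays.
    // Returns null when no attribute has the given name.
    static String getValue(String[] names, String[] values, String name) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(name)) {
                return values[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] names = {"id", "ref"};
        String[] values = {"p1", "foo"};
        System.out.println(getValue(names, values, "ref"));
    }
}
```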

>     public abstract String getType (int index);

I like this.

>     public abstract String getEntityNotationName (int index);
>     public abstract String getEntityPublicId (int index);
>     public abstract String getEntitySystemId (int index);
>     public abstract String getNotationPublicId (int index);
>     public abstract String getNotationSystemId (int index);

I agree that SAX ought to provide access to unparsed entities but I
don't think this is the right way to achieve it.  For a start, I can
have an ENTITIES attribute, so all these methods would need two
arguments (the index of the attribute in the attribute list, and the
index of the token in the value).

Another problem is that it is common to declare unparsed entities in the
internal subset, but to declare attribute types in an external DTD, eg

<!DOCTYPE doc SYSTEM "doc.dtd" [
<!ENTITY foo SYSTEM "foo.pic" NDATA gif>
]>
<doc><picture ref="foo"/></doc>

where doc.dtd contains

<!ATTLIST picture ref ENTITY #IMPLIED>

Now if I parse this without processing the external DTD, the SAX
interface as I understand it won't allow me to get at the system and
public id for foo, although an application might well intrinsically know
that ref is an ENTITY attribute.

I think a better approach is for the processor at the end of the prolog
to pass an object to the application that provides information about all
the declared notations and unparsed entities.

XP has a DTD object that does this, but it might be better to call it
something else (like UnparsedEntitySet) since SAX might someday be
extended to provide full DTD access.
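
A rough sketch of what such an object could look like, reusing the
doc.dtd example above. The UnparsedEntitySet name comes from the
paragraph above, but its methods, the UnparsedEntity record and the
demo class are all assumptions made for illustration; none of this is
XP's or SAX's API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical record of one unparsed (NDATA) entity declaration.
class UnparsedEntity {
    final String publicId;      // null when no public identifier was given
    final String systemId;
    final String notationName;

    UnparsedEntity(String publicId, String systemId, String notationName) {
        this.publicId = publicId;
        this.systemId = systemId;
        this.notationName = notationName;
    }
}

// Hypothetical container handed to the application at the end of the prolog.
class UnparsedEntitySet {
    private final Map<String, UnparsedEntity> entities =
        new HashMap<String, UnparsedEntity>();

    // Called by the processor for each unparsed entity declaration it sees.
    void declare(String name, UnparsedEntity entity) {
        entities.put(name, entity);
    }

    // Returns null if no unparsed entity of that name was declared.
    UnparsedEntity lookup(String name) {
        return entities.get(name);
    }
}

class UnparsedEntityDemo {
    public static void main(String[] args) {
        UnparsedEntitySet set = new UnparsedEntitySet();
        // Mirrors <!ENTITY foo SYSTEM "foo.pic" NDATA gif> from the example.
        set.declare("foo", new UnparsedEntity(null, "foo.pic", "gif"));

        // The application knows that picture/@ref names an entity, so it can
        // look the value up even if doc.dtd was never read.
        UnparsedEntity e = set.lookup("foo");
        System.out.println(e.systemId + " (" + e.notationName + ")");
    }
}
```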

Note that if you provide access to the system ID, you have to deal with
the issue of relative URLs.  Either the processor has to resolve a
relative URL into an absolute URL before passing to the application, or
it has to make available a base URL to the application.
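
For the first option, resolving in the processor, a minimal sketch using
java.net.URI follows. The class name and the example.org addresses are
made up; only the URI.create and URI.resolve calls are standard library.

```java
import java.net.URI;

class SystemIdResolver {
    // Resolves a possibly relative system identifier against the URI of the
    // entity in which it was declared. An absolute identifier is returned
    // unchanged.
    static String resolve(String baseUri, String systemId) {
        return URI.create(baseUri).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        System.out.println(
            resolve("http://example.org/docs/doc.xml", "foo.pic"));
    }
}
```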

Here's what XP's DTD interface looks like (it's a little fancier than
what I think is needed for SAX in that it provides access to all
general entities not just unparsed ones):

package com.jclark.xml.parse;

import java.util.Enumeration;
import java.net.URL;

/**
 * Information about a DTD.
 * @version $Revision: 1.4 $ $Date: 1998/02/17 04:20:20 $
 */
public interface DTD {
  /**
   * Returns an enumeration over the names of general entities declared
   * in the DTD.
   */
  Enumeration entityNames();
  /**
   * Returns an enumeration over the names of notations declared in
   * the DTD.
   */
  Enumeration notationNames();
  /**
   * Returns the system identifier for a notation.
   * Returns null if the notation was not declared or no system
   * identifier was specified.
   * A relative URL is not automatically resolved into an absolute URL;
   * <code>getNotationBase</code> can be used to do this.
   *
   * @see #getNotationBase
   */
  String getNotationSystemId(String notationName);
  /**
   * Returns the public identifier for a notation.
   * Returns null if the notation was not declared or no public
   * identifier was specified.
   */
  String getNotationPublicId(String notationName);
  /**
   * Returns the URL of the entity in which the notation was declared.
   * Returns null if the notation was not declared or the URL of the
   * declaring entity is not available.
   */
  URL getNotationBase(String notationName);
  /**
   * Returns the replacement text of the specified general entity.
   * Returns null if the entity was not declared or was declared
   * as an external entity.
   */
  String getEntityReplacementText(String entityName);
  /**
   * Returns the system identifier for a general entity.
   * Returns null if the entity was not declared or is an internal
   * entity.
   * A relative URL is not automatically resolved into an absolute URL;
   * <code>getEntityBase</code> can be used to do this.
   *
   * @see #getEntityBase
   */
  String getEntitySystemId(String entityName);
  /**
   * Returns the public identifier for a general entity.
   * Returns null if the entity was not declared or no public identifier
   * was specified.
   */
  String getEntityPublicId(String entityName);
  /**
   * Returns the name of the notation of an unparsed general entity.
   * Returns null if the entity was not declared or was a parsed entity.
   */
  String getEntityNotationName(String entityName);
  /**
   * Returns the URL of the entity in which the general entity was
   * declared.
   * Returns null if the entity was not declared or the URL of the
   * declaring entity is not available.
   */
  URL getEntityBase(String entityName);
  /**
   * Returns true if an element type was declared to have element
   * content.
   */
  boolean getElementTypeElementContent(String elementTypeName);
  /**
   * Returns true if the complete DTD was processed.
   */
  boolean isComplete();
  /**
   * Returns true if <code>standalone="yes"</code> was specified in the
   * XML declaration.
   */
  boolean isStandalone();
}

James




From jjc at jclark.com  Wed Feb 25 03:04:23 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <199802230313.WAA00386@unready.microstar.com>
Message-ID: <34F38943.551F05AB@jclark.com>

>    public void parse (InputStream is, String baseURI)
>      throws java.lang.Exception;

XML allows the encoding of an entity to be specified by an external
transport protocol (see 4.3.3): for example, when an XML document
arrives over HTTP with a content type of text/xml, then the encoding
specified in the charset parameter is supposed to take precedence over
that specified in the document entity by the encoding declaration or by
XML's default rules.  So I think we need an additional argument here: a
String specifying the name of the encoding to be used for the
InputStream, or null if the encoding specified in the document entity
should be used.
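
A sketch of the precedence rule such an argument would express. The
class and method names are invented. One loud simplification: when no
external encoding is supplied, this sketch just falls back to UTF-8,
whereas a real parser would inspect the document entity (byte order
mark, encoding declaration) at that point.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

class EncodingChoice {
    // An externally supplied encoding (e.g. the HTTP charset parameter) wins.
    // ASSUMPTION: with no external encoding we default to UTF-8 instead of
    // sniffing the document entity, which a real parser would have to do.
    static Reader open(InputStream is, String externalEncoding)
            throws IOException {
        String encoding = (externalEncoding != null) ? externalEncoding : "UTF-8";
        return new InputStreamReader(is, encoding);
    }

    // Convenience wrapper for the demo: decodes a whole byte array.
    static String decode(byte[] bytes, String externalEncoding) {
        try {
            Reader reader = open(new ByteArrayInputStream(bytes), externalEncoding);
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9};   // e-acute in ISO-8859-1
        String text = decode(latin1, "ISO-8859-1");
        System.out.println(Integer.toHexString(text.charAt(0)));
    }
}
```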

James




From donpark at quake.net  Wed Feb 25 07:21:38 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:13 2004
Subject: org.xml.sax.AttributeMap
Message-ID: <002301bd41bd$49737650$2ee044c6@donpark>

>Why would you not simply return a strongly typed data item (ignoring the
>names)

Because we are trying to keep the number of classes to a bare minimum.
I don't feel too strongly about the goal but I felt I should make a
suggestion.

>As far as reuse of values is concerned however, I think this is a very
>bad idea: startElement defines a new context, so reusing the parameters
>to that call is workable, however reusing the result from the
>getDataInfo call is a different kettle of fish. It would be better (if
>you are so concerned) to keep a pool that you return so that they are
>not reused within the context of a startElement call. This may seem like
>more work on the part of the parser implementor, but you shouldn't push
>this complexity onto the users of the parser when you can safely hide it
>within the parser. The parser writer can make the effort for
>efficiency's sake.

What I suggested is no worse than AttributeMap being reused by some of
the parsers, since the returned value's lifetime is entirely bound by the
lifetime of AttributeMap.  Note that AttributeMap's Enumeration is also
invalid once startElement returns.  But then I am not at all saying that
what I suggest is good.

One of the problems facing SAX is its speed.  There are far too many objects
(mainly Strings) being instantiated unnecessarily because of the multiple layers
involved.  One of the users of SAXDOM measured performance at three levels
(SAX, SAXDOM, and his own application on top of SAXDOM) and found that
performance decreased by about 50% at each level.  Processing of a 1.5 meg
XML file took 8 seconds at SAX level, 14 seconds at SAXDOM, and 35 seconds
at the application level.  I don't know which SAX parser was used.

Since I have a particular interest in server-side XML processing, I have a
real concern about performance.  I am currently feeling out the issues on
building a 'pedal-to-the-metal' XML parser with native SAX support.
Actually, I am finding that my performance goals cannot be met with the current
SAX API because I must cut object instantiation down to the bare minimum,
remove most synchronization, and cluster each stage to allow the JIT more
effective use of the CPU code cache.

Don Park
http://www.quake.net/~donpark/index.html




From peter at ursus.demon.co.uk  Wed Feb 25 09:01:06 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:13 2004
Subject: multiple handlers
In-Reply-To: <199802250054.TAA00347@unready.microstar.com>
References: <000d01bd417e$521bcbc0$2ee044c6@donpark>
 <000d01bd417e$521bcbc0$2ee044c6@donpark>
Message-ID: <3.0.1.16.19980225085026.357795b8@pop3.demon.co.uk>

At 19:54 24/02/98 -0500, David Megginson wrote:
[...]
>
>I think that it will be a wonderful idea for people to implement
>higher-level, programmer-friendly stuff on top of SAX.  Exactly what
>_is_ programmer friendly will depend on the programming language, so I
>agree that the helper classes should stay out of the SAX core, but I
>encourage any efforts to make SAX programmers' lives easier (as Don
>has done with SAXDOM).
>
Although it may not formally be part of SAX, I think it will be extremely
valuable to have reference library implementations of parts of the spec.
For example, what is a valid Name in XML? You have to treat a large number
of special cases for characters, and are extremely vulnerable to revisions
of the spec (this is an area where I am sure minor revisions will happen).
So a set of library classes of the type:
	public static boolean isValidName(String name);
	public static String getCaseSpaceNormalizedAttval(String value);
would be extremely valuable. We can then delegate part of the prose to
these implementations.
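
A sketch of what the first of these might look like. It is an
approximation only: it leans on Java's own Unicode letter and digit
classes instead of the character tables in the XML specification, so it
will disagree with the spec on some characters (combining characters
and extenders, for instance). The XmlNames class name is invented.

```java
class XmlNames {
    // Approximation only: uses Java's Unicode letter/digit classes rather
    // than the character tables in the XML specification.
    static boolean isValidName(String name) {
        if (name == null || name.length() == 0) {
            return false;
        }
        char first = name.charAt(0);
        if (!(Character.isLetter(first) || first == '_' || first == ':')) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean ok = Character.isLetterOrDigit(c)
                || c == '.' || c == '-' || c == '_' || c == ':';
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("xml-dev"));
        System.out.println(isValidName("1abc"));
    }
}
```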

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From peter at ursus.demon.co.uk  Wed Feb 25 09:19:40 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:13 2004
Subject: The XML spec in XML: missing tags
In-Reply-To: <3.0.5.32.19980224192743.00a0d220@village.doctools.com>
References: <98Feb23.120356est.18826@thicket.arbortext.com>
Message-ID: <3.0.1.16.19980225083228.2a27adb6@pop3.demon.co.uk>

[... I may have missed the postings quoted in this...]
At 19:27 24/02/98 -0500, Eve L. Maler wrote:
>As the maintainer of the specification DTD, let me say thanks for your
>comments.

We are very grateful to Eve for having produced the markup specification.
Unfortunately she is a victim of her success in that rec.xml [my shorthand
for the spec] is the first 'really crunchy official piece of XML' that we
can get to grips with for learning and developing our tools. This is why a
DTD and its associated semantics/documentation is so important :-). [I
would also expect that 'spec.dtd' might be re-usable in other contexts.]
>
[...]
>
>We had a lengthy discussion of whether our production markup should be more
>semantic and less presentational.  It's so much work to make the markup
>simulate the EBNF and to make the filters handle this, that we decided not
>to go further in that direction.  I do agree that the production markup is
>less than "pure" in this area.
>
My interest is similar - but complementary - to Michael's; I am interested
in the terminology. Thus I want to be able to abstract the terms [there are
62 termdefs] in the document and produce a model for their structure (e.g.
entailment by containment, by linking and so on.) In this way I can create
a graphical interactive map of the concepts in the XML spec and have
already created a prototype. I would like to know, for example, whether all
terms are defined by <termdef> or whether there are some which are simply
defined by <term>foo bar</term>. There appears to be some duplication here
as well; thus a termdef has an attribute naming the term, but it is also
often contained within a <term> later in the 'description'.  [And there is
at least one case where </termdef> occurs in mid-sentence - I suspect this
isn't intended.]

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From peter at ursus.demon.co.uk  Wed Feb 25 09:20:45 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: org.xml.sax.AttributeMap
In-Reply-To: <34F38506.3D19B68A@jclark.com>
References: <199802250126.UAA00473@unready.microstar.com>
 <199802250138.UAA00524@unready.microstar.com>
Message-ID: <3.0.1.16.19980225084239.3577ae48@pop3.demon.co.uk>

At 09:42 25/02/98 +0700, James Clark wrote:
>
>>     public abstract String getType (int index);
>
>I like this.
>
So do I. As XML grows larger and acquires more extensions (XLL, XSL, etc.)
there will be an increasing number of 'hardcoded' attribute types and
values. For example, the type of HREF/href is effectively determined as
CDATA (it would be perverse to make it ID, for example, even if not in
xml-link context) and xml:lang is required (I think) to be NMTOKEN or
NMTOKENS. Hardcoding all these 'special cases' is a pain and SAX (or DOM)
can help with implementing the prose in the specs.

	P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg



From tms at ansa.co.uk  Wed Feb 25 10:43:46 1998
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: org.xml.sax.AttributeMap
In-Reply-To: David Megginson's message of "Tue, 24 Feb 1998 20:38:34 -0500"
References: <199802250126.UAA00473@unready.microstar.com> <199802250138.UAA00524@unready.microstar.com>
Message-ID: <s8yaz0artw.fsf@plato.ansa.co.uk>

David> David Megginson <URL:mailto:ak117@freenet.carleton.ca>

=> In article <199802250138.UAA00524@unready.microstar.com>, David
=> wrote:

David> David Megginson writes:

David>   package org.xml.sax;
David>
David>   public interface AttributeList {
David>
David>     //...
David>     public abstract String getType (int index);
David>     //...
David>
David>   }

We're returning one of a bounded, known set of values.  I'd prefer to
use an int for this type of thing, along with a set of constants.
I.e.

    public abstract int getType (int index);
    public static final int CDATA = 0;
    public static final int NMTOKEN = 1;
    // etc.

The only advantage a String has over this is that you can meaningfully
present it to the user as it is.  A disadvantage of String is that it is
computationally expensive to compare for equality (or equivalently, and
worse, to switch() on it).  Comparison becomes easier if one provides a
set of String constants and guarantees that returned values will test
equal with "==".  That is not too different to my suggestion of using
numeric constants.


Converting integers to human-readable Strings is easy:

   public static String[] typeNames = new String[/* some size */];
   static {
       typeNames[CDATA]   = "CDATA";
       typeNames[NMTOKEN] = "NMTOKEN";
       // etc.
   }

but I don't think this needs to be part of the interface.
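
Put together as something that compiles, the idea might look like the
sketch below. The AttributeTypes class, the choice of three types and
the isTokenized helper are assumptions added for illustration; only the
constants-plus-name-table pattern comes from the text above.

```java
class AttributeTypes {
    // Integer codes for a few attribute types (a real table would list all).
    static final int CDATA = 0;
    static final int NMTOKEN = 1;
    static final int ID = 2;

    // Name table indexed by the codes above, for display purposes.
    private static final String[] TYPE_NAMES = {"CDATA", "NMTOKEN", "ID"};

    static String nameOf(int type) {
        return TYPE_NAMES[type];
    }

    // Integer codes can drive a switch directly, with no string comparison.
    static boolean isTokenized(int type) {
        switch (type) {
            case NMTOKEN:
            case ID:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf(NMTOKEN) + " tokenized: " + isTokenized(NMTOKEN));
    }
}
```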

One might wish to use short or char instead of int if storage space is
at a premium; I'm making no judgement on which arithmetic type is
best.

This proposal is not Java-specific.

-- 



From M.H.Kay at eng.icl.co.uk  Wed Feb 25 11:36:23 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun  7 17:00:13 2004
Subject: helper classes for SAX
Message-ID: <01bd41e1$be824f60$1e09e391@mhklaptop.bra01.icl.co.uk>

>"Should we add helper classes to SAX?"
>
I have written a package on top of SAX which I hope to publish soon - I need
to get it past some corporate processes first.


I wrote it because I found I was doing the same thing repeatedly in a number
of SAX applications. I call the package SAXON (sorry), and it provides the
following services:

- allows you to register a handler for a particular element type (or a
particular element type in the context of a parent element type). The
handler can supply methods to process the element start or end, the
character data or ignorable white space in the element, or the start or end
of a consecutive group of one or more elements (cf. XSL)
- provides you with context information about the element; in particular,
its parent and ancestors, their attributes, and also their elder sibling
elements.
- allows you to associate user data with an element, so for example your
start-element method can pass data to the corresponding end-element method
- allows you to associate an output "bucket" with an element type, so that
all output for that element and its children (unless otherwise specified)
goes into that bucket. Useful for splitting documents and for limited
re-ordering of elements
- allows multiple handlers per element type
- includes some standard element handlers for doing HTML rendition, for
generating automatic numbering, etc
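
To illustrate the first two services (per-element-type handlers and
ancestor context), here is a rough sketch. It is not SAXON's interface:
ElementHandler and ElementDispatcher are hypothetical names, and the code
assumes the later SAX2 classes bundled with the JDK (DefaultHandler,
SAXParserFactory) rather than the draft API under discussion on this list.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical callback type; not SAXON's actual interface.
interface ElementHandler {
    // 'ancestors' lists the enclosing element names, innermost first.
    void startElement(String name, List<String> ancestors);
}

// Routes start-element events to the handlers registered per element type.
class ElementDispatcher extends DefaultHandler {
    private final Map<String, List<ElementHandler>> handlers = new HashMap<>();
    private final Deque<String> open = new ArrayDeque<>();

    // Several handlers may be registered for the same element type.
    void register(String elementType, ElementHandler h) {
        handlers.computeIfAbsent(elementType, k -> new ArrayList<>()).add(h);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        List<String> ancestors = new ArrayList<>(open); // snapshot of the context
        List<ElementHandler> hs =
            handlers.getOrDefault(qName, Collections.<ElementHandler>emptyList());
        for (ElementHandler h : hs) {
            h.startElement(qName, ancestors);
        }
        open.push(qName); // the element becomes context for its children
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        open.pop();
    }

    // Convenience for trying the dispatcher on an in-memory document.
    static void parse(String xml, ElementDispatcher d) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), d);
    }
}
```

Keeping only a stack of open element names is what keeps this
lightweight: nothing about the document is retained once an element has
ended.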

Although I'm not in a position to go public with it yet, I'll be happy to
share the current state of development with any individual who wants to
collaborate.

I do realise of course that some of these facilities can be achieved by
using the DOM instead of an event-based parser, and there is a world of
stuff in JUMBO that I haven't explored yet. I was trying to add value to SAX
without going heavyweight, which of course is a delicate line to tread.

Regards, Mike Kay
ICL



From digitome at iol.ie  Wed Feb 25 13:42:56 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: org.xml.sax.AttributeMap
Message-ID: <199802251342.NAA04905@mail.iol.ie>

[Toby Speight]
>
>We're returning one of a bounded, known set of values.  I'd prefer to
>use an int for this type of thing, along with a set of constants.
>I.e.
>
>    public abstract int getType (int index);
>    public static final int CDATA = 0;
>    public static final int NMTOKEN = 1;
>    // etc.
>
>The only advantage a String has over this is that you can meaningfully
>present it to the user as it is.  A disadvantage of String is that it is
>computationally expensive to compare for equality (or equivalently, and
>worse, to switch() on it).


If ints are going to be used, let's use values that can
be bit-twiddled.
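
A minimal sketch of what bit-twiddlable values might look like. The class
name AttrTypeBits, the two masks and the helper methods are hypothetical;
only the idea of one bit per attribute type comes from the suggestion
above.

```java
// Hypothetical sketch: one bit per attribute type, so that groups of
// types can be tested with a single mask.
final class AttrTypeBits {
    static final int CDATA    = 1 << 0;
    static final int NMTOKEN  = 1 << 1;
    static final int NMTOKENS = 1 << 2;
    static final int ID       = 1 << 3;
    static final int IDREF    = 1 << 4;
    static final int IDREFS   = 1 << 5;
    static final int ENTITY   = 1 << 6;
    static final int ENTITIES = 1 << 7;

    // Illustrative masks combining several types.
    static final int LIST_VALUED      = NMTOKENS | IDREFS | ENTITIES;
    static final int ENTITY_REFERENCE = ENTITY | ENTITIES;

    private AttrTypeBits() {}

    // True if the attribute value is a whitespace-separated list of tokens.
    static boolean isListValued(int type) {
        return (type & LIST_VALUED) != 0;
    }

    // True if the attribute value names one or more unparsed entities.
    static boolean refersToEntity(int type) {
        return (type & ENTITY_REFERENCE) != 0;
    }
}
```

The values still work as plain constants in a switch, and the masks
replace what would otherwise be a chain of equality tests.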



From ak117 at freenet.carleton.ca  Wed Feb 25 14:12:10 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: finalising org.sax.xml.Parser
In-Reply-To: <34F38943.551F05AB@jclark.com>
References: <199802230313.WAA00386@unready.microstar.com>
	<34F38943.551F05AB@jclark.com>
Message-ID: <199802251410.JAA00633@unready.microstar.com>

James Clark writes:

 > XML allows the encoding of an entity being specified by an external
 > transport protocol (see 4.3.3): for example, when an XML document
 > arrives over HTTP with a content type of text/xml, then the
 > encoding specified in the charset parameter is supposed to take
 > precedence over that specified in the document entity by the
 > encoding declaration or by XML's default rules.  So I think we need
 > an additional argument here: a String specifying the name of the
 > encoding to be used for the InputStream, or null if the encoding
 > specified in the document entity should be used.

This is a very good point, as was the suggestion earlier (I don't
remember whose it was) that we rearrange arguments in order of
decreasing importance to the programmer.  With those suggestions in
mind, here's my current take on org.xml.sax.Parser:

  package org.xml.sax;

  import java.io.InputStream;

  public interface Parser {

    public abstract void setEntityHandler (EntityHandler handler);
    public abstract void setDocumentHandler (DocumentHandler handler);
    public abstract void setErrorHandler (ErrorHandler handler);

    public abstract void parse (String systemId, String publicId)
      throws java.lang.Exception;

    public abstract void parse (InputStream input, String encoding,
                                String systemId, String publicId)
      throws java.lang.Exception;

  }
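
As a rough illustration of the precedence rule James describes, a parser
might pick the encoding along these lines. EncodingChoice and its methods
are hypothetical names, and the UTF-8 fallback is a simplification: a real
parser would also have to detect UTF-16 from the first bytes of the
stream.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;

// Hypothetical helper, not part of the proposed interface.
final class EncodingChoice {
    private EncodingChoice() {}

    // 'external' is the encoding named by the transport (e.g. an HTTP
    // charset parameter), or null if none was given; 'declared' is the one
    // from the entity's own encoding declaration, or null.
    static String effectiveEncoding(String external, String declared) {
        if (external != null) return external; // transport takes precedence
        if (declared != null) return declared;
        return "UTF-8"; // simplification: no byte-level autodetection here
    }

    // Wraps the byte stream in a Reader using the chosen encoding.
    static Reader open(InputStream in, String external, String declared)
            throws UnsupportedEncodingException {
        return new InputStreamReader(in, effectiveEncoding(external, declared));
    }
}
```

Passing null for the external encoding corresponds to the "use whatever
the document entity says" case.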

I haven't included a setValidate() method yet, partly because I'm not
certain what it would really mean.  If I did

  setValidate(false);

would that simply prevent the reporting of validation errors, or would
it also prohibit the parser from resolving external text entities and
the external DTD subset?


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/


From jmodre at edu.uni-klu.ac.at  Wed Feb 25 15:08:56 1998
From: jmodre at edu.uni-klu.ac.at (Juergen Modre)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <199802230313.WAA00386@unready.microstar.com>
		<34F38943.551F05AB@jclark.com> <199802251410.JAA00633@unready.microstar.com>
Message-ID: <34F44259.5887FF1F@edu.uni-klu.ac.at>

David Megginson wrote:
> This is a very good point, as was the suggestion earlier (I don't
> remember whose it was) that we rearrange arguments in order of
> decreasing importance to the programmer.
I think it was Don Park and I also like it.

> With those suggestions in
> mind, here's my current take on org.xml.sax.Parser:
> 
>   package org.xml.sax;
> 
>   public interface Parser {
> 
>     public abstract void setEntityHandler (EntityHandler handler);
>     public abstract void setDocumentHandler (DocumentHandler handler);
>     public abstract void setErrorHandler (ErrorHandler handler);
> 
>     public abstract void parse (String systemId, String publicId)
>       throws java.lang.Exception;
> 
>     public abstract void parse (InputStream input, String encoding,
>                                 String systemId, String publicId)
>       throws java.lang.Exception;
> 
>   }
I think Don's suggestion was also to order it like
     public abstract void parse (String systemId, String publicId,
                                 String encoding, InputStream input)

so that the leading parameters are always the same.
If another overload is added, only the last parameter will differ.

> I haven't included a setValidate() method yet, partly because I'm not
> certain what it would really mean.  If I did
> 
>   setValidate(false);
> 
> would that simply prevent the reporting of validation errors, or would
> it also prohibit the parser from resolving external text entities and
> the external DTD subset?

It should have the following meaning:
- setValidate(false);
  That the document/stream should be parsed for well-formedness.
  This should also be the default if nothing was set with the setValidate() method.

- setValidate(true);
  That the document/stream should also be validated during parsing.

Exactly where the border lies between well-formedness parsing and
validating parsing should be left to the parser; that border is
defined in the XML spec.
The SAX interface should be usable for both classes of XML parser
and should also make it possible to enable/disable validation.


But I agree that it is sometimes not easy to see the clear border
between well-formedness parsing and validation parsing in the XML spec.


All the best
 Juergen

-----------------------------------------------
 JUERGEN MODRE
 Reisdorf 6
 A-9371 Brueckl
 Austria (Europe)

 Phone:   ++43 4214 2320
 Mobile:  ++43 664 233 22 22
 E-mail:  jmodre@edu.uni-klu.ac.at
 WWW:     http://www.edu.uni-klu.ac.at/~jmodre
-----------------------------------------------


From elm at arbortext.com  Wed Feb 25 15:51:03 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun  7 17:00:13 2004
Subject: The XML spec in XML: missing tags
In-Reply-To: <98Feb25.042025est.18818@thicket.arbortext.com>
References: <3.0.5.32.19980224192743.00a0d220@village.doctools.com>
 <98Feb23.120356est.18826@thicket.arbortext.com>
Message-ID: <3.0.5.32.19980225104757.00a13120@village.doctools.com>

Oh, you want *documentation*, do you??  Well, the DTD was hard to write; it
should be hard to understand. :-)

Seriously, I keep saying that I'll release the reference documentation Real
Soon Now, and in fact I'm hoping to be able to spend a few hours tidying it
up and releasing it later this week.  (There's also a minor DTD update in
the pipe.)

At 03:32 AM 2/25/98 -0500, Peter Murray-Rust wrote:
>My interest is similar - but complementary - to Michael's; I am interested
>in the terminology. Thus I want to be able to abstract the terms [there are
>62 termdefs] in the document and produce a model for their structure (e.g.
>entailment by containment, by linking and so on.) In this way I can create
>a graphical interactive map of the concepts in the XML spec and have
>already created a prototype. I would like to know, for example, whether all
>terms are defined by <termdef> or whether there are some which are simply
>defined by <term>foo bar</term>. There appears to be some duplication here
>as well; thus a termdef has an attribute naming the term, but it is also
>often contained within a <term> later in the 'description'.  [And there is
>at least one case where </termdef> occurs in mid-sentence - I suspect this
>isn't intended.]

<termdef> is a really odd way to do term definitions, for my money, but
that's what the users wanted. :-)  It captures an "inline" definition of a
term, and because of the mixed content model, it can't even ensure that a
<term> is present to identify the actual term being defined.  Likewise, it
can't ensure that the definition captured functions as a "standalone"
sentence or set of sentences.  I suspect that the cut-off sentence was more
in the spirit of poetic license.

<term> is occasionally used legitimately without a <termdef> wrapper; it's
marking a term being used in a special way, without an accompanying
definition.

Gee, maybe I should just collect all the questions and do the documentation
as a Q&A...

	Eve


From matthewg at poet.de  Wed Feb 25 16:01:23 1998
From: matthewg at poet.de (Matthew Gertner)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: finalising org.sax.xml.Parser
Message-ID: <01bd4206$6ccb6d30$a00b0ac0@pharcyde.poetsoftware.xo.com>

>It should have the following meaning:
>- setValidate(false);
>  That the document/stream should be parsed for well-formedness.
>  This should also be the default if nothing was set with the
>  setValidate() method.
>
>- setValidate(true);
>  That the document/stream should also be validated during parsing.


How about a 2x2 matrix?

With DTD
    setValidate(false) - checks for well-formedness, external subset is
                         used for entity and notation declarations, etc.
    setValidate(true)  - full validation

Without DTD
    setValidate(false) - just checks for well-formedness
    setValidate(true)  - throws an exception
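
Spelled out as code, the matrix might look like the sketch below. The
class ValidateMatrix, the Behaviour enum and the decide() method are
hypothetical names for illustration; the choice of IllegalStateException
is likewise an assumption, since the matrix only says "throws an
exception".

```java
// Hypothetical sketch of the 2x2 matrix; not a proposed SAX class.
final class ValidateMatrix {
    enum Behaviour { WELL_FORMED_ONLY, WELL_FORMED_USING_DTD, FULL_VALIDATION }

    private ValidateMatrix() {}

    // hasDtd:   whether the document has a DTD
    // validate: the argument that was passed to setValidate()
    static Behaviour decide(boolean hasDtd, boolean validate) {
        if (validate) {
            if (!hasDtd) {
                throw new IllegalStateException("cannot validate without a DTD");
            }
            return Behaviour.FULL_VALIDATION;
        }
        // Not validating: the DTD, if any, is still read for entity and
        // notation declarations.
        return hasDtd ? Behaviour.WELL_FORMED_USING_DTD
                      : Behaviour.WELL_FORMED_ONLY;
    }
}
```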

Matthew



From peter at ursus.demon.co.uk  Wed Feb 25 17:15:05 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:13 2004
Subject: helper classes for SAX
In-Reply-To: <01bd41e1$be824f60$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <3.0.1.16.19980225161441.20a7fe70@pop3.demon.co.uk>

At 11:37 25/02/98 -0000, Michael Kay wrote:
>>"Should we add helper classes to SAX?"
>>
>I have written a package on top of SAX which I hope to publish soon - I need
>to get it past some corporate processes

I understand the problem :-)

>
>
>I wrote it because I found I was doing the same thing repeatedly in a number
>of SAX applications. I call the package SAXON (sorry), and it provides the
>following services:
>
>- allows you to register a handler for a particular element type (or a
>particular element type in the context of a parent element type). The
>handler can supply methods to process the element start or end, the
>character data or ignorable white space in the element, or the start or end
>of a consecutive group of one or more elements (cf. XSL)
>- provides you with context information about the element; in particular,
>its parent and ancestors, their attributes, and also their elder sibling
>elements.

This is useful. I found myself doing the same sort of thing. In a
tree-based situation it's easy - I use XLL XPtrs repeatedly. I missed these
when I came to implement some things on top of SAX.

>- allows you to associate user data with an element, so for example your
>start-element method can pass data to the corresponding end-element method
>- allows you to associate an output "bucket" with an element type, so that
>all output for that element and its children (unless otherwise specified)
>goes into that bucket. Useful for splitting documents and for limited
>re-ordering of elements

Yes. This is partly what my (very simple) SAXSplit does - splits documents
into smaller bits.

There was discussion at one stage that XML should have a transformation
language. Personally I would welcome this. XSL goes half the way in
providing a way of identifying components to be split, re-ordered,
transformed, etc. but concentrates on graphic rendering for humans. 

>- allows multiple handlers per element type
>- includes some standard element handlers for doing HTML rendition, for
>generating automatic numbering, etc

I'd certainly like someone else to write code for HTML if that is what is
being offered :-)

>
>Although I'm not in a position to go public with it yet, I'll be happy to
>share the current state of development with any individual who wants to
>collaborate.

:-)

>
>I do realise of course that some of these facilities can be achieved by
>using the DOM instead of an event-based parser, and there is a world of

The attraction of SAX is that:
	- it is simpler for XML newbies to understand
	- you don't have to hold everything in memory
	
>stuff in JUMBO that I haven't explored yet. I was trying to add value to SAX

JUMBO mainly consists of large muddy footprints. Seriously, I would be
happy to lose any generic functionality from JUMBO if a better way arises.
For example, I use SAX+FOO as the parser and can see a move towards DOM for
defining the tree/grove components. When/if I'm happy to go to J1.1 I will
seriously consider the Swing JTree, though there are bits I find missing at
present.
I am not clear what other features are modular but I am sure many are. 


	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg


From ak117 at freenet.carleton.ca  Wed Feb 25 17:26:06 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: finalising org.sax.xml.Parser
In-Reply-To: <01bd4206$6ccb6d30$a00b0ac0@pharcyde.poetsoftware.xo.com>
References: <01bd4206$6ccb6d30$a00b0ac0@pharcyde.poetsoftware.xo.com>
Message-ID: <199802251724.MAA02583@unready.microstar.com>

Matthew Gertner writes:

 > How about a 2x2 matrix?
 > 
 > With DTD
 >     setValidate(false) - checks for well-formedness, external subset is used
 > for entity and notation declarations, etc.
 >     setValidate(true) - full validation
 > 
 > Without DTD
 >     setValidate(false) - just checks for well-formedness
 >     setValidate(true) - throws an exception

This comes back to the original problem, however: what if I want to
include the external subset and external text entities but don't want
to validate?  I'm not sure that the two should be tied together
(AElfred, for example, does not validate, but it does use the DTD).


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/


From peter at ursus.demon.co.uk  Wed Feb 25 18:47:36 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: finalising org.sax.xml.Parser
In-Reply-To: <34F44259.5887FF1F@edu.uni-klu.ac.at>
References: <199802230313.WAA00386@unready.microstar.com>
 <34F38943.551F05AB@jclark.com>
 <199802251410.JAA00633@unready.microstar.com>
Message-ID: <3.0.1.16.19980225170648.0947fd3e@pop3.demon.co.uk>

At 16:10 25/02/98 +0000, Juergen Modre wrote:
[...]
>- setValidate(true);
>  That the document/stream should also be validated during parsing.
>
>The question where there is exactly the border between well-formedness
>parsing and validation parsing should be left to the parser. This border
>can be found in the XML spec.
>The SAX interface is/should be useable for both classes of XML parsers
>and give also the possibility to enable/disable validation.
>
>
>But I agree that it is sometimes not easy to see the clear border
>between well-formedness parsing and validation parsing in the XML spec.
>
This is an area that I (and I think others) have difficulty with, although
I think there are many who are clear how different parsers behave. This
also interacts with the 'standalone' value in the xml PI. There is also
some potential confusion as to when and how the presence/absence of the
external subset makes a difference.

If my worries are unfounded, then it should be possible to create a precise
description of what parameters, files, internal subsets etc. are needed to
control the behaviour of a SAX-compliant parser and what it should do. In
which case it would be very helpful to see it set out clearly and I'll shut
up.  If, however, there still is confusion then we shall discover it in
these attempts :-)

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg


From peter at ursus.demon.co.uk  Wed Feb 25 19:28:20 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:00:13 2004
Subject: The XML spec in XML: missing tags
In-Reply-To: <3.0.5.32.19980225104757.00a13120@village.doctools.com>
References: <98Feb25.042025est.18818@thicket.arbortext.com>
 <3.0.5.32.19980224192743.00a0d220@village.doctools.com>
 <98Feb23.120356est.18826@thicket.arbortext.com>
Message-ID: <3.0.1.16.19980225190358.0b6f0f44@pop3.demon.co.uk>

At 10:47 25/02/98 -0500, Eve L. Maler wrote:
>Oh, you want *documentation*, do you??  Well, the DTD was hard to write; it
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Not ME. I was weaned on 5-hole paper tape. Variables should be no longer
than 1 character. 

>should be hard to understand. :-)

Yes. I strip comments from FORTRAN programs as it is good for the soul and
saves cards.

I must have dreamed it, but someone posted a month or two back that
documentation was a *required* part of a DTD :-)


>
>Seriously, I keep saying that I'll release the reference documentation Real
>Soon Now, and in fact I'm hoping to be able to spend a few hours tidying it
>up and releasing it later this week.  (There's also a minor DTD update in
>the pipe.)

Great. Seriously - although it wasn't perhaps intended, rec.xml is a
splendid vehicle for people to cut their teeth on - it's got structure,
uses normalisation, has a good variety of elementTypes but also uses some
in a generic manner. The only thing it doesn't use is entities. I have
tweaked my SAXSplit jiffy to produce entities for div1, etc.

And - an argument for preserving comments in document structure - there is
some splendid archaeology inside...
>
[...]
>
><termdef> is a really odd way to do term definitions, for my money, but
>that's what the users wanted. :-)  It captures an "inline" definition of a

*Users*?? DTD by committee?? gulp.

>term, and because of the mixed content model, it can't even ensure that a
><term> is present to identify the actual term being defined.  Likewise, it
>can't ensure that the definition captured functions as a "standalone"
>sentence or set of sentences.  I suspect that the cut-off sentence was more
>in the spirit of poetic license.

Fair enough. The approach I am taking to terminology is based on MARTIF
(ISO12200 and ISO12620) - MARTIF itself having strong TEI roots. So I shall
use some simple heuristics to transform termdefs to my termEntry's.
>
><term> is occasionally used legitimately without a <termdef> wrapper; it's
>marking a term being used in a special way, without an accompanying
>definition.

Yes. I shall abstract these.

>
>Gee, maybe I should just collect all the questions and do the documentation
>as a Q&A...

Not a bad idea. I certainly don't want you to go to a lot of trouble.
One-line sentences for each elementType are probably OK, plus any hardcoded
semantics (e.g. what the target of IDREFs may or may not be). [I have a set of
simple tools in JUMBO that allow you to browse documents, so you find all
elementTypes, their allowed children, attributes, attribute values, etc.
and can then display the actual location in the document. You can then make
a pretty good guess at what they mean.]

	P.


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg


From donpark at quake.net  Fri Feb 27 00:01:13 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:00:13 2004
Subject: JFC 1.1 Released
Message-ID: <000401bd4312$190c22e0$2ee044c6@donpark>

This is a heads-up notice to those of us interested in Java.

JFC 1.1 was released today.  It does not include Java2D nor Drag-n-Drop.
Metal L&F looks good but I was somewhat disappointed by lack of speed
improvements over the beta versions.  There are still some update problems
and some of the features were maimed or shifted into preview status.  It is
better than nothing.

The fact that JFC 1.1 has now been shipped means that JDK 1.2 beta 3 release
is not far behind since JFC 1.1 was supposed to ship at the same time.

I just thought you guys might be interested in the news,

Don Park
http://www.quake.net/~donpark/index.html



From zwang at pstat.ucsb.edu  Fri Feb 27 02:34:54 1998
From: zwang at pstat.ucsb.edu (Zheng Wang)
Date: Mon Jun  7 17:00:13 2004
Subject: JFC
In-Reply-To: <000201bd3608$c0c7f4d0$2ee044c6@donpark>
Message-ID: <Pine.GSO.3.95.980226182613.2373A-100000@fisher>

I also tried Swing 1.0. It is still not compatible with the JDK.
Does anyone work with both the JDK and Swing and know how to make them
compatible?

Thanks

Zheng Wang
Department of Statistics and Applied Probability 
University of California, Santa Barbara
E-mail: zwang@pstat.ucsb.edu; http://www.pstat.ucsb.edu/~zwang



From jjc at jclark.com  Fri Feb 27 03:29:11 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: finalising org.sax.xml.Parser
References: <01bd4206$6ccb6d30$a00b0ac0@pharcyde.poetsoftware.xo.com> <199802251724.MAA02583@unready.microstar.com>
Message-ID: <34F6321E.8D415644@jclark.com>

David Megginson wrote:
> 
> Matthew Gertner writes:
> 
>  > How about a 2x2 matrix?
>  >
>  > With DTD
>  >     setValidate(false) - checks for well-formedness, external subset is used
>  > for entity and notation declarations, etc.
>  >     setValidate(true) - full validation
>  >
>  > Without DTD
>  >     setValidate(false) - just checks for well-formedness
>  >     setValidate(true) - throws an exception
> 
> This comes back to the original problem, however: what if I want to
> include the external subset and external text entities but don't want
> to validate?  I'm not sure that the two should be tied together
> (AElfred, for example, does not validate, but it does use the DTD).

The following seem the reasonable combinations to me:

- Validate and process all external entities (if you're validating
you've got to process all external entities).

- Don't validate and process external DTD and parameter entities
depending on the setting of standalone.

- Don't validate and process external DTD and parameter entities
(irrespective of the setting of standalone).

- Don't validate and don't process external DTD and parameter entities
(irrespective of the setting of standalone).
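
One possible way to encode these four combinations is sketched below.
DtdPolicy and ParseMode are hypothetical names, and reading "depending on
the setting of standalone" as "skip the external DTD when
standalone='yes'" is an assumption made for the sketch, not something
stated above.

```java
// Hypothetical sketch; none of these names are proposed SAX API.
enum DtdPolicy { HONOUR_STANDALONE, ALWAYS, NEVER }

final class ParseMode {
    final boolean validate;
    final DtdPolicy dtdPolicy;

    private ParseMode(boolean validate, DtdPolicy dtdPolicy) {
        this.validate = validate;
        this.dtdPolicy = dtdPolicy;
    }

    // Validating implies processing all external entities, so there is
    // only one validating mode.
    static final ParseMode VALIDATING = new ParseMode(true, DtdPolicy.ALWAYS);

    // The three non-validating modes differ only in their DTD policy.
    static ParseMode nonValidating(DtdPolicy policy) {
        return new ParseMode(false, policy);
    }

    // Whether the external DTD subset and external parameter entities are
    // read, given whether the document declares standalone="yes".
    boolean readsExternalDtd(boolean standaloneYes) {
        switch (dtdPolicy) {
            case ALWAYS: return true;
            case NEVER:  return false;
            default:     return !standaloneYes; // assumed reading, see above
        }
    }
}
```

Making the constructor private keeps the illegal fifth combination
(validate, but skip external entities) unrepresentable.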

James


From moroz at paragraph.com  Fri Feb 27 13:32:42 1998
From: moroz at paragraph.com (Moroz, Oleg)
Date: Mon Jun  7 17:00:13 2004
Subject: JFC
Message-ID: <00FE2F436493D111900E00A0C91003780C7C2F@ms.paragraph.com>

Zheng Wang[SMTP:zwang@pstat.ucsb.edu] wrote:

> I also tried the Swing1.0. It is still not compatible with JDK. 
> Does someone work with both JDK and Swing and know how to make them
> compatible?

What do you mean by "not compatible with JDK" ? Swing 1.0 works perfectly
with JDK / JRE 1.1.5 for Win32 from Sun and I hope with the latest JDK 1.1.5
for Linux from Steve Byrne (will try that at home tonight). It also works
with the latest Microsoft JVM (from IE 4.01), although not so perfectly
(tooltips don't show text and some examples produce spurious exception stack
backtraces, but continue operating).

Oleg



From dima at paragraph.com  Fri Feb 27 17:26:02 1998
From: dima at paragraph.com (Dmitri Kondratiev)
Date: Mon Jun  7 17:00:13 2004
Subject: ANN: XLogo - programming with XML Logo Turtle Graphics 
Message-ID: <2.2.32.19980227172605.00916750@dream.paragraph.com>


XLogo Announcement
------------------

XLogo is a markup language I wrote to program Logo Turtle Graphics with XML
in a Java applet. An XLogo program is a well-formed and valid XML document.
The XLogo runtime is a set of Java classes that process an XLogo program.

The main reason for XLogo was to find out what advantages XML provides
for developing problem-domain-specific meta-languages. Another goal was to
learn XML and experiment with SAX - Simple API for XML.

To find out more about XLogo, see:

http://www.geocities.com/SiliconValley/Lakes/3767/xlogo-index.html

Any comments and ideas are most welcome!

Thanks,
Dima


---------------------------
dima@paragraph.com
102401.2457@compuserve.com
http://www.geocities.com/SiliconValley/Lakes/3767/
tel: 07-095-464-9241



From ak117 at freenet.carleton.ca  Sat Feb 28 03:24:03 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: org.xml.sax.AttributeMap
In-Reply-To: <34F38506.3D19B68A@jclark.com>
References: <199802250126.UAA00473@unready.microstar.com>
	<199802250138.UAA00524@unready.microstar.com>
	<34F38506.3D19B68A@jclark.com>
Message-ID: <199802280322.WAA00888@unready.microstar.com>

James Clark writes:

 > I agree that SAX ought to provide access to unparsed entities but I
 > don't think this is the right way to achieve it.  For a start, I can
 > have an ENTITIES attribute, so all these methods would need two
 > arguments (the index of the attribute in the attribute list, and the
 > index of the token in the value).

An excellent point, and one that I missed in the original SAX.

 > I think a better approach is for the processor at the end of the prolog
 > to pass an object to the application that provides information about all
 > the declared notations and unparsed entities.
 >
 > XP has a DTD object that does this, but it might be better to call it
 > something else (like UnparsedEntitySet) since SAX might someday be
 > extended to provide full DTD access.

This is a good idea, but I need to find a way to avoid using the
Java-specific Enumeration class that your example uses (since I've
already eliminated it from AttributeList).


All the best,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/



From ak117 at freenet.carleton.ca  Sat Feb 28 12:29:18 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun  7 17:00:13 2004
Subject: SAX: Sorting out org.xml.sax.AttributeList
Message-ID: <199802281227.HAA00658@unready.microstar.com>

I have been working very hard to keep the number of interfaces in SAX
to a minimum, but it looks like there will be no way to avoid adding a
couple of additional ones if SAX is going to support unparsed entities
(as, I think, it must).

James's suggestion of using indexed properties instead of a lookup-map
is a very good, light-weight one.  If attributes, entities, and
notations are all indexed, then they will share a certain amount of
common functionality which should be split out into its own
interface:

  package org.xml.sax;

  public interface NameList {
    public abstract int getLength ();
    public abstract int getIndex (String name);
    public abstract String getName (int index);
  }

This is very JavaBean-like, except that getName() does not throw an
ArrayIndexOutOfBoundsException (it just returns null for an invalid
index); similarly, getIndex() returns -1 for a name that is not present.
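
The null and -1 conventions described above can be sketched with a small
array-backed implementation. ArrayNameList is a hypothetical class made up
for illustration and is not part of the proposal; the interface is repeated
locally, without its package line, only so that the example compiles on its
own.

```java
// Local copy of the proposed NameList interface (package line omitted).
interface NameList {
    int getLength();
    int getIndex(String name);
    String getName(int index);
}

// Hypothetical array-backed implementation, for illustration only.
final class ArrayNameList implements NameList {
    private final String[] names;

    ArrayNameList(String... names) {
        this.names = names.clone();
    }

    public int getLength() {
        return names.length;
    }

    // Returns -1 instead of throwing when the name is not present.
    public int getIndex(String name) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // Returns null instead of throwing for an out-of-range index.
    public String getName(int index) {
        return (index >= 0 && index < names.length) ? names[index] : null;
    }
}
```

A caller therefore has to test for -1 or null explicitly instead of
catching an exception.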

Next, attribute lists extend this interface to add value and type:

  package org.xml.sax;

  public interface AttributeList extends NameList {
    public abstract String getType (int index);
    public abstract String getValue (int index);
  }
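
As a rough sketch of how an application might walk such a list by index:
SimpleAttributeList and AttributeDump below are hypothetical names invented
for this example, and the interfaces are repeated locally (without the
package line) so that it stands alone.

```java
// Local copies of the proposed interfaces (package line omitted).
interface NameList {
    int getLength();
    int getIndex(String name);
    String getName(int index);
}

interface AttributeList extends NameList {
    String getType(int index);
    String getValue(int index);
}

// Hypothetical in-memory list; each row holds {name, type, value}.
final class SimpleAttributeList implements AttributeList {
    private final String[][] rows;

    SimpleAttributeList(String[][] rows) {
        this.rows = rows;
    }

    public int getLength() {
        return rows.length;
    }

    public int getIndex(String name) {
        for (int i = 0; i < rows.length; i++) {
            if (rows[i][0].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // Shared bounds check: null for any invalid index.
    private String field(int index, int column) {
        return (index >= 0 && index < rows.length) ? rows[index][column] : null;
    }

    public String getName(int index)  { return field(index, 0); }
    public String getType(int index)  { return field(index, 1); }
    public String getValue(int index) { return field(index, 2); }
}

// Hypothetical helper: formats each attribute as name (type) = "value".
final class AttributeDump {
    static String describe(AttributeList attributes) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < attributes.getLength(); i++) {
            out.append(attributes.getName(i))
               .append(" (").append(attributes.getType(i)).append(") = \"")
               .append(attributes.getValue(i)).append("\"\n");
        }
        return out.toString();
    }
}
```

Lookup by name is a two-step affair here: getIndex() first, then
getValue() or getType() on the result.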

For notations, we need external identifiers instead:

  package org.xml.sax;

  public interface NotationList extends NameList {
    public abstract String getSystemId (int index);
    public abstract String getPublicId (int index);
  }

Unparsed entities are identical to notations, but they also need the
name of the associated notation:

  package org.xml.sax;

  public interface UnparsedEntityList extends NotationList {
    public abstract String getNotationName (int index);
  }

From a purist point-of-view, UnparsedEntityList and NotationList
should both extend a common ancestor, like ExternalObjectList, but I
am becoming very concerned at the number of interfaces multiplying
here.
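
For comparison, the purist arrangement might look roughly like the sketch
below. This is only an illustration of the alternative mentioned above, not
what is being proposed: ExternalObjectList carries the two identifier
methods, and the interfaces are declared locally, without a package line,
so that the example compiles by itself.

```java
// Local copy of the proposed NameList interface (package line omitted).
interface NameList {
    int getLength();
    int getIndex(String name);
    String getName(int index);
}

// The common ancestor: anything identified by external identifiers.
interface ExternalObjectList extends NameList {
    String getSystemId(int index);
    String getPublicId(int index);
}

// Notations add nothing beyond the external identifiers.
interface NotationList extends ExternalObjectList {
}

// Unparsed entities add the notation name and are no longer a kind of
// NotationList, which is the point of the purist arrangement.
interface UnparsedEntityList extends ExternalObjectList {
    String getNotationName(int index);
}
```

The cost is one more interface, which is exactly the multiplication
being worried about.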

The application will gain access to these lists through a DTD
callback in org.xml.sax.DocumentHandler:

  public void dtd (UnparsedEntityList entityList, 
                   NotationList notationList)
    throws java.lang.Exception;

Should this event always be fired, or should it be fired only if there
actually is a DTD?
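
To make the intended use concrete, here is a minimal sketch of an
application that keeps the lists handed to dtd() and later resolves the
tokens of an ENTITIES attribute value by name. EntityAwareHandler,
SimpleEntityList and systemIdsFor() are hypothetical names invented for the
example; the interfaces are repeated locally (without the package line),
and the handler's dtd() omits the throws clause for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Local copies of the proposed interfaces (package line omitted).
interface NameList {
    int getLength();
    int getIndex(String name);
    String getName(int index);
}

interface NotationList extends NameList {
    String getSystemId(int index);
    String getPublicId(int index);
}

interface UnparsedEntityList extends NotationList {
    String getNotationName(int index);
}

// Hypothetical in-memory list so the sketch runs without a parser.
// Each row holds {name, systemId, publicId, notationName}.
final class SimpleEntityList implements UnparsedEntityList {
    private final String[][] rows;

    SimpleEntityList(String[][] rows) {
        this.rows = rows;
    }

    public int getLength() {
        return rows.length;
    }

    public int getIndex(String name) {
        for (int i = 0; i < rows.length; i++) {
            if (rows[i][0].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // Shared bounds check: null for any invalid index.
    private String field(int index, int column) {
        return (index >= 0 && index < rows.length) ? rows[index][column] : null;
    }

    public String getName(int index)         { return field(index, 0); }
    public String getSystemId(int index)     { return field(index, 1); }
    public String getPublicId(int index)     { return field(index, 2); }
    public String getNotationName(int index) { return field(index, 3); }
}

// Hypothetical application-side handler.
final class EntityAwareHandler {
    private UnparsedEntityList entities;
    private NotationList notations;

    // Mirrors the proposed callback: remember both lists for later use.
    public void dtd(UnparsedEntityList entityList, NotationList notationList) {
        this.entities = entityList;
        this.notations = notationList;
    }

    // Resolves each whitespace-separated token of an ENTITIES attribute
    // value to a system identifier, skipping undeclared names.
    List<String> systemIdsFor(String entitiesAttributeValue) {
        List<String> result = new ArrayList<>();
        if (entities == null) {
            return result; // no dtd() event was received
        }
        for (String token : entitiesAttributeValue.trim().split("\\s+")) {
            int index = entities.getIndex(token);
            if (index >= 0) {
                result.add(entities.getSystemId(index));
            }
        }
        return result;
    }
}
```

The null check in systemIdsFor() is there because of the open question
above: if the event is fired only when a DTD exists, every application has
to cope with never having received the lists.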


How does this sound to everyone?  For me, there are pros and cons:

PROS
----

1) This arrangement is _much_ simpler to understand than the old
   org.xml.sax.AttributeMap.  Most users will need to deal only with
   AttributeList (which is now trivial), and they can ignore
   NotationList and UnparsedEntityList unless they need to use
   unparsed entities. 

2) It is possible to look up a notation or entity directly by name,
   even if the name appears in a CDATA entity or in character data
   content.


CONS
----

1) Too many interfaces.

2) Users will complain that the dtd() callback does not return other
   information, such as lists of declared elements.

3) It may turn out that XML implementors shun unparsed entities and
   notations in favour of HREF's and MIME types, in which case we will
   have added this complexity to SAX for nothing.


Thanks,


David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
