Lark 0.90 available, with an application

Peter Murray-Rust Peter at ursus.demon.co.uk
Fri Jun 27 10:40:39 BST 1997


In message <3.0.32.19970626170447.00a71490 at pop.intergate.bc.ca> Tim Bray writes:
> Hi - Lark 0.90 is now available at 
>  http://www.textuality.com/Lark
> 
> Differences:
>  - now does entity references in attribute values
>  - does &#X style hex character references
>  - has draconian error handling
>  - the Handler has an element() method to serve as an element factory
>  - lots of bug fixes
>  - it's all in a package, textuality.lark

Great!!  I was waiting for the 'package' to bolt it into JUMBO.
[I'm writing this before I have downloaded it.]

> 
> Doesn't do PE's yet.
> It's now over 40k, sigh.

We can't easily get round this problem.  XML takes a *lot* of code.  I have
found that JUMBO has huge classes (e.g. 100 member functions) for Node, Tree
and TOC.  Trouble is that they all have to be loaded even if only a small
amount of functionality is used - e.g. you have to have mouseDrag(), 
mouseMove() even if the user might not drag the mouse :-)

> 
> For me, the interesting thing is that it now comes with an application
> named XH.  It was bothering me that I was writing but not using the
> software, so I created xh, which reads the XML form of all the docs
> I'm working on (XML-lang, XML-link, MCF, etc etc etc) and generates 
> the HTML.  This used to be done with a mouldy tumorous perl program -
> nothing against perl, but xh is a lot cleaner and nicer.  Also it
> produces valid HTML, which the perl didn't.
> 
> Xh is interesting as it is probably a canonical customer for XAPI
> (why did we lose JAX, I liked it?) - it doesn't use the event stream,

So did I! There are 30K+ references to JAX on the net including jax.org
(where the mouse genome is being explored).  

> it lets the parser build the tree and then just runs around the
> elements and attributes.

Yes.  JUMBO does this by having a generic SGMLNode (named before XML was 
invented) which has default actions for attributes, contents, etc. It has
routines such as process(), toHTML(), toString(), display(Graphics g), etc.
So when reading a DTD-less XML document it can still do something with
it.

> 
> For Xh, I also, after getting it working, realized that I had re-used
> Peter Murray-Rust's trick of just having a .class per element-type
> (Class.forName() and Class.newInstance(), gotta love 'em) - I wonder if
> this is just a coincidence or is this the basic paradigm on which XML 
> software is going to be built?  If so, it might make sense to wire
> a standard class-finder call into XAPI.

I suspected we were quite close to this with ElementFactory. I've been 
slightly reluctant to post JUMBO code for this part because JUMBO has evolved
rather than been planned (it wasn't intended to be graphical to start with :-)
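
A minimal sketch of the class-per-element-type trick follows. It is not
JUMBO or Lark code: GenericNode, SPEECHNode, ClassFinder and the
"<GI>Node" naming convention are all assumptions made for illustration.
It uses getDeclaredConstructor().newInstance() rather than
Class.newInstance(), which later Java versions deprecate.

```java
// Hypothetical names throughout; this is not JUMBO's or Lark's actual code.
class GenericNode {
    protected String gi = "";

    void setGi(String gi) { this.gi = gi; }

    // Default behaviour, used when no specialised class exists for a GI.
    String describe() { return "generic <" + gi + ">"; }
}

// One specialised class per element type, named <GI>Node by convention.
class SPEECHNode extends GenericNode {
    @Override
    String describe() { return "speech <" + gi + ">"; }
}

class ClassFinder {
    // Looks up a class called <GI>Node next to GenericNode and instantiates it.
    // Any lookup or instantiation failure falls back to a plain GenericNode.
    static GenericNode nodeFor(String gi) {
        String base = GenericNode.class.getName();
        String prefix = base.substring(0, base.length() - "GenericNode".length());
        GenericNode node;
        try {
            Class<?> c = Class.forName(prefix + gi + "Node");
            node = (GenericNode) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException | LinkageError e) {
            node = new GenericNode();
        }
        node.setGi(gi);
        return node;
    }
}
```

With this sketch, ClassFinder.nodeFor("SPEECH") yields a SPEECHNode, while
a GI with no matching class quietly gets the generic behaviour.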

The basic steps are:
	- parse the document into a Tree of Nodes (actually Elements at present)
		This is all that can be done with a DTD-less document.
		If NXP or Lark is given as an argument, JUMBO will use it
		as the parser.  It creates Elements as it encounters them (even
		with Lark - this is historical).
	- if a DTD is given, it downloads a *.class file for that DTD. [This is
		resolved locally at present, but if we agree on catalogs and 
		other naming conventions, then we can resolve it globally.]
	- the class file gives a list of ElementTypes ('GI's) for which there
		are *.class files available.  Thus in PLAYDTD.class there are
		references to STAGEDIRNode.class, SPEECHNode.class.  There does
		NOT have to be a class for each type unless that is seen as
		essential; otherwise the default Node methods are used.
	- if a Node has a GI in the DTD class, it is specifically created.  Thus
		the PLAYDTD.class has code like:

	if (gi.equals("SPEECH")) {
		node = DTD.createSubclassedNode("SPEECH", content, attributes);
	} else {
		node = new Node(content, attributes);
	}

Then the subclassed Nodes have node-specific methods, and display() will show
specific icons, etc.

This is done at, or immediately after, parse time.  So JUMBO will create a 
subclassed Node from a generic Lark element if required.  If this is what 
ElementFactory does, then great!
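
The DTD-class step described above can be sketched roughly as below. This
is an assumption-laden illustration, not the real PLAYDTD code: PlayDtd,
StageDirNode, createNode() and icon() are hypothetical names, and icon()
merely stands in for what display() would draw.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Generic node with default behaviour (hypothetical, not JUMBO's real Node).
class Node {
    final String content;
    final Map<String, String> attributes;

    Node(String content, Map<String, String> attributes) {
        this.content = content;
        this.attributes = attributes;
    }

    // Stand-in for display(): just names the icon that would be drawn.
    String icon() { return "generic"; }
}

class StageDirNode extends Node {
    StageDirNode(String content, Map<String, String> attributes) {
        super(content, attributes);
    }

    @Override
    String icon() { return "stage-direction"; }
}

// Plays the role of PLAYDTD.class: it knows which GIs have specialised classes.
class PlayDtd {
    private final Map<String, BiFunction<String, Map<String, String>, Node>> specialised =
            new HashMap<>();

    PlayDtd() {
        // Only GIs that need special treatment are registered.
        specialised.put("STAGEDIR", StageDirNode::new);
    }

    // Returns a specialised node if the GI is registered, else a generic Node.
    Node createNode(String gi, String content, Map<String, String> attributes) {
        BiFunction<String, Map<String, String>, Node> maker = specialised.get(gi);
        return maker != null ? maker.apply(content, attributes)
                             : new Node(content, attributes);
    }
}
```

A table of constructors replaces the if/else chain here, so adding a new
specialised type is one extra line in the DTD class.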

There are the following performance hits:
	(a) it is slower to parse since the specialised nodes are created
	at that time
	(b) all the specialised code is loaded at parse time even if the user
	doesn't require it.  Since performance is hit by code size, some 
	applications run very slowly.  
So perhaps there needs to be a lazy creation of specialised Elements??  IOW
everything is generic until it's actually referenced, when it gets a 
specialised Element from the factory.
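
One possible shape for that lazy scheme is sketched below, purely as an
illustration. LazyElement, Specialised and ElementFactorySketch are
hypothetical names, the "created" counter exists only to show when work
happens, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// The expensive, element-type-specific object (hypothetical).
class Specialised {
    final String label;

    Specialised(String label) { this.label = label; }
}

class ElementFactorySketch {
    private final Map<String, Function<LazyElement, Specialised>> makers = new HashMap<>();

    // Counts how many specialised objects have actually been built.
    int created = 0;

    void register(String gi, Function<LazyElement, Specialised> maker) {
        makers.put(gi, maker);
    }

    // Builds the specialised form; unregistered GIs get a generic one.
    Specialised specialise(LazyElement element) {
        created++;
        Function<LazyElement, Specialised> maker = makers.get(element.gi);
        return maker != null ? maker.apply(element)
                             : new Specialised("generic " + element.gi);
    }
}

// Cheap generic element created at parse time.
class LazyElement {
    final String gi;
    final String content;
    private final ElementFactorySketch factory;
    private Specialised cached; // stays null until first referenced

    LazyElement(String gi, String content, ElementFactorySketch factory) {
        this.gi = gi;
        this.content = content;
        this.factory = factory;
    }

    // The first call asks the factory; later calls reuse the cached object.
    Specialised specialised() {
        if (cached == null) {
            cached = factory.specialise(this);
        }
        return cached;
    }
}
```

Parsing then only pays for generic elements; the cost of a specialised
one is deferred until something calls specialised() on it.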

Maybe I will post the code for PLAYDTD if it would help the process.

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



