EMBED and validation

Peter Murray-Rust peter at ursus.demon.co.uk
Sun Nov 30 00:04:18 GMT 1997

At 17:02 29/11/97 -0500, David G. Durand wrote:
>At 4:46 PM -0000 11/29/97, Peter Murray-Rust wrote:
>>The only area of fuzziness is what the default and optional behaviours of a
>>parser (sic) are. If I write:
>I don't think there is any fuzziness at all.

Well, please pardon my slowness and be patient - it has taken me a long
time to get this far with SGML.  The spec repeatedly uses the word 'may',
which I take to be optional behaviour (e.g. 4.3.3: 'may, but need not,
include the entity's replacement text'). I expect that some parsers may
allow the user to decide, some may take unilateral action. Perhaps
'fuzziness' was the wrong word - a 'variety of options with which the user
may be confronted' might be more accurate. Other actions which a parser
'may' take could include:
	- whether to read the external DTD subset
	- whether to read the internal subset
	- whether to validate
	- whether to expand the external entities or not
Some of these may be defined clearly in the new spec, some may not. It may
be that most parsers end up with a list of commandline options like sgmls.
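
To make the list of optional behaviours above concrete, here is a minimal sketch that asks one parser which of these choices it lets the user make. It uses the SAX interface in Python's standard library purely as an illustration; no parser discussed in this thread is involved. The helper names try_feature and report_options are hypothetical. As far as I know, SAX has no switch for skipping the internal subset, so that option does not appear.

```python
import xml.sax
from xml.sax.handler import (
    feature_external_ges,   # follow external general entities
    feature_external_pes,   # external parameter entities, incl. external DTD subset
    feature_validation,     # report validity errors
)


def try_feature(parser, feature, state):
    """Return True if the parser accepts the setting, False if it refuses it."""
    try:
        parser.setFeature(feature, state)
        return True
    except (xml.sax.SAXNotSupportedException,
            xml.sax.SAXNotRecognizedException):
        return False


def report_options():
    """Ask the default SAX parser which optional behaviours can be switched on."""
    # Assumption: the default parser is the expat-based, non-validating reader
    # that ships with CPython.
    parser = xml.sax.make_parser()
    return {
        "validate": try_feature(parser, feature_validation, True),
        "expand external general entities":
            try_feature(parser, feature_external_ges, True),
        "read external parameter entities":
            try_feature(parser, feature_external_pes, True),
    }


if __name__ == "__main__":
    for option, accepted in report_options().items():
        print(f"{option}: {'user may enable' if accepted else 'not supported'}")
```

With the expat-based reader, only the expansion of external general entities is left to the user; the other two requests are refused. A validating parser would presumably report a different set.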

>>be a validation error (MathML uses a different DTD from HTML). If the
>>entity is valid, then it creates a 'single document' which is easy to
>>search, etc. One disadvantage is that (for Java) the document could get too
>>big for the JVM.
>If the MathML elements are not declared in the DTD, _no_ validating parser
>can ever accept this as legal.

Fair enough - what I wrote was incorrect :-) Sorry.

>>	- offer a commandline switch that allows inclusion of external
>>entities OR defers their expansion to the application/processor. In that
>>case the *application* has to be able to run a parser over the 'included'
>No, external entities are parsed in place. WF-only applications might not
>follow the entities (under user choice, whether interactive or
>command-line), or they might follow them and present the information.
>Parsing relative to a different DTD would be unfortunate behavior, since
>validation should be done according to the rules of XML.
>Of course, a WF application might just swallow the elements and use its own
>stylesheet language to format some math.
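
The point that external entities are parsed in place, and that a well-formedness-only application may or may not follow them, can be sketched as below. This again uses Python's standard SAX reader as a stand-in. The document, the entity name eqn, its system identifier and the in-memory replacement text are all invented for the illustration, as are the class and function names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical host document that pulls in some maths via an external entity.
DOC = b"""<!DOCTYPE doc [
  <!ENTITY eqn SYSTEM "eqn.xml">
]>
<doc><p>before</p>&eqn;<p>after</p></doc>"""

# Hypothetical replacement text for the entity, served from memory.
EXTERNAL = b"<math><mi>x</mi></math>"


class InMemoryResolver(EntityResolver):
    """Supply the entity's content without touching the file system."""

    def resolveEntity(self, publicId, systemId):
        source = InputSource()
        source.setByteStream(io.BytesIO(EXTERNAL))
        return source


class ElementNames(ContentHandler):
    """Record element names in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def element_names(follow_entities):
    """Parse DOC, following external general entities only if asked to."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, follow_entities)
    parser.setEntityResolver(InMemoryResolver())
    handler = ElementNames()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(DOC))
    return handler.names


if __name__ == "__main__":
    print(element_names(follow_entities=False))
    print(element_names(follow_entities=True))
```

When the entity is followed, its elements arrive in the same event stream as the host document's, at the position of the reference. When it is not followed, this particular parser carries on without them. Neither run validates, so the undeclared math elements raise no complaint here.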

Understood. Thanks.

>>(JUMBO can do this at present - it can even use a different parser from the
>>initial one, which may be useful if they have different behaviours).
>You mean if they have bugs?

No. They may deliberately have different behaviours. Some may be very good
at handling large documents, others may be validating and possibly slower.
Some may offer more information as a result of the parse.


Thanks for the help - I keep learning :-)

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
