Re: WF, V, and MSXML
Peter at ursus.demon.co.uk
Mon Jun 9 11:30:07 BST 1997
In message <199706082339.QAA08654 at bolt.sonic.net> Terry Allen writes:
> Peter Murray-Rust replying to me to him etc.
[... and hoping the WG/ERB are reading this ...]
> Right. That's why the IETF assigns such importance to running code.
Good point. That is why XML-DEV is important and why we need people to
create prototypes at this stage. [Most XML-related software and documents
come into this category because the problems we are encountering may have
implications for the language.]
> | I think this is more a question of terminology. NXP (Norbert Mikula) is a
> | 'validating parser', but the validation can be switched off. This is a
> | client-side decision. So with NXP 'palmy' could be either invalid or WF
> | according to the reader's wishes
> Agreed, but from the viewpoint of the document preparer, it is both. MSXML
> needs the switch NXP has. I think the behavior is unintentional, but
> I would be alarmed at a processor/parser (they mean the same to me in
> this context) that attempted to parse for validity, and if it found
> an error, silently switched to WF-parse mode.
I'd agree with this analysis, and haven't been silent on the issue. IMO it
is more important for the WG/ERB to address *this* problem than some of the
proposed extensions. The concept of WFness is NEW!! It is more subtle than
people realise. A fundamental problem is that there is no clear internal
flag in the document stating what the validity/WFness of the current document
is, is meant to be, was, etc. As Terry says, it's particularly likely that
a WF document could (possibly erroneously) mutate into a valid one. I am
sure that any confusion about MSXML is not intentional and is due to the issue
not being prominent in the spec.
All parsers (i.e. tools that take XML documents and apply the criteria in
XML-LANG only) should state their attitude and behaviour to WFness and validity.
The possible options include at least:
- nsgmls-like. Full validation is the only option. Any non-valid
document is flagged and appropriate error messages or error
action is initiated.
- Lark-like (at least V0.88 - I think there is another coming). No
validation can be attempted. Any 'output' can only be WF or
in error. NOTE: what does Lark do with the internal subset?
- NXP-like. Validation can be switched on or off by the 'client'.
How this is transmitted to the application is application
dependent at present.
- MSXML-like. Undocumented at present. Possibly [though Terry and I
hope not] validating by default, and changing to WF if this
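[To make the options above concrete, here is a minimal sketch of how a processor might state its policy and report its outcome explicitly, so that a failed validation is never silently downgraded to a WF parse. The names Policy and outcome are hypothetical and do not come from nsgmls, Lark, NXP or MSXML; the MSXML case is left out because its behaviour is undocumented.]

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for the first three options in the list above."""
    VALIDATE_ALWAYS = "nsgmls-like"   # full validation is the only option
    WF_ONLY = "Lark-like"             # no validation can be attempted
    CLIENT_SWITCH = "NXP-like"        # the client turns validation on or off


def outcome(policy, is_wf, is_valid, validate=True):
    """Return the status a processor with this policy should report.

    is_wf and is_valid describe the document; validate is the client's
    switch and is only consulted under Policy.CLIENT_SWITCH.
    """
    if not is_wf:
        # A document that is not well-formed is in error under every policy.
        return "error"
    if policy is Policy.VALIDATE_ALWAYS:
        checking = True
    elif policy is Policy.WF_ONLY:
        checking = False
    else:
        checking = validate
    if not checking:
        # Say so explicitly rather than implying anything about validity.
        return "well-formed (validity not checked)"
    # When validation was requested, an invalid document is reported as
    # invalid; it is never quietly reclassified as merely well-formed.
    return "valid" if is_valid else "invalid"


# A document like 'palmy': well-formed but not valid.
print(outcome(Policy.CLIENT_SWITCH, True, False, validate=True))
print(outcome(Policy.CLIENT_SWITCH, True, False, validate=False))
```

[The point of the sketch is only that the status string always records whether validity was checked, which is the information the application currently has no agreed way to receive.]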
> Point taken; but the spec is not entirely clean on this point. If the
> application requests the processor to process, the processor must
> inform the application of certain things. And it is hard to get
> "*An XML processor which does not read the DTD must always pass all
> characters in a document that are not markup through to the application.*
Ah! I had assumed the internal subset was 'markup' - you see it as part
of the document. We need a ruling on this :-). Obviously if the DTD appears
***in the processed document***, then it could be interpreted as having been
read and used for validation.
> | what is the implied structure of the document in:
> | <!DOCTYPE FOO [
> | <!ATTLIST FOO XML-LINK CDATA #FIXED "SIMPLE">
> | ]>
> | <FOO HREF="bar"/>
> | Can we assume that FOO (which has no Element declaration) has an ATTLIST as
> | given, and that therefore it inherits the SHOW and ACTUATE attributes?
> | IOW *must* a parser decorate all matching elements with the ATTLISTS in the
> | internal subset?
> No, not per XMLlang alone. FOO's only declared attribute has as its name
My mistake. I shouldn't have brought the others in.
> the unreserved string "XML-LINK" although it uses an undeclared attribute
> name "HREF". So it is WF but not valid.
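[A minimal sketch of the "WF but not valid" case, using the FOO document quoted above. It assumes Python's standard xml.etree.ElementTree module, which sits on a non-validating parser: it reads the internal subset but never checks validity, so the undeclared HREF attribute is accepted. The helper name parse_wf is hypothetical. Whether the #FIXED XML-LINK attribute also shows up on the element depends on the parser applying defaults from the internal subset, so the sketch just prints what it finds rather than promising a result.]

```python
import xml.etree.ElementTree as ET

# The document from the quoted example: FOO has an ATTLIST in the
# internal subset but no element declaration, and uses an undeclared
# attribute HREF.
DOC = """<!DOCTYPE FOO [
<!ATTLIST FOO XML-LINK CDATA #FIXED "SIMPLE">
]>
<FOO HREF="bar"/>"""


def parse_wf(text):
    """Return the root element if text is well-formed, otherwise None.

    This checks well-formedness only; no validation is attempted.
    """
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


root = parse_wf(DOC)
attrs = dict(root.attrib) if root is not None else {}

# Show which attributes the parser hands to the application.
print("well-formed:", root is not None)
print("attributes:", sorted(attrs.items()))
```

[A validating parser given the same document would have to report it as invalid, which is exactly the difference the application needs to be told about.]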
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)