Re: WF, V, and MSXML
Peter Murray-Rust
Peter at ursus.demon.co.uk
Mon Jun 9 11:30:07 BST 1997
In message <199706082339.QAA08654 at bolt.sonic.net> Terry Allen writes:
>
> Peter Murray-Rust replying to me to him etc.
[... and hoping the WG/ERB are reading this ...]
> [Terry:]
[...]
>
> Right. That's why the IETF assigns such importance to running code.
Good point. That is why XML-DEV is important and why we need people to
create prototypes at this stage. [Most XML-related software and documents
come into this category because the problems we are encountering may have
implications for the language.]
[...]
> |
> | I think this is more a question of terminology. NXP (Norbert Mikula) is a
> | 'validating parser', but the validation can be switched off. This is a
> | client-side decision. So with NXP 'palmy' could be either invalid or WF
> | according to the reader's wishes
>
> Agreed, but from the viewpoint of the document preparer, it is both. MSXML
> needs the switch NXP has. I think the behavior is unintentional, but
> I would be alarmed at a processor/parser (they mean the same to me in
> this context) that attempted to parse for validity, and if it found
> an error, silently switched to WF-parse mode.
I'd agree with this analysis, and haven't been silent on the issue. IMO it
is more important for the WG/ERB to address *this* problem than some of the
proposed extensions. The concept of WFness is NEW!! It is more subtle than
people realise. A fundamental problem is that there is no clear internal
flag in the document stating what the validity/WFness of the current document
is, is meant to be, was, etc. As Terry says, it's particularly likely that
a WF document could (possibly erroneously) mutate into a valid one. I am
sure that any confusion about MSXML is not intentional and is due to the issue
not being prominent in the spec.
<PROPOSAL>
All parsers (i.e. tools that take XML documents and apply the criteria in
XML-LANG only) should state their attitude and behaviour to WFness and validity.
</PROPOSAL>
The possible options include at least:
- nsgmls-like. Full validation is the only option. Any non-valid
document is flagged and appropriate error messages are issued or
error action is initiated.
- Lark-like (at least V0.88 - I think there is another coming). No
validation can be attempted. Any 'output' can only be WF or
in error. NOTE: what does Lark do with the internal subset?
- NXP-like. Validation can be switched on or off by the 'client'.
How this is transmitted to the application is application
dependent at present.
- MSXML-like. Undocumented at present. Possibly [though Terry and I
hope not] validating by default, and changing to WF if this
fails.
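
As a rough illustration of the options above, here is a minimal Python sketch
of a parser front end that states its attitude explicitly and never falls back
from validation to WF on its own. Everything named here (Policy, Status, check,
validator, want_validation, is_well_formed) is hypothetical and comes from none
of the tools mentioned. The Python standard library has no validating parser,
so "validator" is a stand-in callable supplied by the caller; only the
well-formedness check uses a real parser (expat).

```python
import xml.parsers.expat
from enum import Enum


class Policy(Enum):
    VALIDATE_ONLY = "nsgmls-like"   # full validation is the only option
    WF_ONLY = "Lark-like"           # validation is never attempted
    CLIENT_CHOICE = "NXP-like"      # the client switches validation on or off


class Status(Enum):
    VALID = "valid"
    WELL_FORMED = "well-formed (validity not checked)"
    NOT_VALID = "well-formed but not valid"
    NOT_WELL_FORMED = "not well-formed"


def is_well_formed(text):
    """Return True if expat can parse the whole document without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


def check(text, policy, validator=None, want_validation=True):
    """Report the document's status under an explicitly stated policy.

    validator is a hypothetical callable returning True for a valid
    document. A failed validation is reported as NOT_VALID; it is never
    silently downgraded to WELL_FORMED.
    """
    if not is_well_formed(text):
        return Status.NOT_WELL_FORMED
    if policy is Policy.WF_ONLY:
        validating = False
    elif policy is Policy.VALIDATE_ONLY:
        validating = True
    else:
        validating = want_validation
    if not validating:
        return Status.WELL_FORMED
    if validator is None:
        raise ValueError("validation requested but no validator supplied")
    return Status.VALID if validator(text) else Status.NOT_VALID
```

The MSXML-like case is deliberately absent: the point of the sketch is that a
document that fails validation comes back as NOT_VALID, so the application can
always tell which check was actually applied.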
>
[...]
> Point taken; but the spec is not entirely clean on this point. If the
> application requests the processor to process, the processor must
> inform the application of certain things. And it is hard to get
> around
>
> "*An XML processor which does not read the DTD must always pass all
> characters in a document that are not markup through to the application.*"
Ah! I had assumed the internal subset was 'markup' - you see it as part
of the document. We need a ruling on this :-). Obviously if the DTD appears
***in the processed document***, then it could be interpreted as having been
read and used for validation.
[...]
>
> | what is the implied structure of the document in:
> |
> | <!DOCTYPE FOO [
> | <!ATTLIST FOO XML-LINK CDATA #FIXED "SIMPLE">
> | ]>
> | <FOO HREF="bar"/>
> |
> | Can we assume that FOO (which has no Element declaration) has an ATTLIST as
> | given, and that therefore it inherits the SHOW and ACTUATE attributes?
> | IOW *must* a parser decorate all matching elements with the ATTLISTS in the
> | internal subset?
>
> No, not per XMLlang alone. FOO's only declared attribute has as its name
My mistake. I shouldn't have brought the others in.
> the unreserved string "XML-LINK" although it uses an undeclared attribute
> name "HREF". So it is WF but not valid.
Agreed.
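
For anyone who wants to see what one non-validating parser does with the FOO
example, here is a small Python sketch using the expat binding in the standard
library. It shows only how expat behaves, not what XML-LANG requires. The
helper name attributes_seen and its specified_only parameter are made up for
this illustration; ParserCreate, StartElementHandler and specified_attributes
are part of xml.parsers.expat.

```python
import xml.parsers.expat

# The document from the example above: FOO has an ATTLIST in the
# internal subset but no element declaration.
DOC = """<!DOCTYPE FOO [
<!ATTLIST FOO XML-LINK CDATA #FIXED "SIMPLE">
]>
<FOO HREF="bar"/>"""


def attributes_seen(text, specified_only=False):
    """Return {element name: attributes} as reported by expat.

    With specified_only=True, expat reports only the attributes written
    in the document instance and omits those supplied from declarations.
    """
    seen = {}
    parser = xml.parsers.expat.ParserCreate()
    parser.specified_attributes = specified_only

    def start(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.Parse(text, True)
    return seen


if __name__ == "__main__":
    print(attributes_seen(DOC))
    print(attributes_seen(DOC, specified_only=True))
```

The document parses without error even though HREF is undeclared, which matches
the "WF but not valid" reading. Comparing the two calls shows whether the
XML-LINK value from the internal subset is attached to FOO by the parser or
left for the application to work out.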
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)