Conformance in XML processors
peter at ursus.demon.co.uk
Tue Jan 20 22:27:05 GMT 1998
Thanks Paul - you have put it very clearly and it sounds exactly what I was
At 14:19 18/01/98 -0500, Paul Prescod wrote:
>Peter Murray-Rust wrote:
>> Exactly. And I hope that the community is able to develop them. [I am sure
>> all the functionality is present already in SP, but I confess that as a
>> novice to SGML I didn't find it easy to find my way around when I first
>> looked at it. Treat that as a reflection on me.]
>I believe that James Clark has already done most of this work in his XML
>tokenizer (which is distinct from SP).
Better and better.
>I think that we have different ideas about what normalized will look
>like. This is what you are thinking of:
>> <?xml version="1.0"?> <!-- magic incantation -->
>> <MOL NAME="water" xml:lang="EN">
>> <ATOM ID="O1">O</ATOM>
>> <ATOM ID="H2">H</ATOM>
>> <ATOM ID="H3">H</ATOM>
>> <BOND>O1 H2</BOND>
>> <BOND>O1 H3</BOND>
>> </MOL>
>This is what I am thinking of:
><MOL NAME="water" xml:lang="EN">
I am happier with yours :-) [You seem to have newlines in some tags and not
others, is this intended?]
>In other words, I am thinking about a subset of XML so simple that it is
>trivial to parse and so annoying that no human being would ever want to
>type it directly except for testing out their "reader". I would
Exactly. Most of the stuff I am concerned about will be generated by tools.
>explicitly disallow the magical incantation to discourage people from
>piping in ordinary XML documents (and thus from thinking that this
>reader is making any attempt to be an XML processor).
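To make the idea of a deliberately limited "reader" concrete, here is a minimal sketch in Python. The exact subset is an assumption, since the thread does not pin it down: start tags with double-quoted attributes, end tags, and plain text only. The name read_simple and the event tuples are hypothetical. Anything outside the subset, including the XML declaration, is rejected rather than tolerated.

```python
import re

# Assumed subset (not specified in the thread): start tags with
# double-quoted attributes, end tags, and plain text. No declarations,
# comments, CDATA sections, or entity references.
NAME = r'[A-Za-z_][\w:.-]*'
TOKEN = re.compile(
    r'<(?P<end>/)?(?P<name>' + NAME + r')'
    r'(?P<attrs>(?:\s+' + NAME + r'="[^"<&]*")*)\s*>'
    r'|(?P<text>[^<&]+)'
)
ATTR = re.compile(r'(' + NAME + r')="([^"<&]*)"')


def read_simple(source):
    """Yield ('start', name, attrs), ('text', data) or ('end', name) events."""
    stack = []  # names of the elements that are currently open
    pos = 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            # <?xml ...?>, <!-- -->, <![CDATA[ and &entity; all end up here.
            raise ValueError(
                f"not in the simple subset at offset {pos}: {source[pos:pos + 20]!r}"
            )
        pos = m.end()
        text = m.group('text')
        if text is not None:
            # Whitespace-only runs between tags are dropped (a simplification).
            if text.strip():
                yield ('text', text)
            continue
        name = m.group('name')
        if m.group('end'):
            if m.group('attrs') or not stack or stack.pop() != name:
                raise ValueError(f"unexpected end tag </{name}>")
            yield ('end', name)
        else:
            stack.append(name)
            yield ('start', name, dict(ATTR.findall(m.group('attrs'))))
    if stack:
        raise ValueError(f"unclosed element <{stack[-1]}>")
```

Feeding it an ordinary XML document that begins with the declaration raises ValueError at offset 0, which is the point: the reader makes no attempt to look like an XML processor.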
>> Essentially such a file is a subset of the ESIS information (no attribute
>> typing, no entities, no notation) and uses no CDATA or entity references.
>> It is my contention that there will be many people (some will be DPHs) who
>> will be quite happy to create XML files no more sophisticated than this and
>> will want *tools* to *operate on* them.
>Right, I don't think that these tools should be constructed except as a
>stopgap. There is no good reason that these tools should not support all
>of XML. When people write these simple XML documents and find that their
>tools will not support more, they will inevitably get confused (just as
>most people do with C++) about exactly what XML *is*.
The only reason - and it's probably not "good" - is that the effort to
create or install a solution is too great for the problem at hand. And it
costs money and time.
>I proposed a processor in Fortran that only accepts the output of a
>normalizer, but I do not think that it should be billed as an XML
>processor, any more than a Fortran program that accepts ESIS would be
>called an SGML parser. The documentation should say: "This Fortran
>program accepts the output of xmlnorm" and leave it at that. In other
>words, xmlnorm becomes an implicit component in the system.
Yes - I like this. Is your use of 'xmlnorm' fictitious, or is such a beast
emerging from the current tools?
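For illustration only, a normalizer of this kind might look like the following Python sketch. It is not the 'xmlnorm' mentioned above, whose existence and output format are not established in this thread. It assumes a real XML parser (here the standard library's xml.etree.ElementTree) does the hard work, and it re-emits the document in an assumed one-item-per-line form with no declaration, comments, CDATA or entity references. The function name normalize is hypothetical, and namespaces are not handled.

```python
import xml.etree.ElementTree as ET


def _check(value):
    # The assumed output form has no entity references, so characters that
    # would need escaping cannot be represented; this sketch refuses them.
    if any(c in value for c in '<&"'):
        raise ValueError(f"cannot represent {value!r} without escaping")
    return value


def normalize(xml_text):
    """Parse full XML and re-emit it as one tag or text run per line."""
    lines = []

    def emit_text(text):
        # Leading and trailing whitespace is discarded (a simplification).
        if text and text.strip():
            lines.append(_check(text.strip()))

    def emit(elem):
        attrs = ''.join(
            f' {key}="{_check(val)}"' for key, val in sorted(elem.attrib.items())
        )
        lines.append(f'<{elem.tag}{attrs}>')
        emit_text(elem.text)
        for child in elem:
            emit(child)
            emit_text(child.tail)  # text that follows the child element
        lines.append(f'</{elem.tag}>')

    # The parser drops the declaration and comments and unwraps CDATA.
    emit(ET.fromstring(xml_text))
    return '\n'.join(lines) + '\n'
```

A downstream program, in Fortran or anything else, then only has to cope with the lines this step writes, and its documentation can say so.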
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)