ModSAX: Proposed Core Features (heretical?)
Bill la Forge
b.laforge at jxml.com
Mon Mar 15 19:15:00 GMT 1999
From: Simon St.Laurent <simonstl at simonstl.com>
>Basically, he wanted the ability to check the document structure without
>the internal subset, so he could rely on the validation process to make
>certain that documents conformed to an 'official' DTD, without extra junk
>some twerpy developer put in the internal subset to make his own version
>valid if not official.
But even given that an 'official' DTD was used, there is a question as to
WHICH official DTD was used. I see several problems with relying on
an unaugmented SAX parser for validation of data being input to an application:
1. DTD-driven validation is rarely complete enough--there will always be
something critical that the application needs to validate. Fortunately,
SAX supports parse exceptions in all the right places, with full information
available on where in the document the error occurred.
2. If the application is going to depend on the parser for some of the validation
(a real boon to the application programmer), then the application needs
to be informed by the parser as to which DTD or other schema was used.
Having the document specify this information in a PI or by some other means is
not sufficient unless that information is somehow compared to the DTD
actually used by the parser.
3. As mentioned by Simon, allowing an author to change a DTD makes no
sense at all in terms of providing a validation service for the application.
4. When filters are placed between the parser and the application, validation is
best done in the last filter, rather than prior to the transformations performed
by those filters. Validation by the parser in this case may produce clearer
error messages, but validation of the transformed data provides the application
with a greater assurance that its data will be in the expected form.
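
To make points 1 and 4 concrete, here is a minimal sketch of validation done in the last filter of a chain, written against Python's standard xml.sax module rather than any particular SAX parser. The element names (qty, quantity), the rename step, and the "positive integer" rule are all hypothetical, chosen only for illustration. The validating filter sits after the transformation and reports failures as a SAX parse exception carrying the document locator.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase


class RenameFilter(XMLFilterBase):
    """Hypothetical transformation step: renames <qty> to <quantity>."""

    def startElement(self, name, attrs):
        super().startElement("quantity" if name == "qty" else name, attrs)

    def endElement(self, name):
        super().endElement("quantity" if name == "qty" else name)


class QuantityValidator(XMLFilterBase):
    """Last filter: enforces an application rule a DTD cannot express."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._locator = None
        self._text = None  # collects character data while inside <quantity>

    def setDocumentLocator(self, locator):
        # Keep the locator so errors can say where in the document they arose.
        self._locator = locator
        super().setDocumentLocator(locator)

    def startElement(self, name, attrs):
        if name == "quantity":
            self._text = []
        super().startElement(name, attrs)

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)
        super().characters(content)

    def endElement(self, name):
        if name == "quantity":
            value = "".join(self._text).strip()
            self._text = None
            try:
                ok = int(value) > 0
            except ValueError:
                ok = False
            if not ok:
                raise xml.sax.SAXParseException(
                    "quantity must be a positive integer, got %r" % value,
                    None, self._locator)
        super().endElement(name)


def validate(xml_bytes):
    """Parse through the chain: parser -> RenameFilter -> QuantityValidator."""
    reader = QuantityValidator(RenameFilter(xml.sax.make_parser()))
    reader.parse(io.BytesIO(xml_bytes))
```

Because the validator sees the events after the rename, it checks the data in the form the application will actually receive, while the locator still points back into the original document.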
My belief here is that it is perhaps best to abandon validation by the
parser-kernel and instead use filters which support the validation needs of the
application. Errors detected this way may stem from a poorly constructed document,
but may also stem from constraints imposed by a particular application. This
of course raises the question of how the response to these two types of
errors should differ. I can understand the desire to make such a distinction, but
I have not yet come to appreciate the need for it.
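
If one did want to keep the two kinds of error apart, one possible approach (a sketch only, again using Python's standard xml.sax module) is to give application-imposed failures their own exception type and let the caller decide how to respond. ConstraintError, RootNameCheck, classify, and the expected root name "order" are hypothetical names introduced for this example.

```python
import xml.sax


class ConstraintError(xml.sax.SAXParseException):
    """Hypothetical marker for failures of application-imposed constraints."""


class RootNameCheck(xml.sax.ContentHandler):
    """Rejects documents whose root element is not the one the app expects."""

    def __init__(self, expected_root):
        super().__init__()
        self._expected = expected_root
        self._locator = None
        self._depth = 0

    def setDocumentLocator(self, locator):
        self._locator = locator

    def startElement(self, name, attrs):
        if self._depth == 0 and name != self._expected:
            raise ConstraintError(
                "expected root <%s>, found <%s>" % (self._expected, name),
                None, self._locator)
        self._depth += 1

    def endElement(self, name):
        self._depth -= 1


def classify(xml_bytes, expected_root="order"):
    """Return a short label saying which kind of error occurred, if any."""
    try:
        xml.sax.parseString(xml_bytes, RootNameCheck(expected_root))
    except ConstraintError as err:
        # Raised by the application's own check.
        return "application constraint: " + err.getMessage()
    except xml.sax.SAXParseException as err:
        # Raised by the parser, e.g. for a document that is not well-formed.
        return "document error: " + err.getMessage()
    return "ok"
```

Whether the two branches should really behave differently is the open question; the sketch only shows that the distinction is cheap to carry if it turns out to be needed.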
Bill