FW: Namespaces and DTDs
Marc.McDonald at Design-Intelligence.com
Sat Mar 13 02:11:47 GMT 1999
It's quite true that you can have XML that does not require validation
and that this is commonly done. One exception is attribute defaulting:
default values declared for element attributes in a DTD are only
available if the DTD is read, as has been mentioned in another reply.
You can construct a DOM without validation, but the next step ends up
being a procedural implementation of picking apart the DOM document
tree to construct whatever structure the application using DOM
requires to interpret the document.
I can parse:
<book title="tale of 2 cities">
<chapter>
<para>..</para>
</chapter>
<chapter>
...
</chapter>
...
</book>
without a DTD.
But if my application needs to get the information out of the DOM I
need to write code to:
Create a representation for Book consisting of a title and chapters
and get book from DOM
Create a representation for each Chapter and get Chapters from DOM
Create a representation for each paragraph in a chapter and get
paragraphs from DOM.
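A minimal sketch of the steps above, in Python with the standard
xml.dom.minidom module and no DTD involved. The Book and Chapter classes
and the text_of and book_from_dom helpers are hypothetical names made up
for illustration, and the paragraph text is placeholder data.

```python
from dataclasses import dataclass, field
from typing import List
from xml.dom.minidom import parseString


@dataclass
class Chapter:
    # Hypothetical application representation of a chapter.
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Book:
    # Hypothetical application representation of a book.
    title: str
    chapters: List[Chapter] = field(default_factory=list)


def text_of(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def book_from_dom(document):
    """Pick the DOM tree apart and rebuild it as Book/Chapter objects."""
    root = document.documentElement
    book = Book(title=root.getAttribute("title"))
    for chapter_el in root.getElementsByTagName("chapter"):
        chapter = Chapter()
        for para_el in chapter_el.getElementsByTagName("para"):
            chapter.paragraphs.append(text_of(para_el))
        book.chapters.append(chapter)
    return book


# Placeholder document shaped like the example above; parsed without a DTD.
SAMPLE = """<book title="tale of 2 cities">
<chapter><para>first</para></chapter>
<chapter><para>second</para><para>third</para></chapter>
</book>"""

book = book_from_dom(parseString(SAMPLE))
print(book.title, len(book.chapters))
```

Note that the element and attribute names ("book", "chapter", "para",
"title") are repeated by hand in the code, which is exactly the
structure a DTD would otherwise declare.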
Part of this is what is expressed in the DTD. Wouldn't it be better if
a system were created that used the DTD on the receiving end to create
the application representation instead of serializing it back into
elements and constructing a new tree?
Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com
----------
From: Didier PH Martin [SMTP:martind at netfolder.com]
Sent: Friday, March 12, 1999 5:20 PM
To: Marc McDonald; cbullard at hiwaay.net
Cc: xml-dev at ic.ac.uk
Subject: RE: FW: Namespaces and DTDs
Hi Marc,
<YourComment>
Actually there is another representation of the information in the DTD
that is present: the application that uses the document. Unfortunately
the representation is in C++, Java or some other language. This
introduces a synchronization problem between the two.
The DOM API for instance gives you access to the parsed document tree,
but a sizable amount of independent code must be written to
essentially parse the DOM tree into the form the application needs.
The result is that the structure is in 2 different forms, declarative
and procedural, which must be kept in sync.
</YourComment>
<Reply>
You are right, but I can construct a DOM without any validation. The
whole point here is: if I need validation at the receiving end, why not
use SGML, which is more elaborate and necessarily needs validation
(because tags can be omitted, i.e. OMITTAG)? If, however, we do not need
validation at the receiving end, then we are better off using XML,
which, because of its structure, can be parsed without validation, and
then a DOM can be created for procedural language consumption.
But you are right to say that from the serialized format I have to
construct a model (i.e. a structure) that interpreters can access. The
DOM is the XML way to do it and the grove is the SGML way (the DOM and
grove concepts are similar enough to reduce one to the other).
To become useful, the XML life cycle could be expressed like this:
a) XML format creation: we need a DTD, so that the editor can validate
the document or simply prevent me from creating an invalid document.
b) transport
c) receiving end: interpretation. The interpreter needs a parser. A
validating parser is not necessary with XML. It seems that we have
several kinds of parsers:
1- event driven
2- function call within a loop
3- DOM producer
d) The interpreter knows the semantics and does something.
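As a rough illustration of the three kinds of parsers listed in c),
here is a sketch using Python's standard library: xml.sax for the
event-driven style, xml.etree.ElementTree.XMLPullParser for the loop
style, and xml.dom.minidom as the DOM producer. The mapping of these
modules onto the three categories is my own assumption, the
ChapterCounter class is a hypothetical name, and the sample document is
placeholder data. None of the three needs a DTD.

```python
import xml.sax
from xml.dom import minidom
from xml.etree.ElementTree import XMLPullParser

# Placeholder document; there is no DTD anywhere.
SAMPLE = '<book title="t"><chapter><para>a</para></chapter><chapter/></book>'


# 1- event driven: the parser calls back into a handler object.
class ChapterCounter(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "chapter":
            self.count += 1


handler = ChapterCounter()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
sax_count = handler.count

# 2- function call within a loop: the application pulls events itself.
pull = XMLPullParser(events=("start",))
pull.feed(SAMPLE)
pull.close()
pull_count = sum(1 for _event, el in pull.read_events() if el.tag == "chapter")

# 3- DOM producer: the whole tree is built first, then queried.
dom_count = len(minidom.parseString(SAMPLE).getElementsByTagName("chapter"))

print(sax_count, pull_count, dom_count)
```

In all three cases the interpreter in d) still has to know what a
"chapter" means; the parser only delivers the syntax.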
In fact, XML rules do not convey semantics, only syntax. XPointer and
XLink are domain-specific languages that add a semantic layer to XML;
XHTML does as well. In fact, all these concepts already existed in the
SGML world. What we gained with XML compared to SGML is simpler parsing
rules, so simple that validation is no longer necessary to do a
complete parsing operation. The SGML syntax is trickier because you
need to tell the parser that some markup has no end tag; thus the need
for a DTD, whose main function is to tell the parser some parsing
rules, such as where a tag begins and ends. So, because of the
"well-formed" constraint, we gained that parsers now do not need a DTD
to accomplish their task: the rule is clear on how markup begins and
ends.
My conclusion:
What we gained with XML is the fact that a parser does not need to do
validation. Otherwise it is only changing the extension of an SGML
document to XML, going from "mydocument.sgml" to "mydocument.xml"
without really changing anything except some minor modifications in
the DTD declaration. That may be good for marketing reasons but surely
not for technical reasons.
</Reply>
Regards
Didier PH Martin
mailto:martind at netfolder.com
http://www.netfolder.com
xml-dev: A list for W3C XML Developers. To post,
mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on
CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)