Schemas and Other Crucial XML Questions

David Megginson david at megginson.com
Mon Aug 10 15:44:43 BST 1998


Sam Gentile writes:

 > We mean just the XML data coming into the parser (coming over the
 > wire). I guess we could call it an XML file.

OK, now I understand what you mean -- according to the spec, you're
referring to an 'XML document', though I know that the term will sound
strange to database-oriented people (XML takes a very
document-centric view of things).

 > Sam Gentile writes:

 > > We have a spec called "XML-Data W3C Note 05 Jan 1998", which
 > > discusses schemas. It is not clear from the document what a
 > > schema is used for or what its purpose is. Is it for designing
 > > the XML buffer only or is it read by the parser? Is it an
 > > extension to XML? Are they even necessary in basic XML?

XML-Data is a note that was submitted to the W3C by Microsoft and a
couple of partners -- it has no official status (a W3C "Note" means
roughly "here's a neat idea from one of our members").

XML 1.0 DTDs and proposed replacements/enhancements such as
Microsoft's XML-Data and XML-Dev's XSchema perform three distinct
roles:

1. Provide a schema for validating the *logical structure*
   (element/attribute/data) of an XML document; as a side effect,
   structural schemas can also provide enough information to
   control a guided XML authoring tool.

2. Declare the entities (internal strings or external objects) that
   make up the *physical structure* of an XML document.

3. Provide default logical content for an XML document (such as
   default values for attributes, though XML-Data goes further).

Some people have argued -- quite convincingly, I think -- that these
roles should be kept separate: they are mixed together right now for
historical compatibility with ISO 8879:1986 DTDs.
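
As a rough illustration of how a single DTD mixes the three roles,
here is a minimal Python sketch using the standard library's expat
bindings. The memo document, the "company" entity, and the "status"
attribute are all invented for the example. Expat is a non-validating
parser, so it reads the element declaration (role 1) without enforcing
it, yet it still expands the declared entity (role 2) and supplies the
attribute default (role 3).

```python
import xml.parsers.expat

# Hypothetical document whose internal DTD subset plays all three roles:
#   <!ELEMENT ...>  role 1, logical structure (not enforced by expat)
#   <!ENTITY ...>   role 2, physical structure (an internal entity)
#   <!ATTLIST ...>  role 3, default logical content (an attribute default)
DOC = """<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!ENTITY company "Example Corp">
  <!ATTLIST memo status CDATA "draft">
]>
<memo>From &company;</memo>"""


def parse_memo(doc):
    """Return (attributes, text) of the memo element as the parser reports them."""
    seen = {"attrs": None, "text": []}

    def on_start(name, attrs):
        if name == "memo":
            seen["attrs"] = attrs

    def on_chars(data):
        # Character data may arrive in several chunks, so collect them all.
        seen["text"].append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.CharacterDataHandler = on_chars
    parser.Parse(doc, True)
    return seen["attrs"], "".join(seen["text"])


attrs, text = parse_memo(DOC)
```

Neither the "status" attribute nor the company name appears literally
in the document instance; both come from the DTD, which is why a
parser cannot simply skip the declarations even when it does no
validation.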

It is important to note that the W3C's XML WG will soon begin work on
structural schemas and data typing, and that eventually a new W3C
standard will appear -- until then, the only W3C standard for
structural schemas is XML 1.0 DTDs, and XSchema or XML-Data should be
considered strictly experimental.

 > > Also, we have been hearing rumors of a "short" XML notation. Is
 > > there one?  We have a need to reduce the size of our buffers.

No, there is no such thing.  XML's parent, SGML, included extensive
facilities for markup minimisation and has suffered badly for it,
since SGML tools are far too difficult to write (there is still not a
single Java-based SGML parser, while there are probably more than a
dozen Java-based XML parsers).

There are, however, alternatives: for example, you could compile the
XML to a compact binary format for internal storage then decompile it
back to a verbose format for export -- there's no requirement to store
it internally as text.
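
A crude stand-in for that idea, sketched in Python: this uses
general-purpose compression (the standard zlib module) rather than a
purpose-built binary XML encoding, and the pack/unpack names and the
sample order data are made up for the example. The point is only that
the verbose text form can be recovered exactly on export.

```python
import zlib


def pack(xml_text):
    """Compress XML text into a compact byte string for internal storage."""
    return zlib.compress(xml_text.encode("utf-8"), 9)


def unpack(blob):
    """Recover the original verbose XML text for export."""
    return zlib.decompress(blob).decode("utf-8")


# Hypothetical, highly repetitive buffer of the kind markup tends to produce.
sample = (
    "<orders>"
    + "".join(
        f'<order id="{i}"><status>shipped</status></order>' for i in range(200)
    )
    + "</orders>"
)

blob = pack(sample)
restored = unpack(blob)
```

Because element and attribute names repeat so often, even generic
compression shrinks a buffer like this considerably; a dedicated
tokenised format could be quicker to read back, at the cost of
writing and maintaining your own encoder.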


All the best,


David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/
