Internal subset equivalent in new schema proposals?

Ketil Z Malde ketil at
Fri Nov 27 08:50:32 GMT 1998

<david at> writes:

> XML is a metalanguage for defining markup languages: in the markup
> languages that you define with it, you can do any sort of data typing
> you want:

[snip object-oriented XML]

> Your complaint is not that XML does not support data typing, but that
> generic XML parsing tools do not enforce the kind of data typing that
> you need out of the box.  One of the reasons for this is that
> everyone's requirements are different; I might want

>   <name type="city">Kingston</name>

> where the type is enforced to be the name of a city that is currently
> in my database.  Someone else might want

>   <subject type="LC">BS</subject>

> where the contents must be a Library of Congress subject.

I don't know the number of LC subjects, but since this information is
static, I assume you could put it in an attribute where only LC
subjects are legal values.
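
To sketch what I mean: in a DTD this would be an enumerated attribute
type, something like <!ATTLIST subject code (BS|BT|BX) #REQUIRED>, and
a validating parser would reject any other value.  The same check done
by hand might look roughly like the Python below.  The attribute name
"code" and the three-entry list are made up for illustration; they are
not the real set of LC subjects.

```python
import xml.etree.ElementTree as ET

# Hypothetical, deliberately tiny stand-in for the real list of LC codes.
ALLOWED_SUBJECTS = {"BS", "BT", "BX"}


def invalid_subjects(xml_text):
    """Return the 'code' values of <subject> elements that are not allowed.

    A missing 'code' attribute shows up as None in the result.
    """
    root = ET.fromstring(xml_text)
    return [
        el.get("code")
        for el in root.iter("subject")
        if el.get("code") not in ALLOWED_SUBJECTS
    ]
```

An empty list means every <subject> in the document carried a legal
code.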

What could be useful, and relatively simple, is a restriction on the
*form* of the data, e.g. forcing <name type="city"> to contain only
letters and start with a capital, or LC subjects to be two upper-case
letters (if that's what they are).  Phone numbers, dates, sort keys:
there are many cases where it would be helpful to have the parser
catch malformed values, I think.
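
As a rough illustration of such a form check, here is a small Python
sketch that ties each type name to a regular expression.  The table
FORMS, the function name, and the two patterns are my own assumptions,
taken from the examples above; a real application would pick its own.

```python
import re

# Assumed patterns; real requirements differ from application to application.
FORMS = {
    "city": re.compile(r"[A-Z][A-Za-z]*"),  # letters only, leading capital
    "LC": re.compile(r"[A-Z]{2}"),          # exactly two upper-case letters
}


def has_valid_form(type_name, text):
    """Return True if text matches the form registered for type_name.

    Types with no registered form are accepted unchanged.
    """
    pattern = FORMS.get(type_name)
    return pattern is None or pattern.fullmatch(text) is not None
```

With these patterns, has_valid_form("city", "Kingston") is accepted and
has_valid_form("LC", "B5") is rejected.  Note that this only checks the
shape of the value, not whether Kingston is a city anyone has heard of.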

You could equally well embed this in your tools, and I guess you have
to draw a more or less arbitrary line somewhere between information
that is processing-independent and information that is not.

If I haven't seen further, it is by standing in the footprints of giants

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as:
To (un)subscribe, mailto:majordomo at the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at
