A little wish for short end tags (Was: RE: SDD bogus)

Steven DeRose sjd at eps.inso.com
Fri May 8 23:58:46 BST 1998


At 10:38 PM 05/08/98 +0200, Jarle Stabell wrote:
>2. The method <MethodName>M1</> of the fantastic <ClassName>Class1</> can 
>be used in situation <Situation>X</>.
>
>I think variant 2 is faster to read than variant 1, and you don't have to 
>check the end-tags for misspellings.

True, you don't have to check them; but the often-forgotten corollary is
that you also *can't* check the end-tag for misspellings if you go that route.

So if the data is erroneous you are far less likely to detect it *at all*,
making for truly nasty debugging. This is an ancient information-theoretic
tradeoff: you can always save space, but the more you save, the less chance
you have of detecting errors. This is because when you reduce redundancy,
you increase the % of all possible bit sequences that are syntactically
correct.
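
To make that concrete, here is a minimal sketch in Python of a toy tag
matcher (not a real XML parser). The function name `check`, the
`allow_short_end_tags` flag and the `<Para>` element are hypothetical, chosen
only for illustration. It shows the same misnesting slip being reported when
end tags carry names and passing silently when they are written as `</>`.

```python
import re

# Matches <Name>, </Name>, and the short end tag </> (toy grammar, no attributes).
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)?>")

def check(text, allow_short_end_tags=False):
    """Return a list of error messages found while matching start and end tags."""
    errors = []
    stack = []  # names of currently open elements
    for m in TAG.finditer(text):
        closing = m.group(1) == "/"
        name = m.group(2)
        if not closing:
            if name is None:
                errors.append("empty start tag '<>'")
            else:
                stack.append(name)
        elif not stack:
            errors.append("end tag with nothing open")
        elif name is None:
            # Short end tag: closes whatever is open, so there is nothing to compare.
            if not allow_short_end_tags:
                errors.append("short end tag '</>' not allowed")
            stack.pop()
        else:
            expected = stack.pop()
            if name != expected:
                errors.append(f"expected </{expected}>, found </{name}>")
    errors.extend(f"unclosed <{name}>" for name in stack)
    return errors

# The same editing slip (end tags in the wrong order), written both ways.
full_bad = "<Para>The method <MethodName>M1</Para> is handy.</MethodName>"
short_bad = "<Para>The method <MethodName>M1</> is handy.</>"

full_errors = check(full_bad)
short_errors = check(short_bad, allow_short_end_tags=True)
print(full_errors)   # two mismatch messages
print(short_errors)  # empty list: the slip is syntactically invisible
```

With named end tags the redundant name gives the checker something to
compare against; with `</>` every balanced sequence is accepted, so the
mistake survives until someone notices the wrong text is marked up.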

For example, imagine trying to communicate in a noisy room if every
possible sequence of sounds was a legitimate English word. Or imagine
programming in a language where every possible byte sequence is a
syntactically correct program (APL and raw machine code are the only
approximations I can think of to that -- guess why).

>
>The argument that compressing reduces/eliminates the size advantage of 
>documents with empty end tags often doesn't apply, the document will often 
>be stored uncompressed on users hard-disks, in databases and in memory.

Sorry, but with Win98 rumored to demand 64MB of RAM just to run and with
Moore's Law applying to memory prices, I can't muster much enthusiasm for
an argument that it is too costly to spend bytes on markup. If you had to
put *ten* full tags on every element you'd hardly ever notice any impact
except on a 747 manual, and anything that big can't be handled practically
in raw unparsed form anyway. I did a lot of statistics on this a few years
ago: a fully-marked-up file with no minimization is still wayyyyy smaller
than the equivalent word processor file in typical systems, so what's the
big deal?
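
As a rough illustration of the scale involved (not the statistics mentioned
above), the sketch below measures how many characters short end tags save on
the quoted example sentence. The `full` string is an assumption: it is the
same sentence with full end tags written out, since variant 1 is not quoted
in this message.

```python
# Variant 2 as quoted above, with short end tags.
short = ("The method <MethodName>M1</> of the fantastic <ClassName>Class1</> can "
         "be used in situation <Situation>X</>.")

# Assumed variant 1: identical text, but every end tag repeats the element name.
full = ("The method <MethodName>M1</MethodName> of the fantastic "
        "<ClassName>Class1</ClassName> can be used in situation "
        "<Situation>X</Situation>.")

# The saving is exactly the characters of the three omitted element names.
saved = len(full) - len(short)
percent = 100.0 * saved / len(full)
print(f"full: {len(full)} chars, short: {len(short)} chars, "
      f"saved: {saved} ({percent:.1f}%)")
```

The saving per element is just the length of its name, so it stays small
unless element names are long and their content is very short, as in this
deliberately tag-heavy sentence.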

I agree it would be handy when typing XML by hand or reading it raw. But it
is not without adverse consequences too. I'd rather see better editing
tools so I don't even have to know about such details.


                     Steven J. DeRose, Ph.D.

Visiting Chief Scientist, STG   |    Chief Scientist
Adjunct Associate Professor     |    Inso Corp. EPS
Steven_DeRose at brown.edu         |    sjd at eps.inso.com
401-863-3690                    |    401-752-4438

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)