Whitespace

Sean Mc Grath digitome at iol.ie
Tue Aug 26 22:07:06 BST 1997


>At 05:45 PM 26/08/97 +0100, Sean Mc Grath wrote:
>>It is easy to see what has happened here. The s/w developers have
>>a pattern for matching AREA elements that does not countenance the presence
>>of a CRLF.

[Tim Bray]
>Gimme a break; the software developers in this case have screwed up;
>there is a technical term to describe this behavior: "wrong".  There may
>in fact be productive things to be said about particular application
>profiles for whitespace handling, but this example is a complete
>red herring. 
>

I presented this "red herring" because it was *real*. I could have
contrived a more realistic one :-) This is an example of a *real*
programmer screwing up in a real application.

I am interested in avoiding screwups. Whitespace (WS) is a "happy hunting
ground" for screwups by us normal programmers, who make mistakes day in,
day out.

At least I think it is. Perhaps (hopefully) I'm wrong.

I doubt I will get this right, but I will try to formulate the programming
problem as I see it.

Here goes:-

XML processing applications that read and write XML have to reproduce
white space faithfully to avoid data loss. In the course of XML processing,
actions will regularly be triggered by context, e.g. "element X within
element Y", "first data content chunk below element X", etc.

Take a really simple context, "X followed by Y". In order to reproduce WS
faithfully on output, the simple pattern "XY" must be transformed into
(in rusty Perl)

"(w*)X(w*)Y(w*)"

Where "w" represents the pattern for White Space.
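
To make the pattern concrete, here is a minimal sketch in Python rather
than Perl. Everything named in it is an assumption for illustration only:
the empty-element tags <X/> and <Y/> stand in for "X" and "Y", "w" is
spelled as \s, and swap_preserving_ws is a hypothetical helper. It shows
the naive "XY" pattern missing a CRLF-separated pair, and the expanded
pattern capturing each white space run so it can be written back unchanged.

```python
import re

# Hypothetical stand-ins for the post's "X" and "Y": two empty-element tags.
X = r"<X/>"
Y = r"<Y/>"

# The naive pattern "XY": it matches only when nothing separates the two.
NAIVE = re.compile(X + Y)

# "(w*)X(w*)Y(w*)" with w spelled as \s. Each white space run is captured
# so that it can be reproduced exactly on output.
PATTERN = re.compile(rf"(\s*)({X})(\s*)({Y})(\s*)")


def swap_preserving_ws(text: str) -> str:
    """Swap X and Y while leaving every white space run where it was."""
    return PATTERN.sub(lambda m: m[1] + m[4] + m[3] + m[2] + m[5], text)


# A CRLF plus indentation between X and Y, as in the AREA example.
sample = "<X/>\r\n  <Y/>\n"

naive_hit = NAIVE.search(sample)      # None: the naive pattern misses it
result = swap_preserving_ws(sample)   # CRLF and indentation are preserved
```

Even in this two-element case the replacement has to thread three white
space groups through by hand, which is the bookkeeping the question above
is worried about.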

As the state spaces get more complex (i.e. realistic) doesn't this problem
escalate?

Could someone out there who reckons this is easy kindly put
me out of my misery by showing how it can be best handled?



Sean Mc Grath

sean at digitome.com
Digitome Electronic Publishing
http://www.digitome.com


xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



