Scripting and XML

Rasmus Lerdorf rasmus at
Sun Oct 19 05:51:14 BST 1997

> XML at this point seems to be exceptionally well written for a model in which 
> the data is passive and gets processed by an outside application - the 
> parser/application combination.  It doesn't seem like it will work very well, 
> however, with a model that is rapidly growing more popular in the HTML world: 
> scripts included in the same document as the data.  While this blending of 
> data and processor is admittedly a little unusual, it is becoming standard 
> practice more and more often.  The W3C's Document Object Model proposals 
> explicitly include XML, leading me to plot out the development of programs 
> that take advantage of this powerful new tool.  (Or, at least it will be a 
> powerful new tool when they figure out what it should look like and someone 
> implements it.)
> The rather gigantic problem I'm having is that scripting languages, including 
> ECMAScript/JavaScript, use all kinds of markup characters.  In their context, 
> a < character just means "less than".  I suppose I can use 
> <![CDATA[...script...]]> inside the SCRIPT tags and hope that the vendors 
> implement this properly, but it would be a heck of a lot easier to be able to 
> declare <!ELEMENT SCRIPT CDATA>.  I never thought I'd complain about the SGML 
> goodies that got dropped to make XML intelligible to ordinary humans and 
> parsers, but it's happened. This seems like an easy thing to fix, and 
> something that would bring XML more in line with other W3C projects. 
> Any thoughts?

As an author of just such a language, this has been a concern of mine ever
since I first read the XML proposal.  I posted to this list last week
about this as well.

The language I wrote is called PHP/FI (see  It is
currently undergoing a complete rewrite, and making sure that I don't
clash with XML is a priority.

My solution, naive as it might be, is to hide my language inside a
PI tag. The XML syntax definition for this tag is:

      PI ::= '<?' Name S (Char* - (Char* '?>' Char*)) '?>'

This, to me, says that I don't have to worry about a single '>' nor a '<'
inside the tag.  It is only a '?>' that could cause me some problems.
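
To illustrate that reading of the PI rule, here is a minimal sketch in Python
using the standard library's expat bindings. The PI target "php", the helper
names collect_pis and safe_for_pi, and the sample document are assumptions
made up for this example; they do not come from the XML draft or from PHP
itself.

```python
import xml.parsers.expat


def collect_pis(document: str) -> list:
    """Parse `document` and return (target, data) for each processing instruction."""
    found = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this handler once per PI with the target name and the raw data.
    parser.ProcessingInstructionHandler = lambda target, data: found.append((target, data))
    parser.Parse(document, True)
    return found


def safe_for_pi(script: str) -> bool:
    """A script body can sit inside a PI only if it never contains '?>'."""
    return "?>" not in script


# Hypothetical document: the script uses both '<' and '>' inside the PI.
doc = '<page><?php if ($a < $b) { echo "<BR>"; } ?></page>'
pis = collect_pis(doc)

print(pis[0][0])                      # PI target
print("<" in pis[0][1])               # the '<' reached the handler untouched
print(safe_for_pi('echo "a > b";'))   # a lone '>' is fine
print(safe_for_pi('echo "?>";'))      # '?>' would end the PI early
```

The parser hands the script text to the application as-is, so the only
sequence the script author has to avoid (or break up) is '?>'.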

So, a typical bit of code would look like:

    $result=mysql("db","select passwd from users where id='$cookie'");
    if(mysql_result($result,"passwd")==crypt($input)) {
        echo "Welcome $id<BR>\n";
    }

Now, my language is a server-parsed language, much like Microsoft's ASP
and Netscape's LiveWire or server-side JavaScript.  That means that the
actual browsers out there will never see these tags.  However, living at
peace with XML is still important because of XML authoring tools.  I would
like people to be able to create XML with an XML authoring tool that
includes my PHP script tags.  

This is obviously a hack.  Just thought it might be informative for you to
hear the sort of nasty mutilations that you are going to encounter when
XML gets into the hands of ordinary developers who know next to nothing
about SGML/XML.  

If the XML spec could address this issue of embedding scripting languages
directly, and provide some guidelines for the developers of these
scripting languages, then it would certainly make life easier on everyone.
Like it or not, these scripting languages are here, and they are not going
to go away.  If it is clearly laid out how such a scripting language
should co-exist with XML, the number of future incompatibilities and
confusion might be reduced.

By the way, a quick look at Microsoft's ASP will reveal that they use 
<% ... %> tags.  How that is going to survive an XML parser, I have no
idea.
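
As a rough check of the difference, the following Python sketch feeds an
ASP-style block and a PI-style block to the standard library's expat parser.
The helper name is_well_formed, the PI target "php", and both sample documents
are assumptions invented for the example; real ASP or PHP pages would of
course contain more than this.

```python
import xml.parsers.expat


def is_well_formed(document: str) -> bool:
    """Return True if expat parses `document` without a well-formedness error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Hypothetical samples: the same kind of script, delimited two different ways.
asp_style = '<page><% if (a < b) { x = 1 } %></page>'
pi_style = '<page><?php if ($a < $b) { $x = 1; } ?></page>'

print(is_well_formed(asp_style))  # '<%' is not a legal start of markup
print(is_well_formed(pi_style))   # a PI may contain '<' and '>'
```

An XML parser treats '<%' as the start of broken markup and gives up, while
the PI form passes through and leaves the script for the application to
handle.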

