Non-XML documents to XML Converter?

Roger L. Costello costello at mitre.org
Tue May 18 12:51:45 BST 1999


Thanks for all the responses to my message.  I would like to clarify my
original posting and present some thoughts on how this might relate to
XSL.

The documents that I am trying to convert to XML are slash-delimited. A
double slash terminates a "set".  A set is comprised of "fields". 
Here's a simple example:

fruit/apple/red/macintosh//
person/Roger/Boston
/male/123-45-6789//

Here I show two "sets".  The second set extends over two lines.  The
first field in a set identifies the set type (it is the set
identifier); the remaining fields carry the data.
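
As a rough illustration of the format just described, here is a minimal
Python sketch that splits such text into sets and fields.  The function
name parse_sets is hypothetical, and the sketch assumes that line breaks
carry no meaning and that fields are never empty (an empty field would
be indistinguishable from the "//" terminator).

```python
def parse_sets(text):
    """Split slash-delimited text into sets, each a list of fields."""
    # Assumption: line breaks are insignificant, so join the lines first.
    flat = "".join(line.strip() for line in text.splitlines())
    sets = []
    for chunk in flat.split("//"):
        if chunk:  # skip the empty piece after the final terminator
            sets.append(chunk.split("/"))
    return sets


sample = "fruit/apple/red/macintosh//\nperson/Roger/Boston\n/male/123-45-6789//\n"
for fields in parse_sets(sample):
    # fields[0] is the set identifier; the rest are the data fields.
    print(fields[0], fields[1:])
```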

I would like to convert this into an XML document that looks like this:

<message>
    <message1 setid="fruit">
        <kind>apple</kind>
        <color>red</color>
        <type>macintosh</type>
    </message1>
    <message2 setid="person">
        <name>Roger</name>
        <city>Boston</city>
        <gender>male</gender>
        <ssn>123-45-6789</ssn>
    </message2>
</message> 

The particular syntax here is not really important.  The thing to note
is that for a generic transformation engine to work you need to 

(1) supply it a description of the format of the document to be
transformed.  For this example, such info might be "slash-delimited
fields, double-slash-terminated sets".

(2) supply it the transformation rules.  For example, 
         rule: match="fruit" {
               <message+count() setid="fruit">
                   <kind>field(2)</kind>
                   <color>field(3)</color>
                   <type>field(4)</type>
               </message+count()>
         }

(3) and of course you need to supply it the actual document to be
transformed.
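
The three inputs above can be sketched in Python as follows.  This is
only an illustration under stated assumptions, not an existing tool: the
RULES table, split_sets and transform are hypothetical names, the format
description is reduced to two separator parameters, and each rule is
simplified to a list of element names for fields 2 onward.

```python
import xml.etree.ElementTree as ET

# (2) Hypothetical rule table: set identifier -> element names for fields 2..n.
RULES = {
    "fruit": ["kind", "color", "type"],
    "person": ["name", "city", "gender", "ssn"],
}


def split_sets(text, field_sep="/", set_end="//"):
    """(1) Format description, reduced here to two separator strings."""
    # Assumption: line breaks are insignificant and fields are never empty.
    flat = "".join(line.strip() for line in text.splitlines())
    return [chunk.split(field_sep) for chunk in flat.split(set_end) if chunk]


def transform(text, rules):
    """(3) Apply the rules to the document and return a <message> element."""
    root = ET.Element("message")
    for count, fields in enumerate(split_sets(text), start=1):
        setid, values = fields[0], fields[1:]
        names = rules[setid]  # raises KeyError if no rule matches the set
        if len(names) != len(values):
            raise ValueError(f"set {setid!r}: expected {len(names)} fields")
        # Mirrors the "message+count()" naming used in the example rule.
        elem = ET.SubElement(root, f"message{count}", setid=setid)
        for name, value in zip(names, values):
            ET.SubElement(elem, name).text = value
    return root


sample = "fruit/apple/red/macintosh//\nperson/Roger/Boston\n/male/123-45-6789//\n"
print(ET.tostring(transform(sample, RULES), encoding="unicode"))
```

Swapping in a different split_sets (or different separator arguments)
is the "pluggable format description" idea; the rule table and the
tree-building step do not change.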

Interestingly, while driving in this morning I realized that this is
what an XSL processor does.  The only difference is that an XSL
processor has (1) hardcoded: the input is always XML, with <...> as the
delimiter.

I think that it would be interesting to make an XSL Processor more
generic such that you could "plug in" a format description document. 
Thus, the XSL Processor could transform not just XML documents, but any
kind of documents.  Comments?

In any case, I will check out the URLs of conversion tools that people
sent me.  Happy Tuesday!  /Roger

 
Robert C. Lyons wrote:
> 
> Roger wrote: "Anyone have a tool that converts a document that is formatted in a
> non-XML syntax into XML?"
> 
> Roger,
> 
> XML Convert might be able to convert your non-XML document into XML.
> XML Convert can convert a wide range of flat files into XML.
> It uses a flat file schema to parse and validate the flat file
> and convert it into an XML document.
> 
> You can download XML Convert for free at http://www.unidex.com/download.htm.
> 
> Best regards,
> 
> Bob
> 
> ------
> Bob Lyons
> EC Consultant
> Unidex Inc.
> 1-732-975-9877
> boblyons at unidex.com
> http://www.unidex.com/





