Word DOC to XML Converter

Tony Stewart tony.stewart at rivcom.com
Wed Jan 13 23:01:16 GMT 1999


Nikita:

Sorry, this trick doesn't quite work. Depending on the document you'll need
to do a bunch of manual cleanup or write a script to take care of it. (Among
other things, the SIZE attribute values are all unquoted.) OTOH "Save as
HTML" does get you a good way down the road and gives you something you can
work with. Whether the result is useful XML or not is another question.
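
As a rough illustration of the kind of cleanup script meant above, here is a minimal Python sketch. It assumes the only problem being fixed is unquoted attribute values (such as SIZE=2) and that no quoted attribute value contains a ">" character. The function names quote_attributes and is_well_formed are made up for this example, and real Word output will very likely need more cleanup than this.

```python
import re
import xml.etree.ElementTree as ET

# A start tag: "<", a name, then anything up to the next ">".
# Assumption: no quoted attribute value contains "<" or ">".
_TAG = re.compile(r'<[A-Za-z][^<>]*>')

# An attribute whose value is double-quoted, single-quoted, or bare.
# Only the bare (unquoted) form is captured in group 2.
_ATTR = re.compile(
    r'''(\s[A-Za-z_:][\w:.-]*)\s*=\s*(?:"[^"]*"|'[^']*'|([^\s"'<>]+))'''
)


def quote_attributes(markup: str) -> str:
    """Wrap unquoted attribute values in double quotes (hypothetical helper)."""

    def fix_attr(match: re.Match) -> str:
        if match.group(2) is None:
            # Value was already quoted: leave it untouched.
            return match.group(0)
        return f'{match.group(1)}="{match.group(2)}"'

    def fix_tag(match: re.Match) -> str:
        return _ATTR.sub(fix_attr, match.group(0))

    return _TAG.sub(fix_tag, markup)


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    raw = '<FONT SIZE=2 FACE="Arial">text</FONT>'
    fixed = quote_attributes(raw)
    print(fixed)
    print(is_well_formed(raw), is_well_formed(fixed))
```

Running the well-formedness check before and after the fix is a quick way to see how far a given document still is from something an XML parser will accept.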

Regards,

Tony
tony.stewart at rivcom.com <mailto:tony.stewart at rivcom.com>  

-----Original Message-----
From:	Ogievetsky, Nikita [mailto:nikita.ogievetsky at csfb.com]
Sent:	Wednesday, January 13, 1999 12:53 PM
To:	'xml-dev at ic.ac.uk'
Subject:	RE: Word DOC to XML Converter

>Andreas Berg wrote:
> I am searching for a converter from Word documents to XML. Unfortunately I
> have no time to wait for Office 2000..... Is there something like this
> available?

In MS Word, go to the <File>/<Save As> menu and select "Save as HTML document".
It will create a well-formed XML file: HTML with all elements having start
and end tags. Apply XSL if you want tag names or attributes changed.

(Just remember to exhume the <body> - sorry for bad joke).

Nikita Ogievetsky.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
