wv, Microsoft Word 95/97/2000 Import Library
Read INSTALL for install information, basically pretty standard
requirements for a modern unix program plus libwmf to convert
wmf files to something useful.
Theres now a small herd of programs based upon libwv.
1) wvHtml, this is what you are looking for, it converts
word 6,7.8,9 files to html, man wvHtml for its details
2) wvHtml --config wvLaTeX.xml attempt to convert word
documents to latex
3) wvSummary, displays the summary information stream of all ole2 files, i.e excel, powerpoint, visio, access etc etc etc,
ive sent wvSummary as a patch to the file program, so that utility
should in the future have this ability to know what windows application
created the ole2 document.
4) wvSimpleCLX, helps distinguish between fast and full saved
5) wvVersion, outputs the version of the word format which the document
is stored as.
6) There are helper applications in the helper-scripts dir
that allow both lynx and netscape to view word 8 docs easily
7) The gateway dir contains the cgi script that allows the
online demo page that some people wanted for themselves.
The webpage with the online demo facility is at http://www.wvWare.com
8) libwv can be used as a library by third party programs, abiword
uses it as its word importer, and kword may use it in the future.
wvHtml is a sample application for the use of wv, as is abiword
itself. The library (will be some day) documented in the
9) the config file format is (beginning to be) documented in the
Documentation dir, so it may be possible to achieve the conversion
that you desire without writing your own program. It also allows
you to customize the html and latex output to your own tastes.
(Projects for someone to try out with libwv include, could be
a) wvMacroRemove, deletes macros out of word files, for the security
b) wvFastToFull, converts fastsaved docs to full saved, shrinking their
size usually by quite a lot, and also making them safe for security
c) wvCrackProtection, take the wvDecrypt97 code and make a cracker
that uses the rc4 algorithm and other known bit of information to
crack word documents, ala http://www.crak.com