HOWTO Read .doc Files in Console

We all receive .doc files attached to our emails. Sometimes we just have to read them, no matter how much we despise documents written with MS Word, or people who use MS Word.

There are several ways to handle this. The easiest is probably to open the document in OpenOffice.org Writer. AbiWord is a lighter option, and if you are using KDE, you could try KWord. All of these word processors import .doc files reasonably well.



Very often, all I need from a document is its text. I do not care which font was used or how well or poorly the document was formatted. In that case, I usually do not bother to open a word processor just to read it. All I need is a command line application, antiword. Antiword has been ported to a wide selection of operating systems, ranging from DOS to Amiga, so it is no surprise that it is also available for Linux. Use your distribution's package manager to install it.

Antiword can convert Word documents to plain text, PostScript, PDF and XML/DocBook. My needs are more modest: plain text is all I need.
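
For the other output formats, a small wrapper can save looking up the options each time. The following is only a sketch: the function names antiword_opts and doc_convert and the PAPER variable are my own inventions, while -t, -p, -a and -x db are the options listed in the antiword man page. PostScript and PDF output need a paper size, which I assume here to be a4 unless PAPER says otherwise.

```shell
#!/bin/sh
# Hypothetical helper: print the antiword option for a given output format.
# Usage: antiword_opts FORMAT   (FORMAT is one of: txt, ps, pdf, xml)
antiword_opts() {
    paper="${PAPER:-a4}"    # assumed default; PostScript and PDF require a paper size
    case "$1" in
        txt) echo "-t" ;;
        ps)  echo "-p $paper" ;;
        pdf) echo "-a $paper" ;;
        xml) echo "-x db" ;;
        *)   echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: convert FILE.doc to FILE.FORMAT next to the original.
# Usage: doc_convert FORMAT FILE.doc
doc_convert() {
    opts=$(antiword_opts "$1") || return 1
    # $opts is left unquoted on purpose so that "-p a4" becomes two arguments.
    antiword $opts "$2" > "${2%.doc}.$1"
}
```

With these defined, doc_convert pdf antiword.doc should write antiword.pdf, provided antiword is installed.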

Antiword is a command line tool. Thus, all I need to convert this text from antiword.doc to plain text and read it through a pager is:

antiword antiword.doc | less
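
If you do this often, the pipeline can be wrapped in a shell function. This is a minimal sketch, not anything that ships with antiword: the name readdoc is made up, and I assume that COLUMNS and PAGER may be unset, falling back to 80 columns and less. The -w option sets the line width of the text output.

```shell
#!/bin/sh
# Hypothetical wrapper: show a .doc file as plain text in a pager.
# Usage: readdoc FILE.doc
readdoc() {
    # Refuse early if the file is missing or unreadable.
    [ -r "$1" ] || { echo "readdoc: cannot read $1" >&2; return 1; }
    # Wrap the text to the terminal width and page it.
    antiword -w "${COLUMNS:-80}" "$1" | "${PAGER:-less}"
}
```

Put it in your shell startup file and readdoc antiword.doc does the rest.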




I can't imagine a simpler solution to this problem caused by the widespread use of proprietary binary file formats. Unfortunately, the latest version of Antiword dates from 2005. It does not convert files saved in the newer .docx format introduced with Word 2007.
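
Before handing a file to antiword, it can help to check which kind of file it really is. Word 2007 saves to .docx by default, which is a zip container, while the older binary .doc files are OLE2 containers, and the two start with different bytes. The sketch below tells them apart by those first four bytes. The function name doc_kind is hypothetical, and the check only identifies the container, not whether the content is actually a Word document.

```shell
#!/bin/sh
# Hypothetical helper: guess the container type of a file from its first 4 bytes.
# Usage: doc_kind FILE
doc_kind() {
    # Dump the first four bytes as hex and strip spaces and newlines.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        d0cf11e0) echo "ole2" ;;     # OLE2 container, used by the old binary .doc
        504b0304) echo "zip" ;;      # zip container, used by .docx
        *)        echo "unknown" ;;
    esac
}
```

A result of ole2 means the file is a candidate for antiword; zip means it is most likely one of the newer formats.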

1 comment:

Unknown said...

Antiword is great! I also like the combination of antiword and mc (the console based file utility tool). Find your document, press F3, and your document appears. Can't get much faster than that, and you don't even need to type!

For RTF files, you might try catdoc, and OpenOffice documents can be read with o3read. All are available in Debian (and probably Ubuntu). The command line is great :-)