1. Upgrade Your Old *.DOC Files to *.DOCX.
Beginning with Office 2007, Microsoft changed the format for Word documents from the proprietary .doc binary format to a standards-based .docx XML format.
(As MIME types, .doc files are application/msword and .docx files are application/vnd.openxmlformats-officedocument.wordprocessingml.document.)
A similar change was made to the structure of Excel (.xls to .xlsx) and PowerPoint (.ppt to .pptx) files.
These changes make Microsoft Office documents more interoperable with other productivity suites (such as OpenOffice) and make it easier for third-party tools, including DocumentAlchemy, to parse and manipulate Office files.
Modern versions of Microsoft Word will, of course, read both .doc and .docx documents. But some third-party tools cannot handle the .doc format, so you may need to convert files from .doc to .docx before using those tools.
Since DocumentAlchemy understands both the .doc and .docx formats, you can use the API to “upgrade” files from the old (binary) format to the new (zipped-XML) format.
To convert a single .doc file into a .docx file, you can use the following curl command:
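A minimal sketch of such a request, wrapped in a small function so each piece is easy to see. The endpoint path and the `da.key` Authorization scheme are assumptions rather than details confirmed in this post, and `doc_to_docx` is a hypothetical helper name, so check the DocumentAlchemy API documentation before relying on them.

```shell
# Sketch only: the URL layout and the "da.key=" header scheme are assumptions;
# doc_to_docx is a hypothetical helper name.
doc_to_docx() {
  # $1 = API key, $2 = input .doc path, $3 = output .docx path
  curl --silent --show-error --fail \
       -X POST \
       -H "Authorization: da.key=$1" \
       --form "document=@$2" \
       --output "$3" \
       "https://documentalchemy.com/api/v1/document/-/rendition/docx"
}

# Example call (commented out so that pasting the block has no side effects):
# doc_to_docx 403l1zh3dkbakyb9 MY_DOCUMENT.doc MY_DOCUMENT.docx
```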
In that command, 403l1zh3dkbakyb9 is the value of your DocumentAlchemy API key, MY_DOCUMENT.doc is the location of your “input” file and MY_DOCUMENT.docx is the name and location of your newly created “output” file.
To convert many .doc files into .docx files in batch-mode, you can use a shell-script like the following:
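One way such a batch script could look is sketched below. It is an illustration, not the original script: the endpoint layout, the `da.key` header scheme, and the `DA_API_KEY` environment variable are all assumptions, and `output_path` and `convert_one` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# doc2docx.sh - convert each .doc file named on the command line to .docx.
# Usage: ./doc2docx.sh FILE.doc [FILE.doc ...]
#
# Sketch only: the endpoint layout, the "da.key=" header scheme, and the
# DA_API_KEY environment variable are assumptions, not confirmed by this post.

API_KEY="${DA_API_KEY:-403l1zh3dkbakyb9}"   # replace with your own API key
BASE_URL="https://documentalchemy.com/api/v1/document/-/rendition"
TARGET_FORMAT="docx"                         # rendition requested at the end of the URL

# Print the sibling output path: same directory and base name, new extension.
output_path() {
  printf '%s.%s\n' "${1%.*}" "$TARGET_FORMAT"
}

# Upload one file and write the converted rendition next to it.
convert_one() {
  local src="$1" dest
  dest="$(output_path "$src")"
  if [ ! -f "$src" ]; then
    echo "skipping $src: not a regular file" >&2
    return 1
  fi
  echo "converting $src -> $dest" >&2
  curl --silent --show-error --fail \
       -X POST \
       -H "Authorization: da.key=$API_KEY" \
       --form "document=@$src" \
       --output "$dest" \
       "$BASE_URL/$TARGET_FORMAT"
}

# Process every file given on the command line; keep going if one fails.
for src in "$@"; do
  convert_one "$src" || echo "failed: $src" >&2
done
```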
Save the script as doc2docx.sh, make sure it is executable (chmod a+x doc2docx.sh) and run it, passing the .doc files to convert as arguments.
For each .doc file enumerated on the command line, a sibling .docx file will be created (or over-written!) containing the DOCX equivalent.
Note that the same general approach will work for converting .xls to .xlsx or .ppt to .pptx. Just change the format of the target rendition, which is found at the end of the URL in the shell-script.
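As a rough sketch of that idea, the target rendition can be derived from the input file's extension and appended to the URL. The endpoint layout here is an assumption, and `target_format_for` and `rendition_url_for` are hypothetical helper names.

```shell
# Sketch only: the endpoint layout is an assumption, and both function names
# are hypothetical helpers.

# Map a legacy Office extension to its XML-based successor.
target_format_for() {
  local ext
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    doc) echo docx ;;
    xls) echo xlsx ;;
    ppt) echo pptx ;;
    *)   echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}

# Build the request URL, with the target format as its final path segment.
rendition_url_for() {
  local format
  format="$(target_format_for "$1")" || return 1
  printf 'https://documentalchemy.com/api/v1/document/-/rendition/%s\n' "$format"
}
```

Calling `rendition_url_for` with a .xls or .ppt file name yields a URL ending in the matching new format, which could then replace the fixed docx URL in a conversion script.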
EDIT: See “Upgrade your old .doc files to .docx.” in “Five Things You Can Do With the DocumentAlchemy API - Command-Line Interface Edition” for an even easier interface to this functionality.