5 Things You Can Do With DocumentAlchemy

1. Upgrade Your Old *.DOC Files to *.DOCX.

Beginning with Office 2007, Microsoft changed the format for Word documents from the proprietary .doc binary format to a standards-based .docx XML format.

(As MIME types, .doc files are application/msword while .docx files are application/vnd.openxmlformats-officedocument.wordprocessingml.document.)

A similar change was made to the structure of Excel (.xls to .xlsx) and PowerPoint (.ppt to .pptx) documents.

These changes make Microsoft Office documents more interoperable with other productivity suites (such as OpenOffice) and make it easier for third-party tools, including DocumentAlchemy, to parse and manipulate Office files.

Modern versions of Microsoft Word will, of course, read both .doc and .docx documents. But some third-party tools cannot handle the .doc format, so you may need to convert files from .doc to .docx before using those tools.

Since DocumentAlchemy understands both the .doc and .docx formats, you can use the API to “upgrade” files from the old (binary) format to the new (zipped-XML) format.
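
If you are not sure which of your files are still in the old format, the leading bytes of a file give a quick hint. The sketch below is not part of the DocumentAlchemy API; office_container is a hypothetical helper name. It relies on two well-known signatures: legacy binary Office files are OLE2 compound files that start with D0 CF 11 E0 A1 B1 1A E1, while the newer formats are ZIP archives that start with PK\003\004.

```shell
#!/bin/sh
# office_container FILE
# Hypothetical helper: prints "ole2", "zip" or "unknown" based on the first
# bytes of FILE. It identifies the container only, not the application.
office_container() {
  # Dump the first 8 bytes as hex, then strip spaces and newlines.
  _magic=$(od -An -tx1 -N8 "$1" | tr -d ' \n')
  case "$_magic" in
    d0cf11e0a1b11ae1) echo "ole2" ;;     # legacy binary: .doc, .xls, .ppt
    504b0304*)        echo "zip" ;;      # zipped XML: .docx, .xlsx, .pptx
    *)                echo "unknown" ;;
  esac
}
```

A result of ole2 for a file named like a Word document suggests it is a candidate for conversion. Because .xls and .ppt share the same OLE2 container (and .xlsx and .pptx the same ZIP container), the file extension is still needed to tell the applications apart.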

To convert a single .doc file into a .docx file, you can use the following curl command:

where 403l1zh3dkbakyb9 is the value of your DocumentAlchemy API key, MY_DOCUMENT.doc is the location of your “input” file and "MY_DOCUMENT.docx" is the name and location of your newly created “output” file.

To convert many .doc files into .docx files in batch mode, you can use a shell script like the following:

Save the script as doc2docx.sh, make sure it is executable (chmod a+x doc2docx.sh) and run it like this:

./doc2docx.sh MyDocuments/*.doc

For every .doc file enumerated on the command line, a sibling .docx file will be created (or overwritten!) containing the DOCX equivalent.
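
Because existing files are replaced without asking, it can be worth previewing what a batch run would write. The following is a minimal sketch of that naming rule only. It makes no API calls, and docx_path_for and plan_conversions are hypothetical names, not functions from doc2docx.sh.

```shell
#!/bin/sh
# docx_path_for PATH
# Prints the sibling output path: same directory and base name, with a
# trailing .doc (any letter case) replaced by .docx.
docx_path_for() {
  printf '%s\n' "${1%.[dD][oO][cC]}.docx"
}

# plan_conversions FILE...
# Dry run: lists each input with the output it would produce, and warns on
# stderr when that output already exists and would be overwritten.
plan_conversions() {
  for _src in "$@"; do
    _dst=$(docx_path_for "$_src")
    if [ -e "$_dst" ]; then
      echo "warning: $_dst already exists and would be overwritten" >&2
    fi
    echo "$_src -> $_dst"
  done
}
```

For example, plan_conversions MyDocuments/*.doc prints one "input -> output" line per file, so you can spot collisions before running the real conversion.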

Note that the same general approach will work for converting .xls to .xlsx and .ppt to .pptx. Just change the format of the target rendition as found at the end of the URL in line 47 of the shell script.
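
If a single script needs to handle all three legacy formats, the target format can be chosen from the input file's extension. This is only a sketch of that mapping. target_format_for is a hypothetical helper, and how its result is spliced into the request URL depends on the script you are adapting.

```shell
#!/bin/sh
# target_format_for PATH
# Prints the modern format name for a legacy Office file, based on its
# extension (any letter case). Returns non-zero for anything else.
target_format_for() {
  case "$1" in
    *.[dD][oO][cC]) echo "docx" ;;
    *.[xX][lL][sS]) echo "xlsx" ;;
    *.[pP][pP][tT]) echo "pptx" ;;
    *)              return 1 ;;
  esac
}
```

A caller can skip unsupported inputs with a check such as: if fmt=$(target_format_for "$f"); then ... fi.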

EDIT: See “Upgrade your old .doc files to .docx.” in “Five Things You Can Do With the DocumentAlchemy API - Command-Line Interface Edition” for an even easier interface to this functionality.

