Update documentation for DOCX support

Kovid Goyal 2013-06-04 16:40:26 +05:30
parent ca36afdfae
commit 09e5454e88
2 changed files with 21 additions and 21 deletions

@@ -574,28 +574,27 @@ format, whether input or output are available in the conversion dialog under the
 Convert Microsoft Word documents
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-|app| does not directly convert .doc/.docx files from Microsoft Word. However, in Word, you can save the document
-as HTML and then convert the resulting HTML file with |app|. When saving as HTML, be sure to use the
-"Save as Web Page, Filtered" option as this will produce clean HTML that will convert well. Note that Word
-produces really messy HTML, converting it can take a long time, so be patient. Another alternative is to
-use the free OpenOffice. Open your .doc file in OpenOffice and save it in OpenOffice's format .odt. |app| can
-directly convert .odt files.
-There is a Word macro package that can automate the conversion of Word documents using |app|. It also makes
-generating the Table of Contents much simpler. It is called BookCreator and is available for free
-at `mobileread <http://www.mobileread.com/forums/showthread.php?t=28313>`_.
-An easy way to generate a Table of Contents when converting a Word document is:
-1. Mark your Chapters and sub-Chapters in the doc file with one of the MS built-in styles called 'Heading 1', 'Heading 2', ..., 'Heading 6'. 'Heading 1' equates to the HTML tag <h1>, 'Heading 2' to <h2> etc
-2. Save the doc as Webpage-filtered (rather than Webpage) and import the html file into |app|
-3. When you convert in |app| you use what you did in step 1 to set the box called 'Detect chapters at' on the Convert - Structure Detection page. For example:
-* If you mark Chapters with style 'Heading 2' then set the 'Detect chapters at' box to //h:h2 This will give you a proper external metadata TOC in the converted epub.
-* A slightly more complex example...if your book has Sections and Chapters and you want a 2-level nested metadata TOC. Mark the doc Sections with style 'Heading 2' and the Chapters with style 'Heading 3'. When you convert set the 'Detect chapters at' box to //h:h2|//h:h3. On the Convert - TOC page set the 'Level 1 TOC' box to //h:h2 and the 'Level 2 TOC' box to //h:h3.
+|app| can automatically convert .docx files created by Microsoft Word 2007 and
+newer. Just add the file to |app| and click convert.
+|app| will automatically generate a Table of Contents based on headings if you mark
+your headings with the ``Heading 1``, ``Heading 2``, etc. styles in Word. Open
+the output ebook in the calibre viewer and click the Table of Contents button
+to view the generated Table of Contents.
+Older .doc files
+-----------------
+For older .doc files, you can save the document as HTML with Microsoft Word
+and then convert the resulting HTML file with |app|. When saving as
+HTML, be sure to use the "Save as Web Page, Filtered" option as this will
+produce clean HTML that will convert well. Note that Word produces really messy
+HTML, converting it can take a long time, so be patient. If you have a newer
+version of Word available, you can directly save it as docx as well.
+Another alternative is to use the free OpenOffice. Open your .doc file in
+OpenOffice and save it in OpenOffice's format .odt. |app| can directly convert
+.odt files.
 Convert TXT documents
 ~~~~~~~~~~~~~~~~~~~~~~
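
The two conversion routes described in this hunk can also be driven from the command line with calibre's ``ebook-convert`` tool. The following is a minimal sketch, not part of the commit: the helper name ``build_convert_command`` and the file names are hypothetical, and it only assembles the argument list rather than running anything. It assumes the XPath expressions from the removed text (``//h:h2``, ``//h:h3``) are passed through the ``--level1-toc`` and ``--level2-toc`` options, which is only relevant for the older HTML route; per the new text, DOCX headings are picked up automatically.

```python
def build_convert_command(src, dst, level1_toc=None, level2_toc=None):
    """Assemble an ebook-convert argument list (hypothetical helper).

    src and dst are input and output file paths; the optional arguments
    are XPath expressions used to build a two-level Table of Contents.
    """
    cmd = ["ebook-convert", str(src), str(dst)]
    if level1_toc:
        cmd += ["--level1-toc", level1_toc]
    if level2_toc:
        cmd += ["--level2-toc", level2_toc]
    return cmd


if __name__ == "__main__":
    # DOCX route: headings styled "Heading 1", "Heading 2", ... need no extra options.
    print(build_convert_command("book.docx", "book.epub"))
    # Older .doc route: convert the filtered HTML saved from Word,
    # with explicit TOC levels (Sections as h2, Chapters as h3).
    print(build_convert_command("book.html", "book.epub", "//h:h2", "//h:h3"))
```

To actually run a conversion, the returned list could be handed to ``subprocess.run(cmd, check=True)`` on a machine where calibre is installed and ``ebook-convert`` is on the ``PATH``.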

@@ -20,7 +20,7 @@ What formats does |app| support conversion to/from?
 |app| supports the conversion of many input formats to many output formats.
 It can convert every input format in the following list, to every output format.
-*Input Formats:* CBZ, CBR, CBC, CHM, DJVU, EPUB, FB2, HTML, HTMLZ, LIT, LRF, MOBI, ODT, PDF, PRC, PDB, PML, RB, RTF, SNB, TCR, TXT, TXTZ
+*Input Formats:* CBZ, CBR, CBC, CHM, DJVU, DOCX, EPUB, FB2, HTML, HTMLZ, LIT, LRF, MOBI, ODT, PDF, PRC, PDB, PML, RB, RTF, SNB, TCR, TXT, TXTZ
 *Output Formats:* AZW3, EPUB, FB2, OEB, LIT, LRF, MOBI, HTMLZ, PDB, PML, RB, PDF, RTF, SNB, TCR, TXT, TXTZ
@@ -29,13 +29,14 @@ It can convert every input format in the following list, to every output format.
 PRC is a generic format, |app| supports PRC files with TextRead and MOBIBook headers.
 PDB is also a generic format. |app| supports eReder, Plucker, PML and zTxt PDB files.
 DJVU support is only for converting DJVU files that contain embedded text. These are typically generated by OCR software.
-MOBI books can be of two types Mobi6 and KF8. |app| fully supports both. MOBI files often have .azw or .azw3 file extensions
+MOBI books can be of two types Mobi6 and KF8. |app| fully supports both. MOBI files often have .azw or .azw3 file extensions.
+DOCX files from Microsoft Word 2007 and newer are supported.
 .. _best-source-formats:
 What are the best source formats to convert?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In order of decreasing preference: LIT, MOBI, AZW, EPUB, AZW3, FB2, HTML, PRC, RTF, PDB, TXT, PDF
+In order of decreasing preference: LIT, MOBI, AZW, EPUB, AZW3, FB2, DOCX, HTML, PRC, ODT, RTF, PDB, TXT, PDF
 I converted a PDF file, but the result has various problems?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
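
The updated preference order in the hunk above can be applied mechanically when a book is available in several formats. The sketch below is illustrative only and is not calibre code: the function name ``best_source_format`` is hypothetical, and the list is copied from the new "In order of decreasing preference" line.

```python
# Preference order copied from the updated FAQ line (most preferred first).
PREFERENCE = [
    "LIT", "MOBI", "AZW", "EPUB", "AZW3", "FB2", "DOCX",
    "HTML", "PRC", "ODT", "RTF", "PDB", "TXT", "PDF",
]


def best_source_format(available):
    """Return the most preferred format among `available`, or None.

    Format names are compared case-insensitively; formats that do not
    appear in the preference list are ignored.
    """
    rank = {fmt: i for i, fmt in enumerate(PREFERENCE)}
    candidates = [f.upper() for f in available if f.upper() in rank]
    if not candidates:
        return None
    return min(candidates, key=rank.__getitem__)


if __name__ == "__main__":
    print(best_source_format(["pdf", "docx", "rtf"]))
```

With the new ordering, a book available as PDF, DOCX and RTF would be converted from the DOCX copy, since DOCX now ranks ahead of both.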