From 09e5454e888952a4e38db9a2c93b5e3422aa6159 Mon Sep 17 00:00:00 2001 From: Kovid Goyal Date: Tue, 4 Jun 2013 16:40:26 +0530 Subject: [PATCH] Update documentation for DOCX support --- manual/conversion.rst | 35 +++++++++++++++++------------------ manual/faq.rst | 7 ++++--- 2 files changed, 21 insertions(+), 21 deletions(-) diff --git a/manual/conversion.rst b/manual/conversion.rst index 0747ffaba9..0ef14c5076 100644 --- a/manual/conversion.rst +++ b/manual/conversion.rst @@ -574,28 +574,27 @@ format, whether input or output are available in the conversion dialog under the Convert Microsoft Word documents ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -|app| does not directly convert .doc/.docx files from Microsoft Word. However, in Word, you can save the document -as HTML and then convert the resulting HTML file with |app|. When saving as HTML, be sure to use the -"Save as Web Page, Filtered" option as this will produce clean HTML that will convert well. Note that Word -produces really messy HTML, converting it can take a long time, so be patient. Another alternative is to -use the free OpenOffice. Open your .doc file in OpenOffice and save it in OpenOffice's format .odt. |app| can -directly convert .odt files. +|app| can automatically convert .docx files created by Microsoft Word 2007 and +newer. Just add the file to |app| and click convert. -There is a Word macro package that can automate the conversion of Word documents using |app|. It also makes -generating the Table of Contents much simpler. It is called BookCreator and is available for free -at `mobileread `_. +|app| will automatically generate a Table of Contents based on headings if you mark +your headings with the ``Heading 1``, ``Heading 2``, etc. styles in Word. Open +the output ebook in the calibre viewer and click the Table of Contents button +to view the generated Table of Contents. -An easy way to generate a Table of Contents when converting a Word document is: +Older .doc files +----------------- - 1. 
Mark your Chapters and sub-Chapters in the doc file with one of the MS built-in styles called 'Heading 1', 'Heading 2', ..., 'Heading 6'. 'Heading 1' equates to the HTML tag <h1>, 'Heading 2' to <h2>
etc - - 2. Save the doc as Webpage-filtered (rather than Webpage) and import the html file into |app| - - 3. When you convert in |app| you use what you did in step 1 to set the box called 'Detect chapters at' on the Convert - Structure Detection page. For example: - - * If you mark Chapters with style 'Heading 2' then set the 'Detect chapters at' box to //h:h2 This will give you a proper external metadata TOC in the converted epub. - * A slightly more complex example...if your book has Sections and Chapters and you want a 2-level nested metadata TOC. Mark the doc Sections with style 'Heading 2' and the Chapters with style 'Heading 3'. When you convert set the 'Detect chapters at' box to //h:h2|//h:h3. On the Convert - TOC page set the 'Level 1 TOC' box to //h:h2 and the 'Level 2 TOC' box to //h:h3. +For older .doc files, you can save the document as HTML with Microsoft Word +and then convert the resulting HTML file with |app|. When saving as +HTML, be sure to use the "Save as Web Page, Filtered" option as this will +produce clean HTML that will convert well. Note that Word produces really messy +HTML, converting it can take a long time, so be patient. If you have a newer +version of Word available, you can directly save it as docx as well. +Another alternative is to use the free OpenOffice. Open your .doc file in +OpenOffice and save it in OpenOffice's format .odt. |app| can directly convert +.odt files. Convert TXT documents ~~~~~~~~~~~~~~~~~~~~~~ diff --git a/manual/faq.rst b/manual/faq.rst index 7f7b7cae00..bdac21a622 100644 --- a/manual/faq.rst +++ b/manual/faq.rst @@ -20,7 +20,7 @@ What formats does |app| support conversion to/from? |app| supports the conversion of many input formats to many output formats. It can convert every input format in the following list, to every output format. 
-*Input Formats:* CBZ, CBR, CBC, CHM, DJVU, EPUB, FB2, HTML, HTMLZ, LIT, LRF, MOBI, ODT, PDF, PRC, PDB, PML, RB, RTF, SNB, TCR, TXT, TXTZ +*Input Formats:* CBZ, CBR, CBC, CHM, DJVU, DOCX, EPUB, FB2, HTML, HTMLZ, LIT, LRF, MOBI, ODT, PDF, PRC, PDB, PML, RB, RTF, SNB, TCR, TXT, TXTZ *Output Formats:* AZW3, EPUB, FB2, OEB, LIT, LRF, MOBI, HTMLZ, PDB, PML, RB, PDF, RTF, SNB, TCR, TXT, TXTZ @@ -29,13 +29,14 @@ It can convert every input format in the following list, to every output format. PRC is a generic format, |app| supports PRC files with TextRead and MOBIBook headers. PDB is also a generic format. |app| supports eReder, Plucker, PML and zTxt PDB files. DJVU support is only for converting DJVU files that contain embedded text. These are typically generated by OCR software. - MOBI books can be of two types Mobi6 and KF8. |app| fully supports both. MOBI files often have .azw or .azw3 file extensions + MOBI books can be of two types Mobi6 and KF8. |app| fully supports both. MOBI files often have .azw or .azw3 file extensions. + DOCX files from Microsoft Word 2007 and newer are supported. .. _best-source-formats: What are the best source formats to convert? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In order of decreasing preference: LIT, MOBI, AZW, EPUB, AZW3, FB2, HTML, PRC, RTF, PDB, TXT, PDF +In order of decreasing preference: LIT, MOBI, AZW, EPUB, AZW3, FB2, DOCX, HTML, PRC, ODT, RTF, PDB, TXT, PDF I converted a PDF file, but the result has various problems? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~