Merge from trunk

2025-06-23 15:30:45 -04:00 · 2013-02-01 10:43:18 +01:00
commit 82d12e70a4
parent e17a19ba7b 9cba7ea0a5
100 changed files with 20140 additions and 19231 deletions
--- a/Changelog.yaml
+++ b/Changelog.yaml
@@ -19,6 +19,51 @@
 #   new recipes:
 #     - title: 

+- version: 0.9.17
+  date: 2013-02-01
+
+  new features:
+    - title: "Allow adding user specified icons to the main book list for books whose metadata matches specific criteria. Go to Preferences->Look & Feel->Column icons to set up these icons. They work in the same way as the column coloring rules."
+      type: major
+
+    - title: "Allow choosing which page of a PDF to use as the cover."
+      description: "To access this functionality, add the PDF to calibre, then click the edit metadata button. In the top right area of the edit metadata dialog there is a button to get the cover from the ebook file; this will now allow you to choose which page (from the first ten pages) of the PDF to use as the cover."
+      tickets: [1110019]
+
+    - title: "Add option to turn off reflections in the cover browser (Preferences->Look & Feel->Cover Browser)"
+
+    - title: "PDF Output: Add an option to add page numbers to the bottom of every page in the generated PDF file (look in the PDF Output section of the conversion dialog)"
+
+    - title: "Add the full item name to the tool tip of a leaf item displayed in the tag browser."
+      tickets: [1106231]
+ 
+  bug fixes:
+    - title: "Fix out-of-bounds data causing errors in the Tag Browser"
+      tickets: [1108017]
+
+    - title: "Conversion: Handle input documents that use multiple prefixes referring to the XHTML namespace correctly."
+      tickets: [1107220]
+
+    - title: "PDF Output: Fix regression that caused some svg images to be rendered as black rectangles."
+      tickets: [1105294]
+
+    - title: "Metadata download: Only normalize title case if the result has no language set or its language is English"
+
+  improved recipes:
+    - Baltimore Sun
+    - Harvard Business Review
+    - Victoria Times
+    - South China Morning Post
+    - Volkskrant
+    - Seattle Times
+
+  new recipes:
+    - title: Doba Nevinosti
+      author: Darko Miletic
+
+    - title: La Nacion (CR) 
+      author: Douglas Delgado
+
 - version: 0.9.16
  date: 2013-01-25

--- a/manual/faq.rst
+++ b/manual/faq.rst
@@ -550,9 +550,9 @@ Yes, you can. Follow the instructions in the answer above for adding custom colu

 How do I move my |app| library from one computer to another?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Simply copy the |app| library folder from the old to the new computer. You can find out what the library folder is by clicking the calibre icon in the toolbar. The very first item is the path to the library folder. Now on the new computer, start |app| for the first time. It will run the Welcome Wizard asking you for the location of the |app| library. Point it to the previously copied folder. If the computer you are transferring to already has a calibre installation, then the Welcome wizard wont run. In that case, click the calibre icon in the tooolbar and point it to the newly copied directory. You will now have two calibre libraries on your computer and you can switch between them by clicking the calibre icon on the toolbar.
+Simply copy the |app| library folder from the old to the new computer. You can find out what the library folder is by clicking the calibre icon in the toolbar. The very first item is the path to the library folder. Now on the new computer, start |app| for the first time. It will run the Welcome Wizard asking you for the location of the |app| library. Point it to the previously copied folder. If the computer you are transferring to already has a calibre installation, then the Welcome Wizard won't run. In that case, right-click the |app| icon in the toolbar and point it to the newly copied directory. You will now have two calibre libraries on your computer and you can switch between them by clicking the |app| icon on the toolbar. Transferring your library in this manner preserves all your metadata, tags, custom columns, etc.

-Note that if you are transferring between different types of computers (for example Windows to OS X) then after doing the above you should also right-click the calibre icon on the tool bar, select Library Maintenance and run the Check Library action. It will warn you about any problems in your library, which you should fix by hand.
+Note that if you are transferring between different types of computers (for example Windows to OS X) then after doing the above you should also right-click the |app| icon on the tool bar, select Library Maintenance and run the Check Library action. It will warn you about any problems in your library, which you should fix by hand.

 .. note:: A |app| library is just a folder which contains all the book files and their metadata. All the metadata is stored in a single file called metadata.db, in the top level folder. If this file gets corrupted, you may see an empty list of books in |app|. In this case you can ask |app| to restore your books by doing a right-click on the |app| icon in the toolbar and selecting Library Maintenance->Restore Library.

--- a/recipes/baltimore_sun.recipe
+++ b/recipes/baltimore_sun.recipe
@@ -19,6 +19,7 @@ class BaltimoreSun(BasicNewsRecipe):
    use_embedded_content = False
    no_stylesheets       = True
    remove_javascript    = True
+    #auto_cleanup = True
    recursions           = 1

    ignore_duplicate_articles = {'title'}
@@ -78,6 +79,7 @@ class BaltimoreSun(BasicNewsRecipe):
         #(u'High School', u'http://www.baltimoresun.com/sports/high-school/rss2.0.xml'),
         #(u'Outdoors', u'http://www.baltimoresun.com/sports/outdoors/rss2.0.xml'),

+
 ## Entertainment ##
         (u'Celebrity News', u'http://www.baltimoresun.com/entertainment/celebrities/rss2.0.xml'),
         (u'Arts & Theater', u'http://www.baltimoresun.com/entertainment/arts/rss2.0.xml'),
@@ -142,12 +144,12 @@ class BaltimoreSun(BasicNewsRecipe):
         (u'Read Street', u'http://www.baltimoresun.com/features/books/read-street/rss2.0.xml'),
         (u'Z on TV', u'http://www.baltimoresun.com/entertainment/tv/z-on-tv-blog/rss2.0.xml'),

-## Life Blogs ##
-         (u'BMore Green', u'http://weblogs.baltimoresun.com/features/green/index.xml'),
-         (u'Baltimore Insider',u'http://www.baltimoresun.com/features/baltimore-insider-blog/rss2.0.xml'),
-         (u'Homefront', u'http://www.baltimoresun.com/features/parenting/homefront/rss2.0.xml'),
-         (u'Picture of Health', u'http://www.baltimoresun.com/health/blog/rss2.0.xml'),
-         (u'Unleashed', u'http://weblogs.baltimoresun.com/features/mutts/blog/index.xml'),
+### Life Blogs ##
+         #(u'BMore Green', u'http://weblogs.baltimoresun.com/features/green/index.xml'),
+         #(u'Baltimore Insider',u'http://www.baltimoresun.com/features/baltimore-insider-blog/rss2.0.xml'),
+         #(u'Homefront', u'http://www.baltimoresun.com/features/parenting/homefront/rss2.0.xml'),
+         #(u'Picture of Health', u'http://www.baltimoresun.com/health/blog/rss2.0.xml'),
+         #(u'Unleashed', u'http://weblogs.baltimoresun.com/features/mutts/blog/index.xml'),

 ## b the site blogs ##
         (u'Game Cache', u'http://www.baltimoresun.com/entertainment/bthesite/game-cache/rss2.0.xml'),
@@ -167,6 +169,7 @@ class BaltimoreSun(BasicNewsRecipe):
             ]


+
    def get_article_url(self, article):
        ans = None
        try:
--- a/recipes/dobanevinosti.recipe
+++ b/recipes/dobanevinosti.recipe
@@ -4,7 +4,7 @@ __copyright__ = '2013, Darko Miletic <darko.miletic at gmail.com>'
 '''
 dobanevinosti.blogspot.com
 '''
-
+import re
 from calibre.web.feeds.news import BasicNewsRecipe

 class DobaNevinosti(BasicNewsRecipe):
--- a/recipes/seattle_times.recipe
+++ b/recipes/seattle_times.recipe
@@ -23,6 +23,7 @@ class SeattleTimes(BasicNewsRecipe):
    language = 'en'
    auto_cleanup          = True
    auto_cleanup_keep     = '//div[@id="PhotoContainer"]'
+    cover_url = 'http://seattletimes.com/PDF/frontpage.pdf'

    feeds              = [
                          (u'Top Stories',
--- a/src/calibre/constants.py
+++ b/src/calibre/constants.py
@@ -4,7 +4,7 @@ __license__   = 'GPL v3'
 __copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
 __docformat__ = 'restructuredtext en'
 __appname__   = u'calibre'
-numeric_version = (0, 9, 16)
+numeric_version = (0, 9, 17)
 __version__   = u'.'.join(map(unicode, numeric_version))
 __author__    = u"Kovid Goyal <kovid@kovidgoyal.net>"

--- a/src/calibre/customize/__init__.py
+++ b/src/calibre/customize/__init__.py
@@ -712,7 +712,7 @@ class ViewerPlugin(Plugin): # {{{

    def run_javascript(self, evaljs):
        '''
-        This method is called every time a document has finished laoding. Use
+        This method is called every time a document has finished loading. Use
        it in the same way as load_javascript().
        '''
        pass
--- a/src/calibre/ebooks/conversion/plugins/epub_input.py
+++ b/src/calibre/ebooks/conversion/plugins/epub_input.py
@@ -9,6 +9,19 @@ from itertools import cycle
 from calibre.customize.conversion import InputFormatPlugin, OptionRecommendation

 ADOBE_OBFUSCATION =  'http://ns.adobe.com/pdf/enc#RC'
+IDPF_OBFUSCATION = 'http://www.idpf.org/2008/embedding'
+
+def decrypt_font(key, path, algorithm):
+    is_adobe = algorithm == ADOBE_OBFUSCATION
+    crypt_len = 1024 if is_adobe else 1040
+    with open(path, 'rb') as f:
+        raw = f.read()
+    crypt = bytearray(raw[:crypt_len])
+    key = cycle(iter(bytearray(key)))
+    decrypt = bytes(bytearray(x^key.next() for x in crypt))
+    with open(path, 'wb') as f:
+        f.write(decrypt)
+        f.write(raw[crypt_len:])

 class EPUBInput(InputFormatPlugin):

@@ -20,18 +33,6 @@ class EPUBInput(InputFormatPlugin):

    recommendations = set([('page_breaks_before', '/', OptionRecommendation.MED)])

-    def decrypt_font(self, key, path, algorithm):
-        is_adobe = algorithm == ADOBE_OBFUSCATION
-        crypt_len = 1024 if is_adobe else 1040
-        with open(path, 'rb') as f:
-            raw = f.read()
-        crypt = bytearray(raw[:crypt_len])
-        key = cycle(iter(bytearray(key)))
-        decrypt = bytes(bytearray(x^key.next() for x in crypt))
-        with open(path, 'wb') as f:
-            f.write(decrypt)
-            f.write(raw[crypt_len:])
-
    def process_encryption(self, encfile, opf, log):
        from lxml import etree
        import uuid, hashlib
@@ -58,8 +59,7 @@ class EPUBInput(InputFormatPlugin):
            root = etree.parse(encfile)
            for em in root.xpath('descendant::*[contains(name(), "EncryptionMethod")]'):
                algorithm = em.get('Algorithm', '')
-                if algorithm not in {ADOBE_OBFUSCATION,
-                        'http://www.idpf.org/2008/embedding'}:
+                if algorithm not in {ADOBE_OBFUSCATION, IDPF_OBFUSCATION}:
                    return False
                cr = em.getparent().xpath('descendant::*[contains(name(), "CipherReference")]')[0]
                uri = cr.get('URI')
@@ -67,7 +67,7 @@ class EPUBInput(InputFormatPlugin):
                tkey = (key if algorithm == ADOBE_OBFUSCATION else idpf_key)
                if (tkey and os.path.exists(path)):
                    self._encrypted_font_uris.append(uri)
-                    self.decrypt_font(tkey, path, algorithm)
+                    decrypt_font(tkey, path, algorithm)
            return True
        except:
            import traceback
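
Reviewer sketch (not part of the commit): the `decrypt_font` helper moved above XORs only the leading bytes of the font file with a repeating key and leaves the rest untouched. A minimal in-memory Python 3 version of that idea follows, assuming the key is a non-empty byte string. The name `deobfuscate` is hypothetical; the 1024/1040 lengths and the two algorithm URIs are taken from the diff.

```python
from itertools import cycle

ADOBE_OBFUSCATION = 'http://ns.adobe.com/pdf/enc#RC'
IDPF_OBFUSCATION = 'http://www.idpf.org/2008/embedding'


def deobfuscate(raw, key, algorithm):
    """XOR the obfuscated prefix of a font with a repeating key.

    Hypothetical helper: works on bytes in memory instead of rewriting
    the file in place as the commit's decrypt_font() does.
    """
    if not key:
        raise ValueError('an obfuscation key is required')
    # Prefix lengths as used in the diff: 1024 for Adobe, 1040 otherwise.
    crypt_len = 1024 if algorithm == ADOBE_OBFUSCATION else 1040
    head = bytes(b ^ k for b, k in zip(raw[:crypt_len], cycle(key)))
    return head + raw[crypt_len:]
```

Because XOR is its own inverse, applying the function twice with the same key returns the original bytes, which is a convenient sanity check.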
--- a/src/calibre/ebooks/metadata/sources/base.py
+++ b/src/calibre/ebooks/metadata/sources/base.py
@@ -418,10 +418,12 @@ class Source(Plugin):
        before putting the Metadata object into result_queue. You can of
        course, use a custom algorithm suited to your metadata source.
        '''
-        if mi.title:
+        docase = mi.language == 'eng' or mi.is_null('language')
+        if docase and mi.title:
            mi.title = fixcase(mi.title)
        mi.authors = fixauthors(mi.authors)
-        mi.tags = list(map(fixcase, mi.tags))
+        if mi.tags and docase:
+            mi.tags = list(map(fixcase, mi.tags))
        mi.isbn = check_isbn(mi.isbn)

    # }}}
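
Reviewer sketch (not part of the commit): the hunk above applies case normalization only when the result has no language or its language is English. The standalone Python 3 function below illustrates that gate under simplifying assumptions: "no language" is modelled as `None` (the real `mi.is_null('language')` may treat more values as null), and `str.title` stands in for calibre's `fixcase`. The function name and signature are hypothetical.

```python
def normalize_case(title, tags, language, fixcase=str.title):
    """Apply title-casing only for English or unspecified-language results."""
    # Assumption: None means "no language set".
    docase = language is None or language == 'eng'
    if docase and title:
        title = fixcase(title)
    if docase and tags:
        tags = [fixcase(tag) for tag in tags]
    return title, tags
```

With this gate, a German or French result keeps the capitalization supplied by the metadata source, which is the behaviour the changelog entry describes.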
--- a/src/calibre/ebooks/oeb/polish/__init__.py
+++ b/src/calibre/ebooks/oeb/polish/__init__.py
@@ -0,0 +1,11 @@
+#!/usr/bin/env python
+# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
+from __future__ import (unicode_literals, division, absolute_import,
+                        print_function)
+
+__license__   = 'GPL v3'
+__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
+__docformat__ = 'restructuredtext en'
+
+
+
--- a/src/calibre/ebooks/oeb/polish/container.py
+++ b/src/calibre/ebooks/oeb/polish/container.py
@@ -0,0 +1,354 @@
+#!/usr/bin/env python
+# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
+from __future__ import (unicode_literals, division, absolute_import,
+                        print_function)
+
+__license__   = 'GPL v3'
+__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
+__docformat__ = 'restructuredtext en'
+
+import os, posixpath, logging, sys, hashlib, uuid
+from urllib import unquote as urlunquote
+
+from lxml import etree
+
+from calibre import guess_type, CurrentDir
+from calibre.ebooks.chardet import xml_to_unicode
+from calibre.ebooks.conversion.plugins.epub_input import (
+    ADOBE_OBFUSCATION, IDPF_OBFUSCATION, decrypt_font)
+from calibre.ebooks.conversion.preprocess import HTMLPreProcessor, CSSPreProcessor
+from calibre.ebooks.mobi import MobiError
+from calibre.ebooks.mobi.reader.headers import MetadataHeader
+from calibre.ebooks.oeb.base import OEB_DOCS, _css_logger, OEB_STYLES, OPF2_NS
+from calibre.ebooks.oeb.polish.errors import InvalidBook, DRMError
+from calibre.ebooks.oeb.parse_utils import NotHTML, parse_html, RECOVER_PARSER
+from calibre.ptempfile import PersistentTemporaryDirectory
+from calibre.utils.fonts.sfnt.container import Sfnt
+from calibre.utils.ipc.simple_worker  import fork_job, WorkerError
+from calibre.utils.logging import default_log
+from calibre.utils.zipfile import ZipFile
+
+exists, join, relpath = os.path.exists, os.path.join, os.path.relpath
+
+OEB_FONTS = {guess_type('a.ttf')[0], guess_type('b.otf')[0]}
+
+class Container(object):
+
+    def __init__(self, rootpath, opfpath, log):
+        self.root = os.path.abspath(rootpath)
+        self.log = log
+        self.html_preprocessor = HTMLPreProcessor()
+        self.css_preprocessor = CSSPreProcessor()
+
+        self.parsed_cache = {}
+        self.mime_map = {}
+        self.name_path_map = {}
+
+        # Map of relative paths with '/' separators from root of unzipped ePub
+        # to absolute paths on filesystem with os-specific separators
+        opfpath = os.path.abspath(opfpath)
+        for dirpath, _dirnames, filenames in os.walk(self.root):
+            for f in filenames:
+                path = join(dirpath, f)
+                name = relpath(path, self.root).replace(os.sep, '/')
+                self.name_path_map[name] = path
+                self.mime_map[name] = guess_type(path)[0]
+                # Special case if we have stumbled onto the opf
+                if path == opfpath:
+                    self.opf_name = name
+                    self.opf_dir = posixpath.dirname(path)
+                    self.mime_map[name] = guess_type('a.opf')[0]
+
+        # Update mime map with data from the OPF
+        for item in self.opf.xpath(
+                '//opf:manifest/opf:item[@href and @media-type]',
+                namespaces={'opf':OPF2_NS}):
+            href = item.get('href')
+            self.mime_map[self.href_to_name(href)] = item.get('media-type')
+
+
+    def href_to_name(self, href, base=None):
+        if base is None:
+            base = self.opf_dir
+        href = urlunquote(href.partition('#')[0])
+        fullpath = posixpath.abspath(posixpath.join(base, href))
+        return self.relpath(fullpath)
+
+    def relpath(self, path):
+        return relpath(path, self.root)
+
+    def decode(self, data):
+        """Automatically decode :param:`data` into a `unicode` object."""
+        def fix_data(d):
+            return d.replace('\r\n', '\n').replace('\r', '\n')
+        if isinstance(data, unicode):
+            return fix_data(data)
+        bom_enc = None
+        if data[:4] in {b'\0\0\xfe\xff', b'\xff\xfe\0\0'}:
+            bom_enc = {b'\0\0\xfe\xff':'utf-32-be',
+                       b'\xff\xfe\0\0':'utf-32-le'}[data[:4]]
+            data = data[4:]
+        elif data[:2] in {b'\xff\xfe', b'\xfe\xff'}:
+            bom_enc = {b'\xff\xfe':'utf-16-le', b'\xfe\xff':'utf-16-be'}[data[:2]]
+            data = data[2:]
+        elif data[:3] == b'\xef\xbb\xbf':
+            bom_enc = 'utf-8'
+            data = data[3:]
+        if bom_enc is not None:
+            try:
+                return fix_data(data.decode(bom_enc))
+            except UnicodeDecodeError:
+                pass
+        try:
+            return fix_data(data.decode('utf-8'))
+        except UnicodeDecodeError:
+            pass
+        data, _ = xml_to_unicode(data)
+        return fix_data(data)
+
+    def parse_xml(self, data):
+        data = xml_to_unicode(data, strip_encoding_pats=True, assume_utf8=True,
+                             resolve_entities=True)[0].strip()
+        return etree.fromstring(data, parser=RECOVER_PARSER)
+
+    def parse_xhtml(self, data, fname):
+        try:
+            return parse_html(data, log=self.log,
+                    decoder=self.decode,
+                    preprocessor=self.html_preprocessor,
+                    filename=fname, non_html_file_tags={'ncx'})
+        except NotHTML:
+            return self.parse_xml(data)
+
+    def parse(self, path, mime):
+        with open(path, 'rb') as src:
+            data = src.read()
+        if mime in OEB_DOCS:
+            data = self.parse_xhtml(data, self.relpath(path))
+        elif mime[-4:] in {'+xml', '/xml'}:
+            data = self.parse_xml(data)
+        elif mime in OEB_STYLES:
+            data = self.parse_css(data, self.relpath(path))
+        elif mime in OEB_FONTS or path.rpartition('.')[-1].lower() in {'ttf', 'otf'}:
+            data = Sfnt(data)
+        return data
+
+    def parse_css(self, data, fname):
+        from cssutils import CSSParser, log
+        log.setLevel(logging.WARN)
+        log.raiseExceptions = False
+        data = self.decode(data)
+        data = self.css_preprocessor(data, add_namespace=False)
+        parser = CSSParser(loglevel=logging.WARNING,
+                           # We don't care about @import rules
+                           fetcher=lambda x: (None, None), log=_css_logger)
+        data = parser.parseString(data, href=fname, validate=False)
+        return data
+
+    def parsed(self, name):
+        ans = self.parsed_cache.get(name, None)
+        if ans is None:
+            mime = self.mime_map.get(name, guess_type(name)[0])
+            ans = self.parse(self.name_path_map[name], mime)
+            self.parsed_cache[name] = ans
+        return ans
+
+    @property
+    def opf(self):
+        return self.parsed(self.opf_name)
+
+    @property
+    def spine_items(self):
+        manifest_id_map = {item.get('id'):self.href_to_name(item.get('href'))
+            for item in self.opf.xpath('//opf:manifest/opf:item[@href and @id]',
+                namespaces={'opf':OPF2_NS})}
+
+        linear, non_linear = [], []
+        for item in self.opf.xpath('//opf:spine/opf:itemref[@idref]',
+                                   namespaces={'opf':OPF2_NS}):
+            idref = item.get('idref')
+            name = manifest_id_map.get(idref, None)
+            path = self.name_path_map.get(name, None)
+            if path:
+                if item.get('linear', 'yes') == 'yes':
+                    yield path
+                else:
+                    non_linear.append(path)
+        for path in non_linear:
+            yield path
+
+class InvalidEpub(InvalidBook):
+    pass
+
+OCF_NS = 'urn:oasis:names:tc:opendocument:xmlns:container'
+
+class EpubContainer(Container):
+
+    META_INF = {
+            'container.xml' : True,
+            'manifest.xml' : False,
+            'encryption.xml' : False,
+            'metadata.xml' : False,
+            'signatures.xml' : False,
+            'rights.xml' : False,
+    }
+
+    def __init__(self, pathtoepub, log):
+        self.pathtoepub = pathtoepub
+        tdir = self.root = PersistentTemporaryDirectory('_epub_container')
+        with open(self.pathtoepub, 'rb') as stream:
+            try:
+                zf = ZipFile(stream)
+                zf.extractall(tdir)
+            except:
+                log.exception('EPUB appears to be invalid ZIP file, trying a'
+                        ' more forgiving ZIP parser')
+                from calibre.utils.localunzip import extractall
+                stream.seek(0)
+                extractall(stream, path=tdir)
+        try:
+            os.remove(join(tdir, 'mimetype'))
+        except EnvironmentError:
+            pass
+
+        container_path = join(self.root, 'META-INF', 'container.xml')
+        if not exists(container_path):
+            raise InvalidEpub('No META-INF/container.xml in epub')
+        self.container = etree.fromstring(open(container_path, 'rb').read())
+        opf_files = self.container.xpath((
+            r'child::ocf:rootfiles/ocf:rootfile'
+            '[@media-type="%s" and @full-path]'%guess_type('a.opf')[0]
+            ), namespaces={'ocf':OCF_NS}
+        )
+        if not opf_files:
+            raise InvalidEpub('META-INF/container.xml contains no link to OPF file')
+        opf_path = os.path.join(self.root, *opf_files[0].get('full-path').split('/'))
+        if not exists(opf_path):
+            raise InvalidEpub('OPF file does not exist at location pointed to'
+                    ' by META-INF/container.xml')
+
+        super(EpubContainer, self).__init__(tdir, opf_path, log)
+
+        self.obfuscated_fonts = {}
+        if 'META-INF/encryption.xml' in self.name_path_map:
+            self.process_encryption()
+
+    def process_encryption(self):
+        fonts = {}
+        enc = self.parsed('META-INF/encryption.xml')
+        for em in enc.xpath('//*[local-name()="EncryptionMethod" and @Algorithm]'):
+            alg = em.get('Algorithm')
+            if alg not in {ADOBE_OBFUSCATION, IDPF_OBFUSCATION}:
+                raise DRMError()
+            cr = em.getparent().xpath('descendant::*[local-name()="CipherReference" and @URI]')[0]
+            name = self.href_to_name(cr.get('URI'), self.root)
+            path = self.name_path_map.get(name, None)
+            if path is not None:
+                fonts[name] = alg
+        if not fonts:
+            return
+
+        package_id = unique_identifier = idpf_key = None
+        for attrib, val in self.opf.attrib.iteritems():
+            if attrib.endswith('unique-identifier'):
+                package_id = val
+                break
+        if package_id is not None:
+            for elem in self.opf.xpath('//*[@id=%r]'%package_id):
+                if elem.text:
+                    unique_identifier = elem.text.rpartition(':')[-1]
+                    break
+        if unique_identifier is not None:
+            idpf_key = hashlib.sha1(unique_identifier).digest()
+        key = None
+        for item in self.opf.xpath('//*[local-name()="metadata"]/*'
+                                   '[local-name()="identifier"]'):
+            scheme = None
+            for xkey in item.attrib.keys():
+                if xkey.endswith('scheme'):
+                    scheme = item.get(xkey)
+            if (scheme and scheme.lower() == 'uuid') or \
+                    (item.text and item.text.startswith('urn:uuid:')):
+                try:
+                    key = bytes(item.text).rpartition(':')[-1]
+                    key = uuid.UUID(key).bytes
+                except:
+                    self.log.exception('Failed to parse obfuscation key')
+                    key = None
+
+        for font, alg in fonts.iteritems():
+            path = self.name_path_map[font]
+            tkey = key if alg == ADOBE_OBFUSCATION else idpf_key
+            if not tkey:
+                raise InvalidBook('Failed to find obfuscation key')
+            decrypt_font(tkey, path, alg)
+            self.obfuscated_fonts[font] = (alg, tkey)
+
+class InvalidMobi(InvalidBook):
+    pass
+
+def do_explode(path, dest):
+    from calibre.ebooks.mobi.reader.mobi6 import MobiReader
+    from calibre.ebooks.mobi.reader.mobi8 import Mobi8Reader
+    with open(path, 'rb') as stream:
+        mr = MobiReader(stream, default_log, None, None)
+
+        with CurrentDir(dest):
+            mr = Mobi8Reader(mr, default_log)
+            opf = os.path.abspath(mr())
+            obfuscated_fonts = mr.encrypted_fonts
+            try:
+                os.remove('debug-raw.html')
+            except:
+                pass
+
+    return opf, obfuscated_fonts
+
+class AZW3Container(Container):
+
+    def __init__(self, pathtoazw3, log):
+        self.pathtoazw3 = pathtoazw3
+        tdir = self.root = PersistentTemporaryDirectory('_azw3_container')
+        with open(pathtoazw3, 'rb') as stream:
+            raw = stream.read(3)
+            if raw == b'TPZ':
+                raise InvalidMobi(_('This is not a MOBI file. It is a Topaz file.'))
+
+            try:
+                header = MetadataHeader(stream, default_log)
+            except MobiError:
+                raise InvalidMobi(_('This is not a MOBI file.'))
+
+            if header.encryption_type != 0:
+                raise DRMError()
+
+            kf8_type = header.kf8_type
+
+            if kf8_type is None:
+                raise InvalidMobi(_('This MOBI file does not contain a KF8 format '
+                        'book. KF8 is the new format from Amazon. calibre can '
+                        'only edit MOBI files that contain KF8 books. Older '
+                        'MOBI files without KF8 are not editable.'))
+
+            if kf8_type == 'joint':
+                raise InvalidMobi(_('This MOBI file contains both KF8 and '
+                    'older Mobi6 data. calibre can only edit MOBI files '
+                    'that contain only KF8 data.'))
+
+        try:
+            opf_path, obfuscated_fonts = fork_job(
+            'calibre.ebooks.oeb.polish.container', 'do_explode',
+            args=(pathtoazw3, tdir), no_output=True)['result']
+        except WorkerError as e:
+            log(e.orig_tb)
+            raise InvalidMobi('Failed to explode MOBI')
+        super(AZW3Container, self).__init__(tdir, opf_path, log)
+        self.obfuscated_fonts = {x.replace(os.sep, '/') for x in obfuscated_fonts}
+
+if __name__ == '__main__':
+    f = sys.argv[-1]
+    ebook = (AZW3Container if f.rpartition('.')[-1].lower() in {'azw3', 'mobi'}
+            else EpubContainer)(f, default_log)
+    for s in ebook.spine_items:
+        print (ebook.relpath(s))
+
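
Reviewer sketch (not part of the commit): `Container.decode` above sniffs a byte order mark, strips it, tries the matching codec, then falls back to UTF-8 and finally to charset detection. A compact Python 3 rendering of the same order of checks follows. The `BOMS` table and the `fallback` parameter are hypothetical; in particular, `latin-1` is only a stand-in for the `xml_to_unicode` detection step used in the diff.

```python
# 4-byte BOMs must be tested before the 2-byte ones they start with.
BOMS = (
    (b'\x00\x00\xfe\xff', 'utf-32-be'),
    (b'\xff\xfe\x00\x00', 'utf-32-le'),
    (b'\xff\xfe', 'utf-16-le'),
    (b'\xfe\xff', 'utf-16-be'),
    (b'\xef\xbb\xbf', 'utf-8'),
)


def decode(data, fallback='latin-1'):
    """Decode bytes to str, honouring a BOM and normalizing newlines."""
    def fix(text):
        return text.replace('\r\n', '\n').replace('\r', '\n')

    if isinstance(data, str):
        return fix(data)
    for bom, encoding in BOMS:
        if data.startswith(bom):
            data = data[len(bom):]  # strip the BOM, as the diff does
            try:
                return fix(data.decode(encoding))
            except UnicodeDecodeError:
                break
    try:
        return fix(data.decode('utf-8'))
    except UnicodeDecodeError:
        # Stand-in for charset detection; latin-1 never fails to decode.
        return fix(data.decode(fallback))
```

The ordering of the table matters: the UTF-32 little-endian mark begins with the same two bytes as the UTF-16 little-endian mark, so the longer marks are listed first.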
--- a/src/calibre/ebooks/oeb/polish/errors.py
+++ b/src/calibre/ebooks/oeb/polish/errors.py
@@ -0,0 +1,18 @@
+#!/usr/bin/env python
+# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
+from __future__ import (unicode_literals, division, absolute_import,
+                        print_function)
+
+__license__   = 'GPL v3'
+__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
+__docformat__ = 'restructuredtext en'
+
+from calibre.ebooks import DRMError as _DRMError
+
+class InvalidBook(ValueError):
+    pass
+
+class DRMError(_DRMError):
+    def __init__(self):
+        super(DRMError, self).__init__(_('This file is locked with DRM. It cannot be edited.'))
+
--- a/src/calibre/ebooks/oeb/polish/stats.py
+++ b/src/calibre/ebooks/oeb/polish/stats.py
@@ -0,0 +1,99 @@
+#!/usr/bin/env python
+# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
+from __future__ import (unicode_literals, division, absolute_import,
+                        print_function)
+
+__license__   = 'GPL v3'
+__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
+__docformat__ = 'restructuredtext en'
+
+import json
+
+from PyQt4.Qt import (QWebPage, pyqtProperty, QString, QEventLoop, QWebView,
+                      Qt, QSize, QTimer)
+
+from calibre.ebooks.oeb.display.webview import load_html
+from calibre.gui2 import must_use_qt
+
+class Page(QWebPage):
+
+    def __init__(self, log):
+        self.log = log
+        QWebPage.__init__(self)
+
+    def javaScriptConsoleMessage(self, msg, lineno, msgid):
+        self.log(u'JS:', unicode(msg))
+
+    def javaScriptAlert(self, frame, msg):
+        self.log(unicode(msg))
+
+    def shouldInterruptJavaScript(self):
+        return False
+
+    def _pass_json_value_getter(self):
+        val = json.dumps(self.bridge_value)
+        return QString(val)
+
+    def _pass_json_value_setter(self, value):
+        self.bridge_value = json.loads(unicode(value))
+
+    _pass_json_value = pyqtProperty(QString, fget=_pass_json_value_getter,
+            fset=_pass_json_value_setter)
+
+class StatsCollector(object):
+
+    def __init__(self, container):
+        self.container = container
+        self.log = self.logger = container.log
+        must_use_qt()
+
+        self.loop = QEventLoop()
+        self.view = QWebView()
+        self.page = Page(self.log)
+        self.view.setPage(self.page)
+        self.page.setViewportSize(QSize(1200, 1600))
+
+        self.view.loadFinished.connect(self.collect,
+                type=Qt.QueuedConnection)
+
+        self.render_queue = list(container.spine_items)
+        self.font_stats = {}
+
+        QTimer.singleShot(0, self.render_book)
+
+        if self.loop.exec_() == 1:
+            raise Exception('Failed to gather statistics from book, see log for details')
+
+    def render_book(self):
+        try:
+            if not self.render_queue:
+                self.loop.exit()
+            else:
+                self.render_next()
+        except:
+            self.logger.exception('Rendering failed')
+            self.loop.exit(1)
+
+    def render_next(self):
+        item = unicode(self.render_queue.pop(0))
+        self.current_item = item
+        load_html(item, self.view)
+
+    def collect(self, ok):
+        if not ok:
+            self.log.error('Failed to render document: %s'%self.container.relpath(self.current_item))
+            self.loop.exit(1)
+            return
+        try:
+            self.collect_font_stats()
+        except:
+            self.log.exception('Failed to collect font stats from: %s'%self.container.relpath(self.current_item))
+            self.loop.exit(1)
+            return
+
+        self.render_book()
+
+    def collect_font_stats(self):
+        pass
+
+
--- a/src/calibre/ebooks/pdf/render/from_html.py
+++ b/src/calibre/ebooks/pdf/render/from_html.py
@@ -236,6 +236,7 @@ class PDFWriter(QObject):
            except:
                self.log.exception('Rendering failed')
                self.loop.exit(1)
+                return
        else:
            # The document is so corrupt that we can't render the page.
            self.logger.error('Document cannot be rendered.')
--- a/src/calibre/gui2/metadata/pdf_covers.py
+++ b/src/calibre/gui2/metadata/pdf_covers.py
@@ -11,6 +11,7 @@ import sys, shutil, os
 from threading import Thread
 from glob import glob

+import sip
 from PyQt4.Qt import (QDialog, QApplication, QLabel, QGridLayout,
                      QDialogButtonBox, Qt, pyqtSignal, QListWidget,
                      QListWidgetItem, QSize, QIcon)
@@ -21,6 +22,7 @@ from calibre.gui2 import error_dialog, file_icon_provider
 from calibre.ptempfile import PersistentTemporaryDirectory

 class PDFCovers(QDialog):
+    'Choose a cover from the first few pages of a PDF'

    rendering_done = pyqtSignal()

@@ -76,7 +78,7 @@ class PDFCovers(QDialog):
            page_images(self.pdfpath, self.tdir, last=10)
        except Exception as e:
            self.error = as_unicode(e)
-        if self.isVisible():
+        if not sip.isdeleted(self) and self.isVisible():
            self.rendering_done.emit()

    def show_pages(self):
--- a/src/calibre/gui2/metadata/single.py
+++ b/src/calibre/gui2/metadata/single.py
@@ -322,7 +322,7 @@ class MetadataSingleDialogBase(ResizableDialog):
        pdfpath = self.formats_manager.get_format_path(self.db, self.book_id,
                                                       'pdf')
        from calibre.gui2.metadata.pdf_covers import PDFCovers
-        d = self.__pdf_covers = PDFCovers(pdfpath, parent=self)
+        d = PDFCovers(pdfpath, parent=self)
        if d.exec_() == d.Accepted:
            cpath = d.cover_path
            if cpath:
--- a/src/calibre/gui2/preferences/coloring.py
+++ b/src/calibre/gui2/preferences/coloring.py
@@ -306,11 +306,12 @@ class RuleEditor(QDialog): # {{{
            self.filename_box.setInsertPolicy(self.filename_box.InsertAlphabetically)
            d = os.path.join(config_dir, 'cc_icons')
            self.icon_file_names = []
-            for icon_file in os.listdir(d):
-                icon_file = lower(icon_file)
-                if os.path.exists(os.path.join(d, icon_file)):
-                    if icon_file.endswith('.png'):
-                        self.icon_file_names.append(icon_file)
+            if os.path.exists(d):
+                for icon_file in os.listdir(d):
+                    icon_file = lower(icon_file)
+                    if os.path.exists(os.path.join(d, icon_file)):
+                        if icon_file.endswith('.png'):
+                            self.icon_file_names.append(icon_file)
            self.icon_file_names.sort(key=sort_key)
            self.update_filename_box()

--- a/src/calibre/translations/af.po
+++ b/src/calibre/translations/af.po
--- a/src/calibre/translations/ar.po
+++ b/src/calibre/translations/ar.po
--- a/src/calibre/translations/ast.po
+++ b/src/calibre/translations/ast.po
--- a/src/calibre/translations/az.po
+++ b/src/calibre/translations/az.po
--- a/src/calibre/translations/ber.po
+++ b/src/calibre/translations/ber.po
--- a/src/calibre/translations/bg.po
+++ b/src/calibre/translations/bg.po
--- a/src/calibre/translations/bn.po
+++ b/src/calibre/translations/bn.po
--- a/src/calibre/translations/br.po
+++ b/src/calibre/translations/br.po
--- a/src/calibre/translations/bs.po
+++ b/src/calibre/translations/bs.po
--- a/src/calibre/translations/ca.po
+++ b/src/calibre/translations/ca.po
--- a/src/calibre/translations/calibre.pot
+++ b/src/calibre/translations/calibre.pot
--- a/src/calibre/translations/cs.po
+++ b/src/calibre/translations/cs.po
--- a/src/calibre/translations/cy.po
+++ b/src/calibre/translations/cy.po
--- a/src/calibre/translations/da.po
+++ b/src/calibre/translations/da.po
--- a/src/calibre/translations/de.po
+++ b/src/calibre/translations/de.po
--- a/src/calibre/translations/el.po
+++ b/src/calibre/translations/el.po
--- a/src/calibre/translations/en_AU.po
+++ b/src/calibre/translations/en_AU.po
--- a/src/calibre/translations/en_CA.po
+++ b/src/calibre/translations/en_CA.po
--- a/src/calibre/translations/en_GB.po
+++ b/src/calibre/translations/en_GB.po
--- a/src/calibre/translations/eo.po
+++ b/src/calibre/translations/eo.po
--- a/src/calibre/translations/es.po
+++ b/src/calibre/translations/es.po
--- a/src/calibre/translations/et.po
+++ b/src/calibre/translations/et.po
--- a/src/calibre/translations/eu.po
+++ b/src/calibre/translations/eu.po
--- a/src/calibre/translations/fa.po
+++ b/src/calibre/translations/fa.po
--- a/src/calibre/translations/fi.po
+++ b/src/calibre/translations/fi.po
--- a/src/calibre/translations/fo.po
+++ b/src/calibre/translations/fo.po
--- a/src/calibre/translations/fr.po
+++ b/src/calibre/translations/fr.po
--- a/src/calibre/translations/fr_CA.po
+++ b/src/calibre/translations/fr_CA.po
--- a/src/calibre/translations/fur.po
+++ b/src/calibre/translations/fur.po
--- a/src/calibre/translations/gl.po
+++ b/src/calibre/translations/gl.po
--- a/src/calibre/translations/gu.po
+++ b/src/calibre/translations/gu.po
--- a/src/calibre/translations/he.po
+++ b/src/calibre/translations/he.po
--- a/src/calibre/translations/hi.po
+++ b/src/calibre/translations/hi.po
--- a/src/calibre/translations/him.po
+++ b/src/calibre/translations/him.po
--- a/src/calibre/translations/hr.po
+++ b/src/calibre/translations/hr.po
--- a/src/calibre/translations/hu.po
+++ b/src/calibre/translations/hu.po
--- a/src/calibre/translations/id.po
+++ b/src/calibre/translations/id.po
--- a/src/calibre/translations/is.po
+++ b/src/calibre/translations/is.po
--- a/src/calibre/translations/it.po
+++ b/src/calibre/translations/it.po
--- a/src/calibre/translations/ja.po
+++ b/src/calibre/translations/ja.po
--- a/src/calibre/translations/jv.po
+++ b/src/calibre/translations/jv.po
--- a/src/calibre/translations/ka.po
+++ b/src/calibre/translations/ka.po
--- a/src/calibre/translations/kn.po
+++ b/src/calibre/translations/kn.po
--- a/src/calibre/translations/ko.po
+++ b/src/calibre/translations/ko.po
--- a/src/calibre/translations/ku.po
+++ b/src/calibre/translations/ku.po
--- a/src/calibre/translations/lt.po
+++ b/src/calibre/translations/lt.po
--- a/src/calibre/translations/ltg.po
+++ b/src/calibre/translations/ltg.po
--- a/src/calibre/translations/lv.po
+++ b/src/calibre/translations/lv.po
--- a/src/calibre/translations/mk.po
+++ b/src/calibre/translations/mk.po
--- a/src/calibre/translations/ml.po
+++ b/src/calibre/translations/ml.po
--- a/src/calibre/translations/mr.po
+++ b/src/calibre/translations/mr.po
--- a/src/calibre/translations/ms.po
+++ b/src/calibre/translations/ms.po
--- a/src/calibre/translations/nb.po
+++ b/src/calibre/translations/nb.po
--- a/src/calibre/translations/nds.po
+++ b/src/calibre/translations/nds.po
--- a/src/calibre/translations/nl.po
+++ b/src/calibre/translations/nl.po
--- a/src/calibre/translations/nn.po
+++ b/src/calibre/translations/nn.po
--- a/src/calibre/translations/oc.po
+++ b/src/calibre/translations/oc.po
--- a/src/calibre/translations/pa.po
+++ b/src/calibre/translations/pa.po
--- a/src/calibre/translations/pl.po
+++ b/src/calibre/translations/pl.po
--- a/src/calibre/translations/pt.po
+++ b/src/calibre/translations/pt.po
--- a/src/calibre/translations/pt_BR.po
+++ b/src/calibre/translations/pt_BR.po
--- a/src/calibre/translations/ro.po
+++ b/src/calibre/translations/ro.po
--- a/src/calibre/translations/ru.po
+++ b/src/calibre/translations/ru.po
--- a/src/calibre/translations/sc.po
+++ b/src/calibre/translations/sc.po
--- a/src/calibre/translations/si.po
+++ b/src/calibre/translations/si.po
--- a/src/calibre/translations/sk.po
+++ b/src/calibre/translations/sk.po
--- a/src/calibre/translations/sl.po
+++ b/src/calibre/translations/sl.po
--- a/src/calibre/translations/sq.po
+++ b/src/calibre/translations/sq.po
--- a/src/calibre/translations/sr.po
+++ b/src/calibre/translations/sr.po
--- a/src/calibre/translations/sr@latin.po
+++ b/src/calibre/translations/sr@latin.po
--- a/src/calibre/translations/sv.po
+++ b/src/calibre/translations/sv.po
--- a/src/calibre/translations/ta.po
+++ b/src/calibre/translations/ta.po
--- a/src/calibre/translations/te.po
+++ b/src/calibre/translations/te.po
--- a/src/calibre/translations/th.po
+++ b/src/calibre/translations/th.po
--- a/src/calibre/translations/tr.po
+++ b/src/calibre/translations/tr.po
--- a/src/calibre/translations/uk.po
+++ b/src/calibre/translations/uk.po
--- a/src/calibre/translations/ur.po
+++ b/src/calibre/translations/ur.po
--- a/src/calibre/translations/vi.po
+++ b/src/calibre/translations/vi.po
--- a/src/calibre/translations/wa.po
+++ b/src/calibre/translations/wa.po
--- a/src/calibre/translations/yi.po
+++ b/src/calibre/translations/yi.po
--- a/src/calibre/translations/zh_CN.po
+++ b/src/calibre/translations/zh_CN.po
--- a/src/calibre/translations/zh_HK.po
+++ b/src/calibre/translations/zh_HK.po
--- a/src/calibre/translations/zh_TW.po
+++ b/src/calibre/translations/zh_TW.po