merge with John's branch

Commit 0194b4937d by Tomasz Długosz, 2012-03-27 19:55:10 +02:00
162 changed files with 27932 additions and 23774 deletions


@ -19,6 +19,62 @@
# new recipes:
# - title:
- version: 0.8.44
date: 2012-03-23
new features:
- title: "E-book viewer: A whole new full screen mode."
description: "The new mode has no toolbars to distract from the text and the ability to set the width of the column of text via Preferences in the ebook viewer. Click the Fullscreen button on the toolbar in the viewer to enter fullscreen mode (or press the F11 or Ctrl+Shit+F keys)"
type: major
tickets: [959830]
- title: "Copy to Library: If books were auto merged by the copy to library process, popup a message telling the user about it, as otherwise some people forget they have turned on auto merge and accuse calibre of losing their books."
- title: "Unix driver for Ectaco JetBook color"
tickets: [958442]
- title: "Add a link to the 'Adding Books Preferences' in the drop down menu of the Add Books button for easier access and more prominence"
tickets: [958145]
- title: "Smarten punctuation: Add a few more cases for detecting opening and closing quotes"
bug fixes:
- title: "Get Books: Updates to various store plugins to deal with website changes: Amazon Europe, Waterstones, Foyles, B&N, Kobo, Woblink and Empik"
- title: "Catalog generation: Do not error out when generating csv/xml catalogs if the catalog title contains filename invalid characters."
tickets: [960154]
- title: "RTF Output: Ignore corrupted images in the input document, instead of erroring out."
tickets: [959600]
- title: "E-book viewer: Try to preserve page position when the window is resized"
- title: "Fix bug that caused wrong series to be shown when clicking on the first letter of a series group in the Tag Browser"
- title: "Fix calibre not supporting different http and https proxies."
tickets: [960173]
- title: "MOBI Input: Fix regression caused by KF8 support that broke reading of ancient non-Amazon PRC files"
- title: "Fix EPUB to EPUB conversion of an EPUB with obfuscated fonts resulting in the fonts not being readable in Adobe Digital Editions"
tickets: [957527]
- title: "RTF Output: Fix bug that broke conversion to RTF when the input document contains <img> tags with no src attribute."
- title: "Fix regression in 0.8.43 that broke use of general mode templates that ended in a semi-colon."
tickets: [957295]
improved recipes:
- b92
- Various Polish news sources
- Le Monde
- FHM UK
new recipes:
- title: Ivana Milakovic and Klub knjige
author: Darko Miletic
- version: 0.8.43
date: 2012-03-16


@ -6,6 +6,7 @@ class Android_com_pl(BasicNewsRecipe):
description = 'Android.com.pl - biggest polish Android site'
category = 'Android, mobile'
language = 'pl'
use_embedded_content=True
cover_url =u'http://upload.wikimedia.org/wikipedia/commons/thumb/d/d7/Android_robot.svg/220px-Android_robot.svg.png'
oldest_article = 8
max_articles_per_feed = 100


@ -1,6 +1,6 @@
__license__ = 'GPL v3'
__copyright__ = '2008-2011, Darko Miletic <darko.miletic at gmail.com>'
__copyright__ = '2008-2012, Darko Miletic <darko.miletic at gmail.com>'
'''
b92.net
'''
@ -20,13 +20,13 @@ class B92(BasicNewsRecipe):
encoding = 'cp1250'
language = 'sr'
publication_type = 'newsportal'
masthead_url = 'http://www.b92.net/images/fp/logo.gif'
masthead_url = 'http://b92s.net/v4/img/new-logo.png'
extra_css = """
@font-face {font-family: "serif1";src:url(res:///opt/sony/ebook/FONT/tt0011m_.ttf)}
@font-face {font-family: "sans1";src:url(res:///opt/sony/ebook/FONT/tt0003m_.ttf)}
body{font-family: Arial,Helvetica,sans1,sans-serif}
.articledescription{font-family: serif1, serif}
.article-info2,.article-info1{text-transform: uppercase; font-size: small}
img{display: block}
.sms{font-weight: bold}
"""
conversion_options = {
@ -37,11 +37,17 @@ class B92(BasicNewsRecipe):
, 'linearize_tables' : True
}
preprocess_regexps = [(re.compile(u'\u0110'), lambda match: u'\u00D0')]
preprocess_regexps = [
(re.compile(u'\u0110'), lambda match: u'\u00D0'),
(re.compile(r'<html.*?<body>', re.DOTALL|re.IGNORECASE), lambda match: '<html><head><title>something</title></head><body>')
]
keep_only_tags = [dict(attrs={'class':['article-info1','article-text']})]
remove_attributes = ['width','height','align','hspace','vspace','border']
remove_tags = [dict(name=['embed','link','base','meta'])]
remove_attributes = ['width','height','align','hspace','vspace','border','lang','xmlns:fb']
remove_tags = [
dict(name=['embed','link','base','meta','iframe'])
,dict(attrs={'id':'social'})
]
feeds = [
(u'Vesti' , u'http://www.b92.net/info/rss/vesti.xml' )


@ -1,4 +1,5 @@
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup
class CGM(BasicNewsRecipe):
title = u'CGM'
@ -17,9 +18,9 @@ class CGM(BasicNewsRecipe):
remove_tags_before=dict(id='mainContent')
remove_tags_after=dict(name='div', attrs={'class':'fbContainer'})
remove_tags=[dict(name='div', attrs={'class':'fbContainer'}),
dict(name='p', attrs={'class':['tagCloud', 'galleryAuthor']}),
dict(id=['movieShare', 'container'])]
feeds = [(u'Informacje', u'http://www.cgm.pl/rss.xml'), (u'Polecamy', u'http://www.cgm.pl/rss,4,news.xml'),
(u'Recenzje', u'http://www.cgm.pl/rss,1,news.xml')]
@ -33,10 +34,12 @@ class CGM(BasicNewsRecipe):
img='http://www.cgm.pl'+img[img.find('url(')+4:img.find(')')]
gallery.contents[1].name='img'
gallery.contents[1]['src']=img
pos = len(gallery.contents)
gallery.insert(pos, BeautifulSoup('<br />'))
for item in soup.findAll(style=True):
del item['style']
ad=soup.findAll('a')
for r in ad:
if 'www.hustla.pl' in r['href'] or 'www.ebilet.pl' in r['href']:
r.extract()
return soup


@ -1,4 +1,5 @@
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup
class Elektroda(BasicNewsRecipe):
title = u'Elektroda'
@ -13,3 +14,18 @@ class Elektroda(BasicNewsRecipe):
remove_tags_after=dict(name='td', attrs={'class':'spaceRow'})
remove_tags=[dict(name='a', attrs={'href':'#top'})]
feeds = [(u'Elektroda', u'http://www.elektroda.pl/rtvforum/rss.php')]
def preprocess_html(self, soup):
tag=soup.find('span', attrs={'class':'postbody'})
if tag:
pos = len(tag.contents)
tag.insert(pos, BeautifulSoup('<br />'))
return soup
def parse_feeds (self):
feeds = BasicNewsRecipe.parse_feeds(self)
for feed in feeds:
for article in feed.articles[:]:
article.title=article.title[article.title.find("::")+3:]
return feeds
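A small illustration of the title trimming in parse_feeds() above; the sample feed title is hypothetical, assuming elektroda.pl items carry a "forum :: topic" prefix:

title = u'Mikrokontrolery :: Problem z ADC'
print(title[title.find("::") + 3:])   # -> Problem z ADC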


@ -3,10 +3,11 @@ from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1325006965(BasicNewsRecipe):
title = u'FHM UK'
description = 'Good News for Men'
cover_url = 'http://profile.ak.fbcdn.net/hprofile-ak-snc4/373529_38324934806_64930243_n.jpg'
cover_url = 'http://www.greatmagazines.co.uk/covers/large/w197/current/fhm.jpg'
# cover_url = 'http://profile.ak.fbcdn.net/hprofile-ak-snc4/373529_38324934806_64930243_n.jpg'
masthead_url = 'http://www.fhm.com/App_Resources/Images/Site/re-design/logo.gif'
__author__ = 'Dave Asbury'
# last updated 27/1/12
# last updated 17/3/12
language = 'en_GB'
oldest_article = 28
max_articles_per_feed = 12
@ -29,6 +30,8 @@ class AdvancedUserRecipe1325006965(BasicNewsRecipe):
feeds = [
(u'From the Homepage',u'http://feed43.com/8053226782885416.xml'),
(u'Funny - The Very Best Of The Internet',u'http://feed43.com/4538510106331565.xml'),
(u'The Final Countdown', u'http://feed43.com/3576106158530118.xml'),
(u'Gaming',u'http://feed43.com/0755006465351035.xml'),
]
(u'Upgrade',u'http://feed43.com/0877305847443234.xml'),
#(u'The Final Countdown', u'http://feed43.com/3576106158530118.xml'),
#(u'Gaming',u'http://feed43.com/0755006465351035.xml'),
(u'Gaming',u'http://feed43.com/6537162612465672.xml'),
]


@ -13,7 +13,7 @@ class Filmweb_pl(BasicNewsRecipe):
remove_empty_feeds=True
extra_css = '.hdrBig {font-size:22px;} ul {list-style-type:none; padding: 0; margin: 0;}'
remove_tags= [dict(name='div', attrs={'class':['recommendOthers']}), dict(name='ul', attrs={'class':'fontSizeSet'})]
keep_only_tags= [dict(name='h1', attrs={'class':'hdrBig'}), dict(name='div', attrs={'class':['newsInfo', 'reviewContent fontSizeCont description']})]
keep_only_tags= [dict(name='h1', attrs={'class':['hdrBig', 'hdrEntity']}), dict(name='div', attrs={'class':['newsInfo', 'newsInfoSmall', 'reviewContent description']})]
feeds = [(u'Wszystkie newsy', u'http://www.filmweb.pl/feed/news/latest'),
(u'News / Filmy w produkcji', 'http://www.filmweb.pl/feed/news/category/filminproduction'),
(u'News / Festiwale, nagrody i przeglądy', u'http://www.filmweb.pl/feed/news/category/festival'),


@ -9,12 +9,12 @@ class Gram_pl(BasicNewsRecipe):
oldest_article = 8
max_articles_per_feed = 100
no_stylesheets= True
extra_css = 'h2 {font-style: italic; font-size:20px;}'
extra_css = 'h2 {font-style: italic; font-size:20px;} .picbox div {float: left;}'
cover_url=u'http://www.gram.pl/www/01/img/grampl_zima.png'
remove_tags= [dict(name='p', attrs={'class':['extraText', 'must-log-in']}), dict(attrs={'class':['el', 'headline', 'post-info']}), dict(name='div', attrs={'class':['twojaOcena', 'comment-body', 'comment-author vcard', 'comment-meta commentmetadata', 'tw_button']}), dict(id=['igit_rpwt_css', 'comments', 'reply-title', 'igit_title'])]
keep_only_tags= [dict(name='div', attrs={'class':['main', 'arkh-postmetadataheader', 'arkh-postcontent', 'post', 'content', 'news_header', 'news_subheader', 'news_text']}), dict(attrs={'class':['contentheading', 'contentpaneopen']})]
feeds = [(u'gram.pl - informacje', u'http://www.gram.pl/feed_news.asp'),
(u'gram.pl - publikacje', u'http://www.gram.pl/feed_news.asp?type=articles')]
feeds = [(u'Informacje', u'http://www.gram.pl/feed_news.asp'),
(u'Publikacje', u'http://www.gram.pl/feed_news.asp?type=articles')]
def parse_feeds (self):
feeds = BasicNewsRecipe.parse_feeds(self)
@ -23,3 +23,33 @@ class Gram_pl(BasicNewsRecipe):
if 'REKLAMA SKLEP' in article.title.upper() or u'ARTYKUŁ:' in article.title.upper():
feed.articles.remove(article)
return feeds
def append_page(self, soup, appendtag):
nexturl = appendtag.find('a', attrs={'class':'cpn'})
while nexturl:
soup2 = self.index_to_soup('http://www.gram.pl'+ nexturl['href'])
r=appendtag.find(id='pgbox')
if r:
r.extract()
pagetext = soup2.find(attrs={'class':'main'})
r=pagetext.find('h1')
if r:
r.extract()
r=pagetext.find('h2')
if r:
r.extract()
for r in pagetext.findAll('script'):
r.extract()
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
nexturl = appendtag.find('a', attrs={'class':'cpn'})
r=appendtag.find(id='pgbox')
if r:
r.extract()
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
tag=soup.findAll(name='div', attrs={'class':'picbox'})
for t in tag:
t['style']='float: left;'
return soup

Binary file not shown (size changed: 413 B → 1.5 KiB)


@ -0,0 +1,43 @@
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
__license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>'
'''
ivanamilakovic.blogspot.com
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
class IvanaMilakovic(BasicNewsRecipe):
title = u'Ivana Milaković'
__author__ = 'Darko Miletic'
description = u'Hronika mačijeg škrabala - priče, inspiracija, knjige, pisanje, prevodi...'
oldest_article = 80
max_articles_per_feed = 100
language = 'sr'
encoding = 'utf-8'
no_stylesheets = True
use_embedded_content = True
publication_type = 'blog'
extra_css = """
@font-face {font-family: "sans1";src:url(res:///opt/sony/ebook/FONT/tt0003m_.ttf)}
body{font-family: Arial,Tahoma,Helvetica,FreeSans,sans1,sans-serif}
img{margin-bottom: 0.8em; border: 1px solid #333333; padding: 4px }
"""
conversion_options = {
'comment' : description
, 'tags' : 'knjige, blog, srbija, sf'
, 'publisher': 'Ivana Milakovic'
, 'language' : language
}
preprocess_regexps = [(re.compile(u'\u0110'), lambda match: u'\u00D0')]
feeds = [(u'Posts', u'http://ivanamilakovic.blogspot.com/feeds/posts/default')]
def preprocess_html(self, soup):
for item in soup.findAll(style=True):
del item['style']
return self.adeify_images(soup)

recipes/klubknjige.recipe (new file)

@ -0,0 +1,42 @@
__license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>'
'''
klub-knjige.blogspot.com
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
class KlubKnjige(BasicNewsRecipe):
title = 'Klub knjige'
__author__ = 'Darko Miletic'
description = 'literarni blog'
oldest_article = 30
max_articles_per_feed = 100
language = 'sr'
encoding = 'utf-8'
no_stylesheets = True
use_embedded_content = True
publication_type = 'blog'
extra_css = """
@font-face {font-family: "sans1";src:url(res:///opt/sony/ebook/FONT/tt0003m_.ttf)}
body{font-family: Arial,Tahoma,Helvetica,FreeSans,sans1,sans-serif}
img{margin-bottom: 0.8em; border: 1px solid #333333; padding: 4px }
"""
conversion_options = {
'comment' : description
, 'tags' : 'knjige, blog, srbija, sf'
, 'publisher': 'Klub Knjige'
, 'language' : language
}
preprocess_regexps = [(re.compile(u'\u0110'), lambda match: u'\u00D0')]
feeds = [(u'Posts', u'http://klub-knjige.blogspot.com/feeds/posts/default')]
def preprocess_html(self, soup):
for item in soup.findAll(style=True):
del item['style']
return self.adeify_images(soup)


@ -3,7 +3,6 @@ __copyright__ = '2011'
'''
lemonde.fr
'''
import re
from calibre.web.feeds.recipes import BasicNewsRecipe
class LeMonde(BasicNewsRecipe):
@ -41,77 +40,8 @@ class LeMonde(BasicNewsRecipe):
remove_empty_feeds = True
filterDuplicates = True
auto_cleanup = True
def preprocess_html(self, soup):
for alink in soup.findAll('a'):
if alink.string is not None:
tstr = alink.string
alink.replaceWith(tstr)
return self.adeify_images(soup)
preprocess_regexps = [
(re.compile(r'([0-9])%'), lambda m: m.group(1) + '&nbsp;%'),
(re.compile(r'([0-9])([0-9])([0-9]) ([0-9])([0-9])([0-9])'), lambda m: m.group(1) + m.group(2) + m.group(3) + '&nbsp;' + m.group(4) + m.group(5) + m.group(6)),
(re.compile(r'([0-9]) ([0-9])([0-9])([0-9])'), lambda m: m.group(1) + '&nbsp;' + m.group(2) + m.group(3) + m.group(4)),
(re.compile(r'<span>'), lambda match: ' <span>'),
(re.compile(r'\("'), lambda match: '(&laquo;&nbsp;'),
(re.compile(r'"\)'), lambda match: '&nbsp;&raquo;)'),
(re.compile(r'&ldquo;'), lambda match: '(&laquo;&nbsp;'),
(re.compile(r'&rdquo;'), lambda match: '&nbsp;&raquo;)'),
(re.compile(r'>\''), lambda match: '>&lsquo;'),
(re.compile(r' \''), lambda match: ' &lsquo;'),
(re.compile(r'\''), lambda match: '&rsquo;'),
(re.compile(r'"<em>'), lambda match: '<em>&laquo;&nbsp;'),
(re.compile(r'"<em>"</em><em>'), lambda match: '<em>&laquo;&nbsp;'),
(re.compile(r'"<a href='), lambda match: '&laquo;&nbsp;<a href='),
(re.compile(r'</em>"'), lambda match: '&nbsp;&raquo;</em>'),
(re.compile(r'</a>"'), lambda match: '&nbsp;&raquo;</a>'),
(re.compile(r'"</'), lambda match: '&nbsp;&raquo;</'),
(re.compile(r'>"'), lambda match: '>&laquo;&nbsp;'),
(re.compile(r'"<'), lambda match: '&nbsp;&raquo;<'),
(re.compile(r'&rsquo;"'), lambda match: '&rsquo;«&nbsp;'),
(re.compile(r' "'), lambda match: ' &laquo;&nbsp;'),
(re.compile(r'" '), lambda match: '&nbsp;&raquo; '),
(re.compile(r'"\.'), lambda match: '&nbsp;&raquo;.'),
(re.compile(r'",'), lambda match: '&nbsp;&raquo;,'),
(re.compile(r'"\?'), lambda match: '&nbsp;&raquo;?'),
(re.compile(r'":'), lambda match: '&nbsp;&raquo;:'),
(re.compile(r'";'), lambda match: '&nbsp;&raquo;;'),
(re.compile(r'"\!'), lambda match: '&nbsp;&raquo;!'),
(re.compile(r' :'), lambda match: '&nbsp;:'),
(re.compile(r' ;'), lambda match: '&nbsp;;'),
(re.compile(r' \?'), lambda match: '&nbsp;?'),
(re.compile(r' \!'), lambda match: '&nbsp;!'),
(re.compile(r'\s»'), lambda match: '&nbsp;»'),
(re.compile(r'«\s'), lambda match: '«&nbsp;'),
(re.compile(r' %'), lambda match: '&nbsp;%'),
(re.compile(r'\.jpg&nbsp;&raquo; border='), lambda match: '.jpg'),
(re.compile(r'\.png&nbsp;&raquo; border='), lambda match: '.png'),
(re.compile(r' &ndash; '), lambda match: '&nbsp;&ndash; '),
(re.compile(r' '), lambda match: '&nbsp;&ndash; '),
(re.compile(r' - '), lambda match: '&nbsp;&ndash; '),
(re.compile(r' -,'), lambda match: '&nbsp;&ndash;,'),
(re.compile(r'&raquo;:'), lambda match: '&raquo;&nbsp;:'),
]
keep_only_tags = [
dict(name='div', attrs={'class':['contenu']})
]
remove_tags = [dict(name='div', attrs={'class':['LM_atome']})]
remove_tags_after = [dict(id='appel_temoignage')]
def get_article_url(self, article):
url = article.get('guid', None)
if '/chat/' in url or '.blog' in url or '/video/' in url or '/sport/' in url or '/portfolio/' in url or '/visuel/' in url :
url = None
return url
# def get_article_url(self, article):
# link = article.get('link')
# if 'blog' not in link and ('chat' not in link):
# return link
feeds = [
('A la une', 'http://www.lemonde.fr/rss/une.xml'),
@ -137,3 +67,10 @@ class LeMonde(BasicNewsRecipe):
return cover_url
def get_article_url(self, article):
url = article.get('guid', None)
if '/chat/' in url or '.blog' in url or '/video/' in url or '/sport/' in url or '/portfolio/' in url or '/visuel/' in url :
url = None
return url


@ -1,4 +1,6 @@
__license__ = 'GPL v3'
__author__ = 'faber1971'
description = 'Collection of Italian marketing websites - v1.04 (17, March 2012)'
from calibre.web.feeds.news import BasicNewsRecipe
@ -9,12 +11,9 @@ class AdvancedUserRecipe1327062445(BasicNewsRecipe):
auto_cleanup = True
remove_javascript = True
no_stylesheets = True
conversion_options = {'linearize_tables': True}
remove_tags = [
dict(name='ul', attrs={'id':'ads0'})
]
masthead_url = 'http://www.simrendeogun.com/wp-content/uploads/2011/06/New-Marketing-Magazine-Logo.jpg'
__author__ = 'faber1971'
description = 'Collection of Italian marketing websites - v1.03 (20, February 2012)'
language = 'it'
feeds = [(u'My Marketing', u'http://feed43.com/0537744466058428.xml'), (u'My Marketing_', u'http://feed43.com/8126723074604845.xml'), (u'Venturini', u'http://robertoventurini.blogspot.com/feeds/posts/default?alt=rss'), (u'Ninja Marketing', u'http://feeds.feedburner.com/NinjaMarketing'), (u'Comunitàzione', u'http://www.comunitazione.it/feed/novita.asp'), (u'Brandforum news', u'http://www.brandforum.it/rss/news'), (u'Brandforum papers', u'http://www.brandforum.it/rss/papers'), (u'MarketingArena', u'http://feeds.feedburner.com/marketingarena'), (u'minimarketing', u'http://feeds.feedburner.com/minimarketingit'), (u'Disambiguando', u'http://giovannacosenza.wordpress.com/feed/')]
feeds = [(u'My Marketing', u'http://feed43.com/0537744466058428.xml'), (u'My Marketing_', u'http://feed43.com/8126723074604845.xml'), (u'Venturini', u'http://robertoventurini.blogspot.com/feeds/posts/default?alt=rss'), (u'Ninja Marketing', u'http://feeds.feedburner.com/NinjaMarketing'), (u'Comunitàzione', u'http://www.comunitazione.it/feed/novita.asp'), (u'Brandforum news', u'http://www.brandforum.it/rss/news'), (u'Brandforum papers', u'http://www.brandforum.it/rss/papers'), (u'MarketingArena', u'http://feeds.feedburner.com/marketingarena'), (u'minimarketing', u'http://feeds.feedburner.com/minimarketingit'), (u'Marketing Journal', u'http://feeds.feedburner.com/marketingjournal/jPwA'), (u'Disambiguando', u'http://giovannacosenza.wordpress.com/feed/')]


@ -7,12 +7,12 @@ class naczytniki(BasicNewsRecipe):
cover_url = 'http://naczytniki.pl/wp-content/uploads/2010/08/logo_nc28.png'
language = 'pl'
description ='everything about e-readers'
category='readers'
category='e-readers'
no_stylesheets=True
use_embedded_content=False
oldest_article = 7
max_articles_per_feed = 100
preprocess_regexps = [(re.compile(ur'<p><br><b>Zobacz także:</b></p>.*?</body>', re.DOTALL), lambda match: '</body>') ]
remove_tags_after= dict(name='div', attrs={'class':'sociable'})
keep_only_tags=[dict(name='div', attrs={'class':'post'})]
remove_tags=[dict(name='span', attrs={'class':'comments'}), dict(name='div', attrs={'class':'sociable'})]
feeds = [(u'Wpisy', u'http://naczytniki.pl/?feed=rss2')]


@ -17,21 +17,8 @@ class Overclock_pl(BasicNewsRecipe):
remove_tags=[dict(name='span', attrs={'class':'info'}), dict(attrs={'class':'shareit'})]
feeds = [(u'Aktualno\u015bci', u'http://www.overclock.pl/rss.news.xml'), (u'Testy i recenzje', u'http://www.overclock.pl/rss.articles.xml')]
def append_page(self, soup, appendtag):
tag=soup.find(id='navigation')
if tag:
nexturl=tag.findAll('option')
tag.extract()
for nextpage in nexturl[2:]:
soup2 = self.index_to_soup(nextpage['value'])
pagetext = soup2.find(id='content')
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
rem=appendtag.find(attrs={'alt':'Pierwsza'})
if rem:
rem.parent.extract()
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
return soup
def print_version(self, url):
if 'articles/show' in url:
return url.replace('show', 'showall')
else:
return url


@ -10,5 +10,7 @@ class palmtop_pl(BasicNewsRecipe):
oldest_article = 7
max_articles_per_feed = 100
no_stylesheets = True
use_embedded_content=True
#remove_tags_before=dict(name='h2')
#remove_tags_after=dict(attrs={'class':'entry clearfix'})
feeds = [(u'Newsy', u'http://palmtop.pl/feed/atom/')]


@ -1,31 +1,25 @@
from calibre.web.feeds.news import BasicNewsRecipe
class PC_Arena(BasicNewsRecipe):
title = u'PCArena'
oldest_article = 18300
oldest_article = 7
max_articles_per_feed = 100
__author__ = 'fenuks'
description = u'Najnowsze informacje z branży IT - testy, recenzje, aktualności, rankingi, wywiady. Twoje źródło informacji o sprzęcie komputerowym.'
category = 'IT'
language = 'pl'
masthead_url='http://pcarena.pl/public/design/frontend/images/logo.gif'
cover_url= 'http://pcarena.pl/public/design/frontend/images/logo.gif'
masthead_url='http://pcarena.pl/pcarena/img/logo.png'
cover_url= 'http://pcarena.pl/pcarena/img/logo.png'
no_stylesheets = True
keep_only_tags=[dict(attrs={'class':['artHeader', 'art']})]
remove_tags=[dict(attrs={'class':'pages'})]
feeds = [(u'Newsy', u'http://pcarena.pl/misc/rss/news'), (u'Artyku\u0142y', u'http://pcarena.pl/misc/rss/articles')]
remove_empty_feeds=True
#keep_only_tags=[dict(attrs={'class':['artHeader', 'art']})]
#remove_tags=[dict(attrs={'class':'pages'})]
feeds = [(u'Aktualności', u'http://pcarena.pl/aktualnosci/feeds.rss'), (u'Testy', u'http://pcarena.pl/testy/feeds.rss'), (u'Software', u'http://pcarena.pl/oprogramowanie/feeds.rss'), (u'Poradniki', u'http://pcarena.pl/poradniki/feeds.rss'), (u'Mobile', u'http://pcarena.pl/mobile/feeds.rss')]
def print_version(self, url):
return url.replace('show', 'print')
def append_page(self, soup, appendtag):
tag=soup.find(name='div', attrs={'class':'pagNum'})
if tag:
nexturl=tag.findAll('a')
tag.extract()
for nextpage in nexturl[1:]:
nextpage= 'http://pcarena.pl' + nextpage['href']
soup2 = self.index_to_soup(nextpage)
pagetext = soup2.find(attrs={'class':'artBody'})
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
return soup
def image_url_processor(self, baseurl, url):
if 'http' not in url:
return 'http://pcarena.pl' + url
else:
return url


@ -10,32 +10,11 @@ class PC_Centre(BasicNewsRecipe):
masthead_url= 'http://pccentre.pl/views/images/logo.gif'
cover_url= 'http://pccentre.pl/views/images/logo.gif'
no_stylesheets = True
keep_only_tags= [dict(id='content')]
remove_tags=[dict(attrs={'class':['ikony r', 'list_of_content', 'dot accordion']}), dict(id='comments')]
feeds = [(u'Publikacje', u'http://pccentre.pl/backend.php?mode=a'), (u'Aktualno\u015bci', u'http://pccentre.pl/backend.php'), (u'Sprz\u0119t komputerowy', u'http://pccentre.pl/backend.php?mode=n&section=2'), (u'Oprogramowanie', u'http://pccentre.pl/backend.php?mode=n&section=3'), (u'Gry komputerowe i konsole', u'http://pccentre.pl/backend.php?mode=n&section=4'), (u'Internet', u'http://pccentre.pl/backend.php?mode=n&section=7'), (u'Bezpiecze\u0144stwo', u'http://pccentre.pl/backend.php?mode=n&section=5'), (u'Multimedia', u'http://pccentre.pl/backend.php?mode=n&section=6'), (u'Biznes', u'http://pccentre.pl/backend.php?mode=n&section=9')]
remove_empty_feeds = True
#keep_only_tags= [dict(id='content')]
#remove_tags=[dict(attrs={'class':['ikony r', 'list_of_content', 'dot accordion']}), dict(id='comments')]
remove_tags=[dict(attrs={'class':'logo_print'})]
feeds = [(u'Aktualno\u015bci', u'http://pccentre.pl/backend.php'), (u'Publikacje', u'http://pccentre.pl/backend.php?mode=a'), (u'Sprz\u0119t komputerowy', u'http://pccentre.pl/backend.php?mode=n&section=2'), (u'Oprogramowanie', u'http://pccentre.pl/backend.php?mode=n&section=3'), (u'Gry komputerowe i konsole', u'http://pccentre.pl/backend.php?mode=n&section=4'), (u'Internet', u'http://pccentre.pl/backend.php?mode=n&section=7'), (u'Bezpiecze\u0144stwo', u'http://pccentre.pl/backend.php?mode=n&section=5'), (u'Multimedia', u'http://pccentre.pl/backend.php?mode=n&section=6'), (u'Biznes', u'http://pccentre.pl/backend.php?mode=n&section=9')]
def append_page(self, soup, appendtag):
tag=soup.find(name='div', attrs={'class':'pages'})
if tag:
nexturl=tag.findAll('a')
tag.extract()
for nextpage in nexturl[:-1]:
nextpage= 'http://pccentre.pl' + nextpage['href']
soup2 = self.index_to_soup(nextpage)
pagetext = soup2.find(id='content')
rem=pagetext.findAll(attrs={'class':['subtitle', 'content_info', 'list_of_content', 'pages', 'social2', 'pcc_acc', 'pcc_acc_na']})
for r in rem:
r.extract()
rem=pagetext.findAll(id='comments')
for r in rem:
r.extract()
rem=pagetext.findAll('h1')
for r in rem:
r.extract()
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
return soup
def print_version(self, url):
return url.replace('show', 'print')


@ -8,10 +8,11 @@ class Tablety_pl(BasicNewsRecipe):
cover_url = 'http://www.tablety.pl/wp-content/themes/kolektyw/img/logo.png'
category = 'IT'
language = 'pl'
use_embedded_content=True
oldest_article = 8
max_articles_per_feed = 100
preprocess_regexps = [(re.compile(ur'<p><strong>Przeczytaj także.*?</a></strong></p>', re.DOTALL), lambda match: ''), (re.compile(ur'<p><strong>Przeczytaj koniecznie.*?</a></strong></p>', re.DOTALL), lambda match: '')]
remove_tags_before=dict(name="h1", attrs={'class':'entry-title'})
remove_tags_after=dict(name="div", attrs={'class':'snap_nopreview sharing robots-nocontent'})
remove_tags=[dict(name='div', attrs={'class':'snap_nopreview sharing robots-nocontent'})]
#remove_tags_before=dict(name="h1", attrs={'class':'entry-title'})
#remove_tags_after=dict(name="footer", attrs={'class':'entry-footer clearfix'})
#remove_tags=[dict(name='footer', attrs={'class':'entry-footer clearfix'}), dict(name='div', attrs={'class':'entry-comment-counter'})]
feeds = [(u'Najnowsze posty', u'http://www.tablety.pl/feed/')]


@ -1,5 +1,5 @@
from calibre.web.feeds.news import BasicNewsRecipe
import re
class AdvancedUserRecipe1312886443(BasicNewsRecipe):
title = u'WNP'
@ -8,10 +8,11 @@ class AdvancedUserRecipe1312886443(BasicNewsRecipe):
description = u'Wirtualny Nowy Przemysł'
category = 'economy'
language = 'pl'
preprocess_regexps = [(re.compile(ur'Czytaj też:.*?</a>', re.DOTALL), lambda match: ''), (re.compile(ur'Czytaj więcej:.*?</a>', re.DOTALL), lambda match: '')]
oldest_article = 8
max_articles_per_feed = 100
no_stylesheets= True
keep_only_tags = dict(name='div', attrs={'id':'contentText'})
remove_tags=[dict(attrs={'class':'printF'})]
feeds = [(u'Wiadomości gospodarcze', u'http://www.wnp.pl/rss/serwis_rss.xml'),
(u'Serwis Energetyka - Gaz', u'http://www.wnp.pl/rss/serwis_rss_1.xml'),
(u'Serwis Nafta - Chemia', u'http://www.wnp.pl/rss/serwis_rss_2.xml'),
@ -19,3 +20,7 @@ class AdvancedUserRecipe1312886443(BasicNewsRecipe):
(u'Serwis Górnictwo', u'http://www.wnp.pl/rss/serwis_rss_4.xml'),
(u'Serwis Logistyka', u'http://www.wnp.pl/rss/serwis_rss_5.xml'),
(u'Serwis IT', u'http://www.wnp.pl/rss/serwis_rss_6.xml')]
def print_version(self, url):
return 'http://wnp.pl/drukuj/' +url[url.find(',')+1:]
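A hedged example of the print_version() rewrite above; the article URL layout is an assumption for illustration, not taken from wnp.pl:

url = 'http://www.wnp.pl/wiadomosci/przyklad,165035_1_0_0.html'   # assumed URL shape
print('http://wnp.pl/drukuj/' + url[url.find(',') + 1:])          # -> http://wnp.pl/drukuj/165035_1_0_0.html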


@ -10,19 +10,19 @@ msgstr ""
"Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-"
"devel@lists.alioth.debian.org>\n"
"POT-Creation-Date: 2011-11-25 14:01+0000\n"
"PO-Revision-Date: 2011-09-27 16:03+0000\n"
"Last-Translator: Kovid Goyal <Unknown>\n"
"PO-Revision-Date: 2012-03-18 12:56+0000\n"
"Last-Translator: Vibhav Pant <vibhavp@gmail.com>\n"
"Language-Team: Hindi\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Launchpad-Export-Date: 2011-11-26 05:19+0000\n"
"X-Generator: Launchpad (build 14381)\n"
"X-Launchpad-Export-Date: 2012-03-19 04:40+0000\n"
"X-Generator: Launchpad (build 14969)\n"
"Language: \n"
#. name for aaa
msgid "Ghotuo"
msgstr ""
msgstr "घोटुओ"
#. name for aab
msgid "Alumu-Tesu"
@ -30,7 +30,7 @@ msgstr ""
#. name for aac
msgid "Ari"
msgstr ""
msgstr "अरी"
#. name for aad
msgid "Amal"
@ -58,11 +58,11 @@ msgstr ""
#. name for aak
msgid "Ankave"
msgstr ""
msgstr "अनकावे"
#. name for aal
msgid "Afade"
msgstr ""
msgstr "अफ़ाडे"
#. name for aam
msgid "Aramanik"
@ -74,7 +74,7 @@ msgstr ""
#. name for aao
msgid "Arabic; Algerian Saharan"
msgstr ""
msgstr "अरबी भाषा; अल्जीरियाई सहारा"
#. name for aap
msgid "Arára; Pará"
@ -94,11 +94,11 @@ msgstr ""
#. name for aat
msgid "Albanian; Arvanitika"
msgstr ""
msgstr "अल्बानियन भाषा; अरवनितिका"
#. name for aau
msgid "Abau"
msgstr ""
msgstr "अबाऊ"
#. name for aaw
msgid "Solong"
@ -110,7 +110,7 @@ msgstr ""
#. name for aaz
msgid "Amarasi"
msgstr ""
msgstr "अमारासि"
#. name for aba
msgid "Abé"
@ -142,7 +142,7 @@ msgstr ""
#. name for abh
msgid "Arabic; Tajiki"
msgstr ""
msgstr "अरबी; ताजिकि"
#. name for abi
msgid "Abidji"
@ -150,7 +150,7 @@ msgstr ""
#. name for abj
msgid "Aka-Bea"
msgstr ""
msgstr "अका-बीआ"
#. name for abk
msgid "Abkhazian"
@ -166,11 +166,11 @@ msgstr ""
#. name for abn
msgid "Abua"
msgstr ""
msgstr "अबुआ"
#. name for abo
msgid "Abon"
msgstr ""
msgstr "अबोन"
#. name for abp
msgid "Ayta; Abellen"
@ -178,7 +178,7 @@ msgstr ""
#. name for abq
msgid "Abaza"
msgstr ""
msgstr "अबाज़ा"
#. name for abr
msgid "Abron"
@ -186,7 +186,7 @@ msgstr ""
#. name for abs
msgid "Malay; Ambonese"
msgstr ""
msgstr "मलय; अम्बोनीसी"
#. name for abt
msgid "Ambulas"

File diff suppressed because it is too large.


@ -381,12 +381,15 @@ def browser(honor_time=True, max_time=2, mobile_browser=False, user_agent=None):
user_agent = USER_AGENT_MOBILE if mobile_browser else USER_AGENT
opener.addheaders = [('User-agent', user_agent)]
proxies = get_proxies()
to_add = {}
http_proxy = proxies.get('http', None)
if http_proxy:
opener.set_proxies({'http':http_proxy})
to_add['http'] = http_proxy
https_proxy = proxies.get('https', None)
if https_proxy:
opener.set_proxies({'https':https_proxy})
to_add['https'] = https_proxy
if to_add:
opener.set_proxies(to_add)
return opener
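A minimal sketch of what this change enables, assuming mechanize's standard Browser.set_proxies() API (the proxy hosts are hypothetical); passing both schemes in one call avoids a later call replacing the earlier one, which appears to be the cause of the http/https proxy bug fixed in 0.8.44:

import mechanize

br = mechanize.Browser()
br.set_proxies({'http': 'proxy1.example.com:3128',
                'https': 'proxy2.example.com:3128'})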


@ -4,7 +4,7 @@ __license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en'
__appname__ = u'calibre'
numeric_version = (0, 8, 43)
numeric_version = (0, 8, 44)
__version__ = u'.'.join(map(unicode, numeric_version))
__author__ = u"Kovid Goyal <kovid@kovidgoyal.net>"


@ -625,7 +625,8 @@ from calibre.devices.eb600.driver import (EB600, COOL_ER, SHINEBOOK,
POCKETBOOK701, POCKETBOOK360P, PI2)
from calibre.devices.iliad.driver import ILIAD
from calibre.devices.irexdr.driver import IREXDR1000, IREXDR800
from calibre.devices.jetbook.driver import JETBOOK, MIBUK, JETBOOK_MINI
from calibre.devices.jetbook.driver import (JETBOOK, MIBUK, JETBOOK_MINI,
JETBOOK_COLOR)
from calibre.devices.kindle.driver import (KINDLE, KINDLE2, KINDLE_DX,
KINDLE_FIRE)
from calibre.devices.nook.driver import NOOK, NOOK_COLOR
@ -664,9 +665,7 @@ plugins += [
ILIAD,
IREXDR1000,
IREXDR800,
JETBOOK,
JETBOOK_MINI,
MIBUK,
JETBOOK, JETBOOK_MINI, MIBUK, JETBOOK_COLOR,
SHINEBOOK,
POCKETBOOK360, POCKETBOOK301, POCKETBOOK602, POCKETBOOK701, POCKETBOOK360P,
PI2,
@ -1539,6 +1538,7 @@ class StoreWaterstonesUKStore(StoreBase):
headquarters = 'UK'
formats = ['EPUB', 'PDF']
affiliate = True
class StoreWeightlessBooksStore(StoreBase):
name = 'Weightless Books'
@ -1558,15 +1558,6 @@ class StoreWHSmithUKStore(StoreBase):
headquarters = 'UK'
formats = ['EPUB', 'PDF']
class StoreWizardsTowerBooksStore(StoreBase):
name = 'Wizards Tower Books'
description = u'A science fiction and fantasy publisher. Concentrates mainly on making out-of-print works available once more as e-books, and helping other small presses exploit the e-book market. Also publishes a small number of limited-print-run anthologies with a view to encouraging diversity in the science fiction and fantasy field.'
actual_plugin = 'calibre.gui2.store.stores.wizards_tower_books_plugin:WizardsTowerBooksStore'
drm_free_only = True
headquarters = 'UK'
formats = ['EPUB', 'MOBI']
class StoreWoblinkStore(StoreBase):
name = 'Woblink'
author = u'Tomasz Długosz'
@ -1637,7 +1628,6 @@ plugins += [
StoreWaterstonesUKStore,
StoreWeightlessBooksStore,
StoreWHSmithUKStore,
StoreWizardsTowerBooksStore,
StoreWoblinkStore,
XinXiiStore,
StoreZixoStore


@ -234,7 +234,7 @@ def main(args=sys.argv):
sql_dump = args[-1]
reinit_db(opts.reinitialize_db, sql_dump=sql_dump)
elif opts.inspect_mobi:
from calibre.ebooks.mobi.debug import inspect_mobi
from calibre.ebooks.mobi.debug.main import inspect_mobi
for path in args[1:]:
prints('Inspecting:', path)
inspect_mobi(path)
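A hedged sketch of calling the relocated helper directly; the MOBI path is hypothetical:

from calibre.ebooks.mobi.debug.main import inspect_mobi
inspect_mobi('/path/to/book.mobi')   # writes debug output to a decompiled_book/ directory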


@ -125,4 +125,29 @@ class JETBOOK_MINI(USBMS):
SUPPORTS_SUB_DIRS = True
class JETBOOK_COLOR(USBMS):
'''
set([(u'0x951',
u'0x160b',
u'0x0',
u'Freescale',
u'Mass Storage Device',
u'0802270905553')])
'''
FORMATS = ['epub', 'mobi', 'prc', 'fb2', 'rtf', 'txt', 'pdf', 'djvu']
gui_name = 'JetBook Color'
name = 'JetBook Color Device Interface'
description = _('Communicate with the JetBook Color reader.')
author = 'Kovid Goyal'
VENDOR_ID = [0x951]
PRODUCT_ID = [0x160b]
BCD = [0x0]
EBOOK_DIR_MAIN = 'My Books'
SUPPORTS_SUB_DIRS = True
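A small, purely illustrative check (not part of the driver) that the ids declared above match the USB descriptor tuple quoted in the class docstring:

from calibre.devices.jetbook.driver import JETBOOK_COLOR
assert 0x951 in JETBOOK_COLOR.VENDOR_ID and 0x160b in JETBOOK_COLOR.PRODUCT_ID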


@ -27,7 +27,7 @@ class PRS505(USBMS):
booklist_class = CollectionsBookList
FORMATS = ['epub', 'lrf', 'lrx', 'rtf', 'pdf', 'txt']
FORMATS = ['epub', 'lrf', 'lrx', 'rtf', 'pdf', 'txt', 'zbf']
CAN_SET_METADATA = ['title', 'authors', 'collections']
CAN_DO_DEVICE_DB_PLUGBOARD = True


@ -179,7 +179,7 @@ class MOBIOutput(OutputFormatPlugin):
writer(oeb, output_path)
if opts.extract_to is not None:
from calibre.ebooks.mobi.debug import inspect_mobi
from calibre.ebooks.mobi.debug.main import inspect_mobi
ddir = opts.extract_to
inspect_mobi(output_path, ddir=ddir)


@ -0,0 +1,16 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
def format_bytes(byts):
byts = bytearray(byts)
byts = [hex(b)[2:] for b in byts]
return ' '.join(byts)
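A quick usage sketch for format_bytes(); the input bytes are arbitrary:

print(format_bytes(b'\x00\x1a\xff'))   # -> 0 1a ff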


@ -0,0 +1,535 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import struct, datetime, os
from calibre.utils.date import utc_tz
from calibre.ebooks.mobi.reader.headers import NULL_INDEX
from calibre.ebooks.mobi.langcodes import main_language, sub_language
from calibre.ebooks.mobi.debug import format_bytes
from calibre.ebooks.mobi.utils import get_trailing_data
# PalmDB {{{
class PalmDOCAttributes(object):
class Attr(object):
def __init__(self, name, field, val):
self.name = name
self.val = val & field
def __str__(self):
return '%s: %s'%(self.name, bool(self.val))
def __init__(self, raw):
self.val = struct.unpack(b'<H', raw)[0]
self.attributes = []
for name, field in [('Read Only', 0x02), ('Dirty AppInfoArea', 0x04),
('Backup this database', 0x08),
('Okay to install newer over existing copy, if present on PalmPilot', 0x10),
('Force the PalmPilot to reset after this database is installed', 0x12),
('Don\'t allow copy of file to be beamed to other Pilot',
0x14)]:
self.attributes.append(PalmDOCAttributes.Attr(name, field,
self.val))
def __str__(self):
attrs = '\n\t'.join([str(x) for x in self.attributes])
return 'PalmDOC Attributes: %s\n\t%s'%(bin(self.val), attrs)
class PalmDB(object):
def __init__(self, raw):
self.raw = raw
if self.raw.startswith(b'TPZ'):
raise ValueError('This is a Topaz file')
self.name = self.raw[:32].replace(b'\x00', b'')
self.attributes = PalmDOCAttributes(self.raw[32:34])
self.version = struct.unpack(b'>H', self.raw[34:36])[0]
palm_epoch = datetime.datetime(1904, 1, 1, tzinfo=utc_tz)
self.creation_date_raw = struct.unpack(b'>I', self.raw[36:40])[0]
self.creation_date = (palm_epoch +
datetime.timedelta(seconds=self.creation_date_raw))
self.modification_date_raw = struct.unpack(b'>I', self.raw[40:44])[0]
self.modification_date = (palm_epoch +
datetime.timedelta(seconds=self.modification_date_raw))
self.last_backup_date_raw = struct.unpack(b'>I', self.raw[44:48])[0]
self.last_backup_date = (palm_epoch +
datetime.timedelta(seconds=self.last_backup_date_raw))
self.modification_number = struct.unpack(b'>I', self.raw[48:52])[0]
self.app_info_id = self.raw[52:56]
self.sort_info_id = self.raw[56:60]
self.type = self.raw[60:64]
self.creator = self.raw[64:68]
self.ident = self.type + self.creator
if self.ident not in (b'BOOKMOBI', b'TEXTREAD'):
raise ValueError('Unknown book ident: %r'%self.ident)
self.last_record_uid, = struct.unpack(b'>I', self.raw[68:72])
self.next_rec_list_id = self.raw[72:76]
self.number_of_records, = struct.unpack(b'>H', self.raw[76:78])
def __str__(self):
ans = ['*'*20 + ' PalmDB Header '+ '*'*20]
ans.append('Name: %r'%self.name)
ans.append(str(self.attributes))
ans.append('Version: %s'%self.version)
ans.append('Creation date: %s (%s)'%(self.creation_date.isoformat(),
self.creation_date_raw))
ans.append('Modification date: %s (%s)'%(self.modification_date.isoformat(),
self.modification_date_raw))
ans.append('Backup date: %s (%s)'%(self.last_backup_date.isoformat(),
self.last_backup_date_raw))
ans.append('Modification number: %s'%self.modification_number)
ans.append('App Info ID: %r'%self.app_info_id)
ans.append('Sort Info ID: %r'%self.sort_info_id)
ans.append('Type: %r'%self.type)
ans.append('Creator: %r'%self.creator)
ans.append('Last record UID +1: %r'%self.last_record_uid)
ans.append('Next record list id: %r'%self.next_rec_list_id)
ans.append('Number of records: %s'%self.number_of_records)
return '\n'.join(ans)
# }}}
class Record(object): # {{{
def __init__(self, raw, header):
self.offset, self.flags, self.uid = header
self.raw = raw
@property
def header(self):
return 'Offset: %d Flags: %d UID: %d First 4 bytes: %r Size: %d'%(self.offset, self.flags,
self.uid, self.raw[:4], len(self.raw))
# }}}
# EXTH {{{
class EXTHRecord(object):
def __init__(self, type_, data):
self.type = type_
self.data = data
self.name = {
1 : 'DRM Server id',
2 : 'DRM Commerce id',
3 : 'DRM ebookbase book id',
100 : 'author',
101 : 'publisher',
102 : 'imprint',
103 : 'description',
104 : 'isbn',
105 : 'subject',
106 : 'publishingdate',
107 : 'review',
108 : 'contributor',
109 : 'rights',
110 : 'subjectcode',
111 : 'type',
112 : 'source',
113 : 'asin',
114 : 'versionnumber',
115 : 'sample',
116 : 'startreading',
117 : 'adult',
118 : 'retailprice',
119 : 'retailpricecurrency',
121 : 'KF8 header section index',
125 : 'KF8 resources (images/fonts) count',
129 : 'KF8 cover URI',
131 : 'KF8 unknown count',
201 : 'coveroffset',
202 : 'thumboffset',
203 : 'hasfakecover',
204 : 'Creator Software',
205 : 'Creator Major Version', # '>I'
206 : 'Creator Minor Version', # '>I'
207 : 'Creator Build Number', # '>I'
208 : 'watermark',
209 : 'tamper_proof_keys',
300 : 'fontsignature',
301 : 'clippinglimit', # percentage '>B'
402 : 'publisherlimit',
404 : 'TTS flag', # '>B' 1 - TTS disabled 0 - TTS enabled
501 : 'cdetype', # 4 chars (PDOC or EBOK)
502 : 'lastupdatetime',
503 : 'updatedtitle',
}.get(self.type, repr(self.type))
if (self.name in {'coveroffset', 'thumboffset', 'hasfakecover',
'Creator Major Version', 'Creator Minor Version',
'Creator Build Number', 'Creator Software', 'startreading'} or
self.type in {121, 125, 131}):
self.data, = struct.unpack(b'>I', self.data)
def __str__(self):
return '%s (%d): %r'%(self.name, self.type, self.data)
class EXTHHeader(object):
def __init__(self, raw):
self.raw = raw
if not self.raw.startswith(b'EXTH'):
raise ValueError('EXTH header does not start with EXTH')
self.length, = struct.unpack(b'>I', self.raw[4:8])
self.count, = struct.unpack(b'>I', self.raw[8:12])
pos = 12
self.records = []
for i in xrange(self.count):
pos = self.read_record(pos)
self.records.sort(key=lambda x:x.type)
self.rmap = {x.type:x for x in self.records}
def __getitem__(self, type_):
return self.rmap.__getitem__(type_).data
def get(self, type_, default=None):
ans = self.rmap.get(type_, default)
return getattr(ans, 'data', default)
def read_record(self, pos):
type_, length = struct.unpack(b'>II', self.raw[pos:pos+8])
data = self.raw[(pos+8):(pos+length)]
self.records.append(EXTHRecord(type_, data))
return pos + length
@property
def kf8_header_index(self):
return self.get(121, None)
def __str__(self):
ans = ['*'*20 + ' EXTH Header '+ '*'*20]
ans.append('EXTH header length: %d'%self.length)
ans.append('Number of EXTH records: %d'%self.count)
ans.append('EXTH records...')
for r in self.records:
ans.append(str(r))
return '\n'.join(ans)
# }}}
class MOBIHeader(object): # {{{
def __init__(self, record0, offset):
self.raw = record0.raw
self.header_offset = offset
self.compression_raw = self.raw[:2]
self.compression = {1: 'No compression', 2: 'PalmDoc compression',
17480: 'HUFF/CDIC compression'}.get(struct.unpack(b'>H',
self.compression_raw)[0],
repr(self.compression_raw))
self.unused = self.raw[2:4]
self.text_length, = struct.unpack(b'>I', self.raw[4:8])
self.number_of_text_records, self.text_record_size = \
struct.unpack(b'>HH', self.raw[8:12])
self.encryption_type_raw, = struct.unpack(b'>H', self.raw[12:14])
self.encryption_type = {
0: 'No encryption',
1: 'Old mobipocket encryption',
2: 'Mobipocket encryption'
}.get(self.encryption_type_raw, repr(self.encryption_type_raw))
self.unknown = self.raw[14:16]
self.identifier = self.raw[16:20]
if self.identifier != b'MOBI':
raise ValueError('Identifier %r unknown'%self.identifier)
self.length, = struct.unpack(b'>I', self.raw[20:24])
self.type_raw, = struct.unpack(b'>I', self.raw[24:28])
self.type = {
2 : 'Mobipocket book',
3 : 'PalmDOC book',
4 : 'Audio',
257 : 'News',
258 : 'News Feed',
259 : 'News magazine',
513 : 'PICS',
514 : 'Word',
515 : 'XLS',
516 : 'PPT',
517 : 'TEXT',
518 : 'HTML',
}.get(self.type_raw, repr(self.type_raw))
self.encoding_raw, = struct.unpack(b'>I', self.raw[28:32])
self.encoding = {
1252 : 'cp1252',
65001: 'utf-8',
}.get(self.encoding_raw, repr(self.encoding_raw))
self.uid = self.raw[32:36]
self.file_version, = struct.unpack(b'>I', self.raw[36:40])
self.meta_orth_indx, self.meta_infl_indx = struct.unpack(
b'>II', self.raw[40:48])
self.secondary_index_record, = struct.unpack(b'>I', self.raw[48:52])
self.reserved = self.raw[52:80]
self.first_non_book_record, = struct.unpack(b'>I', self.raw[80:84])
self.fullname_offset, = struct.unpack(b'>I', self.raw[84:88])
self.fullname_length, = struct.unpack(b'>I', self.raw[88:92])
self.locale_raw, = struct.unpack(b'>I', self.raw[92:96])
langcode = self.locale_raw
langid = langcode & 0xFF
sublangid = (langcode >> 10) & 0xFF
self.language = main_language.get(langid, 'ENGLISH')
self.sublanguage = sub_language.get(sublangid, 'NEUTRAL')
self.input_language = self.raw[96:100]
self.output_langauage = self.raw[100:104]
self.min_version, = struct.unpack(b'>I', self.raw[104:108])
self.first_image_index, = struct.unpack(b'>I', self.raw[108:112])
self.huffman_record_offset, = struct.unpack(b'>I', self.raw[112:116])
self.huffman_record_count, = struct.unpack(b'>I', self.raw[116:120])
self.datp_record_offset, = struct.unpack(b'>I', self.raw[120:124])
self.datp_record_count, = struct.unpack(b'>I', self.raw[124:128])
self.exth_flags, = struct.unpack(b'>I', self.raw[128:132])
self.has_exth = bool(self.exth_flags & 0x40)
self.has_drm_data = self.length >= 174 and len(self.raw) >= 180
if self.has_drm_data:
self.unknown3 = self.raw[132:164]
self.drm_offset, = struct.unpack(b'>I', self.raw[164:168])
self.drm_count, = struct.unpack(b'>I', self.raw[168:172])
self.drm_size, = struct.unpack(b'>I', self.raw[172:176])
self.drm_flags = bin(struct.unpack(b'>I', self.raw[176:180])[0])
self.has_extra_data_flags = self.length >= 232 and len(self.raw) >= 232+16
self.has_fcis_flis = False
self.has_multibytes = self.has_indexing_bytes = self.has_uncrossable_breaks = False
self.extra_data_flags = 0
if self.has_extra_data_flags:
self.unknown4 = self.raw[180:192]
self.fdst_idx, self.fdst_count = struct.unpack_from(b'>II',
self.raw, 192)
(self.fcis_number, self.fcis_count, self.flis_number,
self.flis_count) = struct.unpack(b'>IIII',
self.raw[200:216])
self.unknown6 = self.raw[216:224]
self.srcs_record_index = struct.unpack(b'>I',
self.raw[224:228])[0]
self.num_srcs_records = struct.unpack(b'>I',
self.raw[228:232])[0]
self.unknown7 = self.raw[232:240]
self.extra_data_flags = struct.unpack(b'>I',
self.raw[240:244])[0]
self.has_multibytes = bool(self.extra_data_flags & 0b1)
self.has_indexing_bytes = bool(self.extra_data_flags & 0b10)
self.has_uncrossable_breaks = bool(self.extra_data_flags & 0b100)
self.primary_index_record, = struct.unpack(b'>I',
self.raw[244:248])
if self.file_version >= 8:
(self.sect_idx, self.skel_idx, self.datp_idx, self.oth_idx
) = struct.unpack_from(b'>4L', self.raw, 248)
self.unknown9 = self.raw[264:self.length]
if self.meta_orth_indx not in {NULL_INDEX, self.sect_idx}:
raise ValueError('KF8 header has different Meta orth and '
'section indices')
# The following are all relative to the position of the header record
# make them absolute for ease of debugging
for x in ('sect_idx', 'skel_idx', 'datp_idx', 'oth_idx',
'meta_orth_indx', 'huffman_record_offset',
'first_non_book_record', 'datp_record_offset', 'fcis_number',
'flis_number', 'primary_index_record', 'fdst_idx',
'first_image_index'):
if hasattr(self, x):
setattr(self, x, self.header_offset+getattr(self, x))
if self.has_exth:
self.exth_offset = 16 + self.length
self.exth = EXTHHeader(self.raw[self.exth_offset:])
self.end_of_exth = self.exth_offset + self.exth.length
self.bytes_after_exth = self.raw[self.end_of_exth:self.fullname_offset]
def __str__(self):
ans = ['*'*20 + ' MOBI %d Header '%self.file_version+ '*'*20]
a = ans.append
i = lambda d, x : a('%s (null value: %d): %d'%(d, NULL_INDEX, x))
ans.append('Compression: %s'%self.compression)
ans.append('Unused: %r'%self.unused)
ans.append('Number of text records: %d'%self.number_of_text_records)
ans.append('Text record size: %d'%self.text_record_size)
ans.append('Encryption: %s'%self.encryption_type)
ans.append('Unknown: %r'%self.unknown)
ans.append('Identifier: %r'%self.identifier)
ans.append('Header length: %d'% self.length)
ans.append('Type: %s'%self.type)
ans.append('Encoding: %s'%self.encoding)
ans.append('UID: %r'%self.uid)
ans.append('File version: %d'%self.file_version)
i('Meta Orth Index (Sections index in KF8)', self.meta_orth_indx)
i('Meta Infl Index', self.meta_infl_indx)
ans.append('Secondary index record: %d (null val: %d)'%(
self.secondary_index_record, NULL_INDEX))
ans.append('Reserved: %r'%self.reserved)
ans.append('First non-book record (null value: %d): %d'%(NULL_INDEX,
self.first_non_book_record))
ans.append('Full name offset: %d'%self.fullname_offset)
ans.append('Full name length: %d bytes'%self.fullname_length)
ans.append('Langcode: %r'%self.locale_raw)
ans.append('Language: %s'%self.language)
ans.append('Sub language: %s'%self.sublanguage)
ans.append('Input language: %r'%self.input_language)
ans.append('Output language: %r'%self.output_langauage)
ans.append('Min version: %d'%self.min_version)
ans.append('First Image index: %d'%self.first_image_index)
ans.append('Huffman record offset: %d'%self.huffman_record_offset)
ans.append('Huffman record count: %d'%self.huffman_record_count)
ans.append('DATP record offset: %r'%self.datp_record_offset)
ans.append('DATP record count: %r'%self.datp_record_count)
ans.append('EXTH flags: %s (%s)'%(bin(self.exth_flags)[2:], self.has_exth))
if self.has_drm_data:
ans.append('Unknown3: %r'%self.unknown3)
ans.append('DRM Offset: %s'%self.drm_offset)
ans.append('DRM Count: %s'%self.drm_count)
ans.append('DRM Size: %s'%self.drm_size)
ans.append('DRM Flags: %r'%self.drm_flags)
if self.has_extra_data_flags:
ans.append('Unknown4: %r'%self.unknown4)
ans.append('FDST Index: %d'% self.fdst_idx)
ans.append('FDST Count: %d'% self.fdst_count)
ans.append('FCIS number: %d'% self.fcis_number)
ans.append('FCIS count: %d'% self.fcis_count)
ans.append('FLIS number: %d'% self.flis_number)
ans.append('FLIS count: %d'% self.flis_count)
ans.append('Unknown6: %r'% self.unknown6)
ans.append('SRCS record index: %d'%self.srcs_record_index)
ans.append('Number of SRCS records?: %d'%self.num_srcs_records)
ans.append('Unknown7: %r'%self.unknown7)
ans.append(('Extra data flags: %s (has multibyte: %s) '
'(has indexing: %s) (has uncrossable breaks: %s)')%(
bin(self.extra_data_flags), self.has_multibytes,
self.has_indexing_bytes, self.has_uncrossable_breaks ))
ans.append('Primary index record (null value: %d): %d'%(NULL_INDEX,
self.primary_index_record))
if self.file_version >= 8:
i('Sections Index', self.sect_idx)
i('SKEL Index', self.skel_idx)
i('DATP Index', self.datp_idx)
i('Other Index', self.oth_idx)
if self.unknown9:
a('Unknown9: %r'%self.unknown9)
ans = '\n'.join(ans)
if self.has_exth:
ans += '\n\n' + str(self.exth)
ans += '\n\nBytes after EXTH (%d bytes): %s'%(
len(self.bytes_after_exth),
format_bytes(self.bytes_after_exth))
ans += '\nNumber of bytes after full name: %d' % (len(self.raw) - (self.fullname_offset +
self.fullname_length))
ans += '\nRecord 0 length: %d'%len(self.raw)
return ans
# }}}
class MOBIFile(object):
def __init__(self, stream):
self.raw = stream.read()
self.palmdb = PalmDB(self.raw[:78])
self.record_headers = []
self.records = []
for i in xrange(self.palmdb.number_of_records):
pos = 78 + i * 8
offset, a1, a2, a3, a4 = struct.unpack(b'>LBBBB', self.raw[pos:pos+8])
flags, val = a1, a2 << 16 | a3 << 8 | a4
self.record_headers.append((offset, flags, val))
def section(section_number):
if section_number == self.palmdb.number_of_records - 1:
end_off = len(self.raw)
else:
end_off = self.record_headers[section_number + 1][0]
off = self.record_headers[section_number][0]
return self.raw[off:end_off]
for i in range(self.palmdb.number_of_records):
self.records.append(Record(section(i), self.record_headers[i]))
self.mobi_header = MOBIHeader(self.records[0], 0)
self.huffman_record_nums = []
self.kf8_type = None
mh = mh8 = self.mobi_header
if mh.file_version >= 8:
self.kf8_type = 'standalone'
elif mh.has_exth and mh.exth.kf8_header_index is not None:
self.kf8_type = 'joint'
kf8i = mh.exth.kf8_header_index
mh8 = MOBIHeader(self.records[kf8i], kf8i)
self.mobi8_header = mh8
if 'huff' in self.mobi_header.compression.lower():
from calibre.ebooks.mobi.huffcdic import HuffReader
def huffit(off, cnt):
huffman_record_nums = list(xrange(off, off+cnt))
huffrecs = [self.records[r].raw for r in huffman_record_nums]
huffs = HuffReader(huffrecs)
return huffman_record_nums, huffs.unpack
if self.kf8_type == 'joint':
recs6, d6 = huffit(mh.huffman_record_offset,
mh.huffman_record_count)
recs8, d8 = huffit(mh8.huffman_record_offset,
mh8.huffman_record_count)
self.huffman_record_nums = recs6 + recs8
else:
self.huffman_record_nums, d6 = huffit(mh.huffman_record_offset,
mh.huffman_record_count)
d8 = d6
elif 'palmdoc' in self.mobi_header.compression.lower():
from calibre.ebooks.compression.palmdoc import decompress_doc
d8 = d6 = decompress_doc
else:
d8 = d6 = lambda x: x
self.decompress6, self.decompress8 = d6, d8
class TextRecord(object): # {{{
def __init__(self, idx, record, extra_data_flags, decompress):
self.trailing_data, self.raw = get_trailing_data(record.raw, extra_data_flags)
raw_trailing_bytes = record.raw[len(self.raw):]
self.raw = decompress(self.raw)
if 0 in self.trailing_data:
self.trailing_data['multibyte_overlap'] = self.trailing_data.pop(0)
if 1 in self.trailing_data:
self.trailing_data['indexing'] = self.trailing_data.pop(1)
if 2 in self.trailing_data:
self.trailing_data['uncrossable_breaks'] = self.trailing_data.pop(2)
self.trailing_data['raw_bytes'] = raw_trailing_bytes
for typ, val in self.trailing_data.iteritems():
if isinstance(typ, int):
print ('Record %d has unknown trailing data of type: %d : %r'%
(idx, typ, val))
self.idx = idx
def dump(self, folder):
name = '%06d'%self.idx
with open(os.path.join(folder, name+'.txt'), 'wb') as f:
f.write(self.raw)
with open(os.path.join(folder, name+'.trailing_data'), 'wb') as f:
for k, v in self.trailing_data.iteritems():
raw = '%s : %r\n\n'%(k, v)
f.write(raw.encode('utf-8'))
# }}}


@ -0,0 +1,48 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import sys, os, shutil
from calibre.ebooks.mobi.debug.headers import MOBIFile
from calibre.ebooks.mobi.debug.mobi6 import inspect_mobi as inspect_mobi6
from calibre.ebooks.mobi.debug.mobi8 import inspect_mobi as inspect_mobi8
def inspect_mobi(path_or_stream, ddir=None): # {{{
stream = (path_or_stream if hasattr(path_or_stream, 'read') else
open(path_or_stream, 'rb'))
f = MOBIFile(stream)
if ddir is None:
ddir = 'decompiled_' + os.path.splitext(os.path.basename(stream.name))[0]
try:
shutil.rmtree(ddir)
except:
pass
os.makedirs(ddir)
if f.kf8_type is None:
inspect_mobi6(f, ddir)
elif f.kf8_type == 'joint':
p6 = os.path.join(ddir, 'mobi6')
os.mkdir(p6)
inspect_mobi6(f, p6)
p8 = os.path.join(ddir, 'mobi8')
os.mkdir(p8)
inspect_mobi8(f, p8)
else:
inspect_mobi8(f, ddir)
print ('Debug data saved to:', ddir)
# }}}
def main():
inspect_mobi(sys.argv[1])
if __name__ == '__main__':
main()


@ -7,403 +7,20 @@ __license__ = 'GPL v3'
__copyright__ = '2011, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import struct, datetime, sys, os, shutil
import struct, sys, os
from collections import OrderedDict, defaultdict
from lxml import html
from calibre.utils.date import utc_tz
from calibre.ebooks.mobi.langcodes import main_language, sub_language
from calibre.ebooks.mobi.reader.headers import NULL_INDEX
from calibre.ebooks.mobi.reader.index import (parse_index_record,
parse_tagx_section)
from calibre.ebooks.mobi.utils import (decode_hex_number, decint,
get_trailing_data, decode_tbs, read_font_record)
decode_tbs, read_font_record)
from calibre.utils.magick.draw import identify_data
from calibre.ebooks.mobi.debug import format_bytes
from calibre.ebooks.mobi.debug.headers import TextRecord
def format_bytes(byts):
byts = bytearray(byts)
byts = [hex(b)[2:] for b in byts]
return ' '.join(byts)
# PalmDB {{{
class PalmDOCAttributes(object):
class Attr(object):
def __init__(self, name, field, val):
self.name = name
self.val = val & field
def __str__(self):
return '%s: %s'%(self.name, bool(self.val))
def __init__(self, raw):
self.val = struct.unpack(b'<H', raw)[0]
self.attributes = []
for name, field in [('Read Only', 0x02), ('Dirty AppInfoArea', 0x04),
('Backup this database', 0x08),
('Okay to install newer over existing copy, if present on PalmPilot', 0x10),
('Force the PalmPilot to reset after this database is installed', 0x12),
('Don\'t allow copy of file to be beamed to other Pilot',
0x14)]:
self.attributes.append(PalmDOCAttributes.Attr(name, field,
self.val))
def __str__(self):
attrs = '\n\t'.join([str(x) for x in self.attributes])
return 'PalmDOC Attributes: %s\n\t%s'%(bin(self.val), attrs)
class PalmDB(object):
def __init__(self, raw):
self.raw = raw
if self.raw.startswith(b'TPZ'):
raise ValueError('This is a Topaz file')
self.name = self.raw[:32].replace(b'\x00', b'')
self.attributes = PalmDOCAttributes(self.raw[32:34])
self.version = struct.unpack(b'>H', self.raw[34:36])[0]
palm_epoch = datetime.datetime(1904, 1, 1, tzinfo=utc_tz)
self.creation_date_raw = struct.unpack(b'>I', self.raw[36:40])[0]
self.creation_date = (palm_epoch +
datetime.timedelta(seconds=self.creation_date_raw))
self.modification_date_raw = struct.unpack(b'>I', self.raw[40:44])[0]
self.modification_date = (palm_epoch +
datetime.timedelta(seconds=self.modification_date_raw))
self.last_backup_date_raw = struct.unpack(b'>I', self.raw[44:48])[0]
self.last_backup_date = (palm_epoch +
datetime.timedelta(seconds=self.last_backup_date_raw))
self.modification_number = struct.unpack(b'>I', self.raw[48:52])[0]
self.app_info_id = self.raw[52:56]
self.sort_info_id = self.raw[56:60]
self.type = self.raw[60:64]
self.creator = self.raw[64:68]
self.ident = self.type + self.creator
if self.ident not in (b'BOOKMOBI', b'TEXTREAD'):
raise ValueError('Unknown book ident: %r'%self.ident)
self.last_record_uid, = struct.unpack(b'>I', self.raw[68:72])
self.next_rec_list_id = self.raw[72:76]
self.number_of_records, = struct.unpack(b'>H', self.raw[76:78])
def __str__(self):
ans = ['*'*20 + ' PalmDB Header '+ '*'*20]
ans.append('Name: %r'%self.name)
ans.append(str(self.attributes))
ans.append('Version: %s'%self.version)
ans.append('Creation date: %s (%s)'%(self.creation_date.isoformat(),
self.creation_date_raw))
ans.append('Modification date: %s (%s)'%(self.modification_date.isoformat(),
self.modification_date_raw))
ans.append('Backup date: %s (%s)'%(self.last_backup_date.isoformat(),
self.last_backup_date_raw))
ans.append('Modification number: %s'%self.modification_number)
ans.append('App Info ID: %r'%self.app_info_id)
ans.append('Sort Info ID: %r'%self.sort_info_id)
ans.append('Type: %r'%self.type)
ans.append('Creator: %r'%self.creator)
ans.append('Last record UID +1: %r'%self.last_record_uid)
ans.append('Next record list id: %r'%self.next_rec_list_id)
ans.append('Number of records: %s'%self.number_of_records)
return '\n'.join(ans)
# }}}
class Record(object): # {{{
def __init__(self, raw, header):
self.offset, self.flags, self.uid = header
self.raw = raw
@property
def header(self):
return 'Offset: %d Flags: %d UID: %d First 4 bytes: %r Size: %d'%(self.offset, self.flags,
self.uid, self.raw[:4], len(self.raw))
# }}}
# EXTH {{{
class EXTHRecord(object):
def __init__(self, type_, data):
self.type = type_
self.data = data
self.name = {
1 : 'DRM Server id',
2 : 'DRM Commerce id',
3 : 'DRM ebookbase book id',
100 : 'author',
101 : 'publisher',
102 : 'imprint',
103 : 'description',
104 : 'isbn',
105 : 'subject',
106 : 'publishingdate',
107 : 'review',
108 : 'contributor',
109 : 'rights',
110 : 'subjectcode',
111 : 'type',
112 : 'source',
113 : 'asin',
114 : 'versionnumber',
115 : 'sample',
116 : 'startreading',
117 : 'adult',
118 : 'retailprice',
119 : 'retailpricecurrency',
121 : 'KF8 header section index',
125 : 'KF8 resources (images/fonts) count',
129 : 'KF8 cover URI',
131 : 'KF8 unknown count',
201 : 'coveroffset',
202 : 'thumboffset',
203 : 'hasfakecover',
204 : 'Creator Software',
205 : 'Creator Major Version', # '>I'
206 : 'Creator Minor Version', # '>I'
207 : 'Creator Build Number', # '>I'
208 : 'watermark',
209 : 'tamper_proof_keys',
300 : 'fontsignature',
301 : 'clippinglimit', # percentage '>B'
402 : 'publisherlimit',
404 : 'TTS flag', # '>B' 1 - TTS disabled 0 - TTS enabled
501 : 'cdetype', # 4 chars (PDOC or EBOK)
502 : 'lastupdatetime',
503 : 'updatedtitle',
}.get(self.type, repr(self.type))
if (self.name in {'coveroffset', 'thumboffset', 'hasfakecover',
'Creator Major Version', 'Creator Minor Version',
'Creator Build Number', 'Creator Software', 'startreading'} or
self.type in {121, 125, 131}):
self.data, = struct.unpack(b'>I', self.data)
def __str__(self):
return '%s (%d): %r'%(self.name, self.type, self.data)
class EXTHHeader(object):
def __init__(self, raw):
self.raw = raw
if not self.raw.startswith(b'EXTH'):
raise ValueError('EXTH header does not start with EXTH')
self.length, = struct.unpack(b'>I', self.raw[4:8])
self.count, = struct.unpack(b'>I', self.raw[8:12])
pos = 12
self.records = []
for i in xrange(self.count):
pos = self.read_record(pos)
self.records.sort(key=lambda x:x.type)
def read_record(self, pos):
type_, length = struct.unpack(b'>II', self.raw[pos:pos+8])
data = self.raw[(pos+8):(pos+length)]
self.records.append(EXTHRecord(type_, data))
return pos + length
def __str__(self):
ans = ['*'*20 + ' EXTH Header '+ '*'*20]
ans.append('EXTH header length: %d'%self.length)
ans.append('Number of EXTH records: %d'%self.count)
ans.append('EXTH records...')
for r in self.records:
ans.append(str(r))
return '\n'.join(ans)
# }}}
class MOBIHeader(object): # {{{
def __init__(self, record0):
self.raw = record0.raw
self.compression_raw = self.raw[:2]
self.compression = {1: 'No compression', 2: 'PalmDoc compression',
17480: 'HUFF/CDIC compression'}.get(struct.unpack(b'>H',
self.compression_raw)[0],
repr(self.compression_raw))
self.unused = self.raw[2:4]
self.text_length, = struct.unpack(b'>I', self.raw[4:8])
self.number_of_text_records, self.text_record_size = \
struct.unpack(b'>HH', self.raw[8:12])
self.encryption_type_raw, = struct.unpack(b'>H', self.raw[12:14])
self.encryption_type = {
0: 'No encryption',
1: 'Old mobipocket encryption',
2: 'Mobipocket encryption'
}.get(self.encryption_type_raw, repr(self.encryption_type_raw))
self.unknown = self.raw[14:16]
self.identifier = self.raw[16:20]
if self.identifier != b'MOBI':
raise ValueError('Identifier %r unknown'%self.identifier)
self.length, = struct.unpack(b'>I', self.raw[20:24])
self.type_raw, = struct.unpack(b'>I', self.raw[24:28])
self.type = {
2 : 'Mobipocket book',
3 : 'PalmDOC book',
4 : 'Audio',
257 : 'News',
258 : 'News Feed',
259 : 'News magazine',
513 : 'PICS',
514 : 'Word',
515 : 'XLS',
516 : 'PPT',
517 : 'TEXT',
518 : 'HTML',
}.get(self.type_raw, repr(self.type_raw))
self.encoding_raw, = struct.unpack(b'>I', self.raw[28:32])
self.encoding = {
1252 : 'cp1252',
65001: 'utf-8',
}.get(self.encoding_raw, repr(self.encoding_raw))
self.uid = self.raw[32:36]
self.file_version = struct.unpack(b'>I', self.raw[36:40])
self.reserved = self.raw[40:48]
self.secondary_index_record, = struct.unpack(b'>I', self.raw[48:52])
self.reserved2 = self.raw[52:80]
self.first_non_book_record, = struct.unpack(b'>I', self.raw[80:84])
self.fullname_offset, = struct.unpack(b'>I', self.raw[84:88])
self.fullname_length, = struct.unpack(b'>I', self.raw[88:92])
self.locale_raw, = struct.unpack(b'>I', self.raw[92:96])
langcode = self.locale_raw
langid = langcode & 0xFF
sublangid = (langcode >> 10) & 0xFF
self.language = main_language.get(langid, 'ENGLISH')
self.sublanguage = sub_language.get(sublangid, 'NEUTRAL')
self.input_language = self.raw[96:100]
self.output_langauage = self.raw[100:104]
self.min_version, = struct.unpack(b'>I', self.raw[104:108])
self.first_image_index, = struct.unpack(b'>I', self.raw[108:112])
self.huffman_record_offset, = struct.unpack(b'>I', self.raw[112:116])
self.huffman_record_count, = struct.unpack(b'>I', self.raw[116:120])
self.datp_record_offset, = struct.unpack(b'>I', self.raw[120:124])
self.datp_record_count, = struct.unpack(b'>I', self.raw[124:128])
self.exth_flags, = struct.unpack(b'>I', self.raw[128:132])
self.has_exth = bool(self.exth_flags & 0x40)
self.has_drm_data = self.length >= 174 and len(self.raw) >= 180
if self.has_drm_data:
self.unknown3 = self.raw[132:164]
self.drm_offset, = struct.unpack(b'>I', self.raw[164:168])
self.drm_count, = struct.unpack(b'>I', self.raw[168:172])
self.drm_size, = struct.unpack(b'>I', self.raw[172:176])
self.drm_flags = bin(struct.unpack(b'>I', self.raw[176:180])[0])
self.has_extra_data_flags = self.length >= 232 and len(self.raw) >= 232+16
self.has_fcis_flis = False
self.has_multibytes = self.has_indexing_bytes = self.has_uncrossable_breaks = False
self.extra_data_flags = 0
if self.has_extra_data_flags:
self.unknown4 = self.raw[180:192]
self.first_content_record, self.last_content_record = \
struct.unpack(b'>HH', self.raw[192:196])
self.unknown5, = struct.unpack(b'>I', self.raw[196:200])
(self.fcis_number, self.fcis_count, self.flis_number,
self.flis_count) = struct.unpack(b'>IIII',
self.raw[200:216])
self.unknown6 = self.raw[216:224]
self.srcs_record_index = struct.unpack(b'>I',
self.raw[224:228])[0]
self.num_srcs_records = struct.unpack(b'>I',
self.raw[228:232])[0]
self.unknown7 = self.raw[232:240]
self.extra_data_flags = struct.unpack(b'>I',
self.raw[240:244])[0]
self.has_multibytes = bool(self.extra_data_flags & 0b1)
self.has_indexing_bytes = bool(self.extra_data_flags & 0b10)
self.has_uncrossable_breaks = bool(self.extra_data_flags & 0b100)
self.primary_index_record, = struct.unpack(b'>I',
self.raw[244:248])
if self.has_exth:
self.exth_offset = 16 + self.length
self.exth = EXTHHeader(self.raw[self.exth_offset:])
self.end_of_exth = self.exth_offset + self.exth.length
self.bytes_after_exth = self.raw[self.end_of_exth:self.fullname_offset]
def __str__(self):
ans = ['*'*20 + ' MOBI Header '+ '*'*20]
ans.append('Compression: %s'%self.compression)
ans.append('Unused: %r'%self.unused)
ans.append('Number of text records: %d'%self.number_of_text_records)
ans.append('Text record size: %d'%self.text_record_size)
ans.append('Encryption: %s'%self.encryption_type)
ans.append('Unknown: %r'%self.unknown)
ans.append('Identifier: %r'%self.identifier)
ans.append('Header length: %d'% self.length)
ans.append('Type: %s'%self.type)
ans.append('Encoding: %s'%self.encoding)
ans.append('UID: %r'%self.uid)
ans.append('File version: %d'%self.file_version)
ans.append('Reserved: %r'%self.reserved)
ans.append('Secondary index record: %d (null val: %d)'%(
self.secondary_index_record, NULL_INDEX))
ans.append('Reserved2: %r'%self.reserved2)
ans.append('First non-book record (null value: %d): %d'%(NULL_INDEX,
self.first_non_book_record))
ans.append('Full name offset: %d'%self.fullname_offset)
ans.append('Full name length: %d bytes'%self.fullname_length)
ans.append('Langcode: %r'%self.locale_raw)
ans.append('Language: %s'%self.language)
ans.append('Sub language: %s'%self.sublanguage)
ans.append('Input language: %r'%self.input_language)
ans.append('Output language: %r'%self.output_langauage)
ans.append('Min version: %d'%self.min_version)
ans.append('First Image index: %d'%self.first_image_index)
ans.append('Huffman record offset: %d'%self.huffman_record_offset)
ans.append('Huffman record count: %d'%self.huffman_record_count)
ans.append('DATP record offset: %r'%self.datp_record_offset)
ans.append('DATP record count: %r'%self.datp_record_count)
ans.append('EXTH flags: %s (%s)'%(bin(self.exth_flags)[2:], self.has_exth))
if self.has_drm_data:
ans.append('Unknown3: %r'%self.unknown3)
ans.append('DRM Offset: %s'%self.drm_offset)
ans.append('DRM Count: %s'%self.drm_count)
ans.append('DRM Size: %s'%self.drm_size)
ans.append('DRM Flags: %r'%self.drm_flags)
if self.has_extra_data_flags:
ans.append('Unknown4: %r'%self.unknown4)
ans.append('First content record: %d'% self.first_content_record)
ans.append('Last content record: %d'% self.last_content_record)
ans.append('Unknown5: %d'% self.unknown5)
ans.append('FCIS number: %d'% self.fcis_number)
ans.append('FCIS count: %d'% self.fcis_count)
ans.append('FLIS number: %d'% self.flis_number)
ans.append('FLIS count: %d'% self.flis_count)
ans.append('Unknown6: %r'% self.unknown6)
ans.append('SRCS record index: %d'%self.srcs_record_index)
ans.append('Number of SRCS records?: %d'%self.num_srcs_records)
ans.append('Unknown7: %r'%self.unknown7)
ans.append(('Extra data flags: %s (has multibyte: %s) '
'(has indexing: %s) (has uncrossable breaks: %s)')%(
bin(self.extra_data_flags), self.has_multibytes,
self.has_indexing_bytes, self.has_uncrossable_breaks ))
ans.append('Primary index record (null value: %d): %d'%(NULL_INDEX,
self.primary_index_record))
ans = '\n'.join(ans)
if self.has_exth:
ans += '\n\n' + str(self.exth)
ans += '\n\nBytes after EXTH (%d bytes): %s'%(
len(self.bytes_after_exth),
format_bytes(self.bytes_after_exth))
ans += '\nNumber of bytes after full name: %d' % (len(self.raw) - (self.fullname_offset +
self.fullname_length))
ans += '\nRecord 0 length: %d'%len(self.raw)
return ans
# }}}
class TagX(object): # {{{
@ -856,39 +473,6 @@ class CNCX(object): # {{{
# }}}
class TextRecord(object): # {{{
def __init__(self, idx, record, extra_data_flags, decompress):
self.trailing_data, self.raw = get_trailing_data(record.raw, extra_data_flags)
raw_trailing_bytes = record.raw[len(self.raw):]
self.raw = decompress(self.raw)
if 0 in self.trailing_data:
self.trailing_data['multibyte_overlap'] = self.trailing_data.pop(0)
if 1 in self.trailing_data:
self.trailing_data['indexing'] = self.trailing_data.pop(1)
if 2 in self.trailing_data:
self.trailing_data['uncrossable_breaks'] = self.trailing_data.pop(2)
self.trailing_data['raw_bytes'] = raw_trailing_bytes
for typ, val in self.trailing_data.iteritems():
if isinstance(typ, int):
print ('Record %d has unknown trailing data of type: %d : %r'%
(idx, typ, val))
self.idx = idx
def dump(self, folder):
name = '%06d'%self.idx
with open(os.path.join(folder, name+'.txt'), 'wb') as f:
f.write(self.raw)
with open(os.path.join(folder, name+'.trailing_data'), 'wb') as f:
for k, v in self.trailing_data.iteritems():
raw = '%s : %r\n\n'%(k, v)
f.write(raw.encode('utf-8'))
# }}}
class ImageRecord(object): # {{{
def __init__(self, idx, record, fmt):
@ -1130,46 +714,10 @@ class TBSIndexing(object): # {{{
class MOBIFile(object): # {{{
def __init__(self, stream):
self.raw = stream.read()
self.palmdb = PalmDB(self.raw[:78])
self.record_headers = []
self.records = []
for i in xrange(self.palmdb.number_of_records):
pos = 78 + i * 8
offset, a1, a2, a3, a4 = struct.unpack(b'>LBBBB', self.raw[pos:pos+8])
flags, val = a1, a2 << 16 | a3 << 8 | a4
self.record_headers.append((offset, flags, val))
def section(section_number):
if section_number == self.palmdb.number_of_records - 1:
end_off = len(self.raw)
else:
end_off = self.record_headers[section_number + 1][0]
off = self.record_headers[section_number][0]
return self.raw[off:end_off]
for i in range(self.palmdb.number_of_records):
self.records.append(Record(section(i), self.record_headers[i]))
self.mobi_header = MOBIHeader(self.records[0])
self.huffman_record_nums = []
if 'huff' in self.mobi_header.compression.lower():
self.huffman_record_nums = list(xrange(self.mobi_header.huffman_record_offset,
self.mobi_header.huffman_record_offset +
self.mobi_header.huffman_record_count))
huffrecs = [self.records[r].raw for r in self.huffman_record_nums]
from calibre.ebooks.mobi.huffcdic import HuffReader
huffs = HuffReader(huffrecs)
decompress = huffs.unpack
elif 'palmdoc' in self.mobi_header.compression.lower():
from calibre.ebooks.compression.palmdoc import decompress_doc
decompress = decompress_doc
else:
decompress = lambda x: x
def __init__(self, mf):
for x in ('raw', 'palmdb', 'record_headers', 'records', 'mobi_header',
'huffman_record_nums',):
setattr(self, x, getattr(mf, x))
self.index_header = self.index_record = None
self.indexing_record_nums = set()
@ -1201,7 +749,7 @@ class MOBIFile(object): # {{{
if fntbr == NULL_INDEX:
fntbr = len(self.records)
self.text_records = [TextRecord(r, self.records[r],
self.mobi_header.extra_data_flags, decompress) for r in xrange(1,
self.mobi_header.extra_data_flags, mf.decompress6) for r in xrange(1,
min(len(self.records), ntr+1))]
self.image_records, self.binary_records = [], []
self.font_records = []
@ -1241,17 +789,8 @@ class MOBIFile(object): # {{{
print (str(self.mobi_header).encode('utf-8'), file=f)
# }}}
def inspect_mobi(path_or_stream, ddir=None): # {{{
stream = (path_or_stream if hasattr(path_or_stream, 'read') else
open(path_or_stream, 'rb'))
f = MOBIFile(stream)
if ddir is None:
ddir = 'decompiled_' + os.path.splitext(os.path.basename(stream.name))[0]
try:
shutil.rmtree(ddir)
except:
pass
os.makedirs(ddir)
def inspect_mobi(mobi_file, ddir):
f = MOBIFile(mobi_file)
with open(os.path.join(ddir, 'header.txt'), 'wb') as out:
f.print_header(f=out)
@ -1262,12 +801,11 @@ def inspect_mobi(path_or_stream, ddir=None): # {{{
of.write(rec.raw)
alltext += rec.raw
of.seek(0)
if f.mobi_header.file_version < 8:
root = html.fromstring(alltext.decode('utf-8'))
with open(os.path.join(ddir, 'pretty.html'), 'wb') as of:
of.write(html.tostring(root, pretty_print=True, encoding='utf-8',
include_meta_content_type=True))
root = html.fromstring(alltext.decode('utf-8'))
with open(os.path.join(ddir, 'pretty.html'), 'wb') as of:
of.write(html.tostring(root, pretty_print=True, encoding='utf-8',
include_meta_content_type=True))
if f.index_header is not None:
f.index_record.alltext = alltext
@ -1295,13 +833,7 @@ def inspect_mobi(path_or_stream, ddir=None): # {{{
rec.dump(tdir)
print ('Debug data saved to:', ddir)
# }}}
def main():
inspect_mobi(sys.argv[1])
if __name__ == '__main__':
main()
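As a worked illustration of the EXTH layout handled by the classes in the file above (a sketch, not part of the commit): the EXTH block begins 16 bytes past the MOBI header length inside PDB record 0, and each record is a (type, length, data) triple. record0_raw is an assumed variable holding the raw bytes of record 0; struct is already imported by the module.

header_length, = struct.unpack(b'>I', record0_raw[20:24])    # MOBI header length field
exth = EXTHHeader(record0_raw[16 + header_length:])          # same offset MOBIHeader uses
for rec in exth.records:
    print (rec)                                              # e.g. author (100): 'Some Author'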

View File

@ -0,0 +1,62 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import sys, os
from calibre.ebooks.mobi.debug.headers import TextRecord
class MOBIFile(object):
def __init__(self, mf):
self.mf = mf
h, h8 = mf.mobi_header, mf.mobi8_header
first_text_record = 1
offset = 0
res_end = len(mf.records)
if mf.kf8_type == 'joint':
offset = h.exth.kf8_header_index
res_end = offset - 1
self.resource_records = mf.records[h.first_non_book_record:res_end]
self.text_records = [TextRecord(i, r, h8.extra_data_flags,
mf.decompress8) for i, r in
enumerate(mf.records[first_text_record+offset:
first_text_record+offset+h8.number_of_text_records])]
self.raw_text = b''.join(r.raw for r in self.text_records)
def print_header(self, f=sys.stdout):
print (str(self.mf.palmdb).encode('utf-8'), file=f)
print (file=f)
print ('Record headers:', file=f)
for i, r in enumerate(self.mf.records):
print ('%6d. %s'%(i, r.header), file=f)
print (file=f)
print (str(self.mf.mobi8_header).encode('utf-8'), file=f)
def inspect_mobi(mobi_file, ddir):
f = MOBIFile(mobi_file)
with open(os.path.join(ddir, 'header.txt'), 'wb') as out:
f.print_header(f=out)
alltext = os.path.join(ddir, 'raw_text.html')
with open(alltext, 'wb') as of:
of.write(f.raw_text)
for tdir, attr in [('text_records', 'text_records'), ('images',
'image_records'), ('binary', 'binary_records'), ('font',
'font_records')]:
tdir = os.path.join(ddir, tdir)
os.mkdir(tdir)
for rec in getattr(f, attr, []):
rec.dump(tdir)
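For orientation, a sketch of the joint-file record layout the constructor above slices up (names follow the code; the fact that kf8_header_index is EXTH record 121 comes from the EXTH name table elsewhere in this commit):

# h: MOBI6 header, h8: KF8 header, mf: parsed container, as in __init__ above
offset = h.exth.kf8_header_index                              # EXTH 121 marks the KF8 boundary
resources = mf.records[h.first_non_book_record:offset - 1]    # shared images/fonts
first = 1 + offset                                            # the record at `offset` holds the KF8 MOBI header
kf8_text = mf.records[first:first + h8.number_of_text_records]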

View File

@ -186,20 +186,16 @@ class BookHeader(object):
if len(raw) >= 0xF8:
self.ncxidx, = struct.unpack_from(b'>L', raw, 0xF4)
if self.mobi_version >= 8:
self.skelidx, = struct.unpack_from('>L', raw, 0xFC)
# Index into <div> sections in raw_ml
self.dividx, = struct.unpack_from('>L', raw, 0xF8)
# Index into Other files
self.othidx, = struct.unpack_from('>L', raw, 0x104)
# Ancient PRC files from Baen can have random values for
# mobi_version, so be conservative
if self.mobi_version == 8 and len(raw) >= (0xF8 + 16):
self.dividx, self.skelidx, self.datpidx, self.othidx = \
struct.unpack_from(b'>4L', raw, 0xF8)
# need to use the FDST record to find out how to properly
# unpack the raw_ml into pieces; it is simply a table of start
# and end locations for each flow piece
self.fdstidx, = struct.unpack_from('>L', raw, 0xC0)
self.fdstcnt, = struct.unpack_from('>L', raw, 0xC4)
self.fdstidx, self.fdstcnt = struct.unpack_from(b'>2L', raw, 0xC0)
# if cnt is 1 or less, fdst section number can be garbage
if self.fdstcnt <= 1:
self.fdstidx = NULL_INDEX
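A small self-contained sketch of the header reads above; the offsets and field order are taken from this hunk, raw stands for the MOBI header record, and the NULL_INDEX value is an assumption:

import struct

NULL_INDEX = 0xffffffff   # 'no such record' sentinel; exact value assumed here

def read_kf8_indices(raw):
    # the four KF8 index fields are big-endian longs starting at 0xF8
    dividx, skelidx, datpidx, othidx = struct.unpack_from(b'>4L', raw, 0xF8)
    # FDST index and count live at 0xC0; with a count of 1 or less the index is garbage
    fdstidx, fdstcnt = struct.unpack_from(b'>2L', raw, 0xC0)
    if fdstcnt <= 1:
        fdstidx = NULL_INDEX
    return dividx, skelidx, datpidx, othidx, fdstidx, fdstcnt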

View File

@ -33,9 +33,11 @@ def update_internal_links(mobi8_reader):
for m in posfid_index_pattern.finditer(tag):
posfid = m.group(1)
offset = m.group(2)
filename, idtag = mr.get_id_tag_by_pos_fid(posfid, offset)
filename, idtag = mr.get_id_tag_by_pos_fid(int(posfid, 32),
int(offset, 32))
suffix = (b'#' + idtag) if idtag else b''
replacement = filename.encode(mr.header.codec) + suffix
replacement = filename.split('/')[-1].encode(
mr.header.codec) + suffix
tag = posfid_index_pattern.sub(replacement, tag, 1)
srcpieces[j] = tag
part = ''.join([x.decode(mr.header.codec) for x in srcpieces])
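The replacement above decodes the two captured link components as base-32 integers before handing them to get_id_tag_by_pos_fid; a tiny illustration with made-up values (the kindle:pos:fid:NNNN:off:NNNN link shape is inferred from the pattern name):

posfid, offset = '0004', '000A'               # as captured by posfid_index_pattern
row, off = int(posfid, 32), int(offset, 32)   # -> 4 and 10: row in the elems table, byte offset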

View File

@ -107,7 +107,10 @@ class MobiReader(object):
self.kf8_type = None
k8i = getattr(self.book_header.exth, 'kf8_header', None)
if self.book_header.mobi_version == 8:
# Ancient PRC files from Baen can have random values for
# mobi_version, so be conservative
if (self.book_header.mobi_version == 8 and hasattr(self.book_header,
'skelidx')):
self.kf8_type = 'standalone'
elif k8i is not None: # Check for joint mobi 6 and kf 8 file
try:
@ -118,12 +121,17 @@ class MobiReader(object):
try:
self.book_header = BookHeader(self.sections[k8i][0],
self.ident, user_encoding, self.log)
# The following are only correct in the Mobi 6
# header not the Mobi 8 header
# Only the first_image_index from the MOBI 6 header is
# useful
for x in ('first_image_index',):
setattr(self.book_header, x, getattr(bh, x))
# We need to do this because the MOBI 6 text extract code
# does not know anything about the kf8 offset
if hasattr(self.book_header, 'huff_offset'):
self.book_header.huff_offset += k8i
self.kf8_type = 'joint'
self.kf8_boundary = k8i-1
except:

View File

@ -230,11 +230,9 @@ class Mobi8Reader(object):
def get_id_tag_by_pos_fid(self, posfid, offset):
# first convert kindle:pos:fid and offset info to position in file
row = int(posfid, 32)
off = int(offset, 32)
[insertpos, idtext, filenum, seqnm, startpos, length] = self.elems[row]
pos = insertpos + off
fname = self.get_file_info(pos).filename
insertpos, idtext, filenum, seqnm, startpos, length = self.elems[posfid]
pos = insertpos + offset
fi = self.get_file_info(pos)
# an existing "id=" must exist in the original xhtml, otherwise it would not
# have worked for linking. Amazon seems to have added its own
# additional "aid=" inside tags whose contents seem to represent some
@ -243,7 +241,7 @@ class Mobi8Reader(object):
# so find the closest "id=" before the position in the file by actually
# searching in that file
idtext = self.get_id_tag(pos)
return fname, idtext
return '%s/%s'%(fi.type, fi.filename), idtext
def get_id_tag(self, pos):
# find the correct tag by actually searching in the destination
@ -254,12 +252,13 @@ class Mobi8Reader(object):
textblock = self.parts[fi.num]
id_map = []
npos = pos - fi.start
# if npos is inside a tag then search all the text before its end-of-tag
# marker
pgt = textblock.find(b'>', npos)
plt = textblock.find(b'<', npos)
if pgt < plt:
# if npos is inside a tag then search all the text before its end-of-tag marker,
# otherwise we are not in a tag and need to search the preceding tag
if plt == npos or pgt < plt:
npos = pgt + 1
textblock = textblock[0:npos]
# find id links only inside of tags
# inside any < > pair find all "id=' and return whatever is inside
# the quotes
@ -316,13 +315,18 @@ class Mobi8Reader(object):
# Add href and anchor info to the index entries
for entry in index_entries:
pos = entry['pos']
fi = self.get_file_info(pos)
#print (11111111, fi, entry['pos_fid'])
if fi.filename is None:
raise ValueError('Index entry has invalid pos: %d'%pos)
idtag = self.get_id_tag(pos).decode(self.header.codec)
entry['href'] = '%s/%s'%(fi.type, fi.filename)
pos_fid = entry['pos_fid']
if pos_fid is None:
pos = entry['pos']
fi = self.get_file_info(pos)
if fi.filename is None:
raise ValueError('Index entry has invalid pos: %d'%pos)
idtag = self.get_id_tag(pos).decode(self.header.codec)
href = '%s/%s'%(fi.type, fi.filename)
else:
href, idtag = self.get_id_tag_by_pos_fid(*pos_fid)
entry['href'] = href
entry['idtag'] = idtag
# Build the TOC object
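The comments above describe the id lookup informally: clamp the position to the end of any tag it falls inside, then take the nearest id= attribute that opens before it, looking only inside <...> pairs. A rough, hypothetical sketch of that idea (not the actual get_id_tag implementation):

import re

id_pat = re.compile(br'''<[^>]+\sid\s*=\s*['"]([^'"]+)['"]''')

def closest_id_before(textblock, pos):
    # last id= that opens before pos, or None if the block has none
    last = None
    for m in id_pat.finditer(textblock, 0, pos):
        last = m.group(1)
    return last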

View File

@ -109,6 +109,7 @@ class RTFMLizer(object):
if item.spine_position is None:
stylizer = Stylizer(item.data, item.href, self.oeb_book,
self.opts, self.opts.output_profile)
self.currently_dumping_item = item
output += self.dump_text(item.data.find(XHTML('body')), stylizer)
output += '{\\page }'
for item in self.oeb_book.spine:
@ -118,6 +119,7 @@ class RTFMLizer(object):
content = self.remove_tabs(content)
content = etree.fromstring(content)
stylizer = Stylizer(content, item.href, self.oeb_book, self.opts, self.opts.output_profile)
self.currently_dumping_item = item
output += self.dump_text(content.find(XHTML('body')), stylizer)
output += '{\\page }'
output += self.footer()
@ -160,9 +162,15 @@ class RTFMLizer(object):
for item in self.oeb_book.manifest:
if item.media_type in OEB_RASTER_IMAGES:
src = os.path.basename(item.href)
data, width, height = self.image_to_hexstring(item.data)
text = text.replace('SPECIAL_IMAGE-%s-REPLACE_ME' % src, '\n\n{\\*\\shppict{\\pict\\picw%i\\pich%i\\jpegblip \n%s\n}}\n\n' % (width, height, data))
src = item.href
try:
data, width, height = self.image_to_hexstring(item.data)
except:
self.log.warn('Image %s is corrupted, ignoring'%item.href)
repl = '\n\n'
else:
repl = '\n\n{\\*\\shppict{\\pict\\jpegblip\\picw%i\\pich%i \n%s\n}}\n\n' % (width, height, data)
text = text.replace('SPECIAL_IMAGE-%s-REPLACE_ME' % src, repl)
return text
def image_to_hexstring(self, data):
@ -205,7 +213,8 @@ class RTFMLizer(object):
return text
def dump_text(self, elem, stylizer, tag_stack=[]):
from calibre.ebooks.oeb.base import XHTML_NS, namespace, barename
from calibre.ebooks.oeb.base import (XHTML_NS, namespace, barename,
urlnormalize)
if not isinstance(elem.tag, basestring) \
or namespace(elem.tag) != XHTML_NS:
@ -236,7 +245,7 @@ class RTFMLizer(object):
if tag == 'img':
src = elem.get('src')
if src:
src = os.path.basename(elem.get('src'))
src = urlnormalize(self.currently_dumping_item.abshref(src))
block_start = ''
block_end = ''
if 'block' not in tag_stack:

View File

@ -70,6 +70,9 @@ class AddAction(InterfaceAction):
self.add_menu.addSeparator()
ma('add-formats', _('Add files to selected book records'),
triggered=self.add_formats, shortcut=_('Shift+A'))
self.add_menu.addSeparator()
ma('add-config', _('Configure the adding of books'),
triggered=self.add_config)
self.qaction.triggered.connect(self.add_books)
@ -78,6 +81,11 @@ class AddAction(InterfaceAction):
for action in list(self.add_menu.actions())[1:]:
action.setEnabled(enabled)
def add_config(self):
self.gui.iactions['Preferences'].do_config(
initial_plugin=('Import/Export', 'Adding'),
close_after_initial=True)
def add_formats(self, *args):
if self.gui.stack.currentIndex() != 0:
return

View File

@ -13,6 +13,7 @@ from calibre.gui2 import choose_dir, error_dialog, warning_dialog
from calibre.gui2.tools import generate_catalog
from calibre.utils.config import dynamic
from calibre.gui2.actions import InterfaceAction
from calibre import sanitize_file_name_unicode
class GenerateCatalogAction(InterfaceAction):
@ -89,7 +90,8 @@ class GenerateCatalogAction(InterfaceAction):
_('Select destination for %(title)s.%(fmt)s') % dict(
title=job.catalog_title, fmt=job.fmt.lower()))
if export_dir:
destination = os.path.join(export_dir, '%s.%s' % (job.catalog_title, job.fmt.lower()))
destination = os.path.join(export_dir, '%s.%s' % (
sanitize_file_name_unicode(job.catalog_title), job.fmt.lower()))
shutil.copyfile(job.catalog_file_path, destination)

View File

@ -13,7 +13,8 @@ from contextlib import closing
from PyQt4.Qt import QToolButton
from calibre.gui2.actions import InterfaceAction
from calibre.gui2 import error_dialog, Dispatcher, warning_dialog, gprefs
from calibre.gui2 import (error_dialog, Dispatcher, warning_dialog, gprefs,
info_dialog)
from calibre.gui2.dialogs.progress import ProgressDialog
from calibre.utils.config import prefs, tweaks
from calibre.utils.date import now
@ -30,6 +31,7 @@ class Worker(Thread): # {{{
self.progress = progress
self.done = done
self.delete_after = delete_after
self.auto_merged_ids = {}
def run(self):
try:
@ -79,6 +81,8 @@ class Worker(Thread): # {{{
if prefs['add_formats_to_existing']:
identical_book_list = newdb.find_identical_books(mi)
if identical_book_list: # books with same author and nearly same title exist in newdb
self.auto_merged_ids[x] = _('%(title)s by %(author)s')%\
dict(title=mi.title, author=mi.format_field('authors')[1])
automerged = True
seen_fmts = set()
for identical_book in identical_book_list:
@ -196,6 +200,15 @@ class CopyToLibraryAction(InterfaceAction):
self.gui.status_bar.show_message(
_('Copied %(num)d books to %(loc)s') %
dict(num=len(ids), loc=loc), 2000)
if self.worker.auto_merged_ids:
books = '\n'.join(self.worker.auto_merged_ids.itervalues())
info_dialog(self.gui, _('Auto merged'),
_('Some books were automatically merged into existing '
'records in the target library. Click Show '
'details to see which ones. This behavior is '
'controlled by the Auto merge option in '
'Preferences->Adding books.'), det_msg=books,
show=True)
if delete_after and self.worker.processed:
v = self.gui.library_view
ci = v.currentIndex()

View File

@ -9,10 +9,10 @@ import re, os
from lxml import html
from PyQt4.Qt import QApplication, QFontInfo, QSize, QWidget, QPlainTextEdit, \
QToolBar, QVBoxLayout, QAction, QIcon, Qt, QTabWidget, QUrl, \
QSyntaxHighlighter, QColor, QChar, QColorDialog, QMenu, QInputDialog, \
QHBoxLayout
from PyQt4.Qt import (QApplication, QFontInfo, QSize, QWidget, QPlainTextEdit,
QToolBar, QVBoxLayout, QAction, QIcon, Qt, QTabWidget, QUrl,
QSyntaxHighlighter, QColor, QChar, QColorDialog, QMenu, QInputDialog,
QHBoxLayout, QKeySequence)
from PyQt4.QtWebKit import QWebView, QWebPage
from calibre.ebooks.chardet import xml_to_unicode
@ -32,6 +32,7 @@ class PageAction(QAction): # {{{
type=Qt.QueuedConnection)
self.page_action.changed.connect(self.update_state,
type=Qt.QueuedConnection)
self.update_state()
@property
def page_action(self):
@ -66,6 +67,12 @@ class EditorWidget(QWebView): # {{{
self.comments_pat = re.compile(r'<!--.*?-->', re.DOTALL)
extra_shortcuts = {
'ToggleBold': 'Bold',
'ToggleItalic': 'Italic',
'ToggleUnderline': 'Underline',
}
for wac, name, icon, text, checkable in [
('ToggleBold', 'bold', 'format-text-bold', _('Bold'), True),
('ToggleItalic', 'italic', 'format-text-italic', _('Italic'),
@ -106,6 +113,9 @@ class EditorWidget(QWebView): # {{{
]:
ac = PageAction(wac, icon, text, checkable, self)
setattr(self, 'action_'+name, ac)
ss = extra_shortcuts.get(wac, None)
if ss:
ac.setShortcut(QKeySequence(getattr(QKeySequence, ss)))
self.action_color = QAction(QIcon(I('format-text-color')), _('Foreground color'),
self)

View File

@ -6,8 +6,8 @@ __copyright__ = '2011, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
from PyQt4.Qt import QLineEdit, QAbstractListModel, Qt, \
QApplication, QCompleter
from PyQt4.Qt import (QLineEdit, QAbstractListModel, Qt,
QApplication, QCompleter, QMetaObject)
from calibre.utils.icu import sort_key, lower
from calibre.gui2 import NONE
@ -182,14 +182,27 @@ class MultiCompleteComboBox(EnComboBox):
def set_add_separator(self, what):
self.lineEdit().set_add_separator(what)
def show_initial_value(self, what):
'''
Show an initial value. Handle the case of a blank initial value
correctly (on Qt 4.8.0, a blank value causes the first value from the
completer to be shown when the event loop runs).
'''
what = unicode(what)
le = self.lineEdit()
if not what.strip():
QMetaObject.invokeMethod(self, 'clearEditText',
Qt.QueuedConnection)
else:
self.setEditText(what)
le.selectAll()
if __name__ == '__main__':
from PyQt4.Qt import QDialog, QVBoxLayout
app = QApplication([])
d = QDialog()
d.setLayout(QVBoxLayout())
le = MultiCompleteLineEdit(d)
le = MultiCompleteComboBox(d)
d.layout().addWidget(le)
le.all_items = ['one', 'otwo', 'othree', 'ooone', 'ootwo', 'oothree']
d.exec_()

View File

@ -128,8 +128,7 @@ class TextDelegate(QStyledItemDelegate): # {{{
for item in sorted(complete_items, key=sort_key):
editor.addItem(item)
ct = index.data(Qt.DisplayRole).toString()
editor.setEditText(ct)
editor.lineEdit().selectAll()
editor.show_initial_value(ct)
else:
editor = EnLineEdit(parent)
return editor
@ -170,8 +169,7 @@ class CompleteDelegate(QStyledItemDelegate): # {{{
for item in sorted(all_items, key=sort_key):
editor.addItem(item)
ct = index.data(Qt.DisplayRole).toString()
editor.setEditText(ct)
editor.lineEdit().selectAll()
editor.show_initial_value(ct)
else:
editor = EnLineEdit(parent)
return editor
@ -190,8 +188,7 @@ class LanguagesDelegate(QStyledItemDelegate): # {{{
editor = LanguagesEdit(parent=parent)
editor.init_langs(index.model().db)
ct = index.data(Qt.DisplayRole).toString()
editor.setEditText(ct)
editor.lineEdit().selectAll()
editor.show_initial_value(ct)
return editor
def setModelData(self, editor, model, index):

View File

@ -882,6 +882,11 @@ class FullFetch(QDialog): # {{{
self.covers_widget.chosen.connect(self.ok_clicked)
self.stack.addWidget(self.covers_widget)
# Workaround for Qt 4.8.0 bug that causes the frame of the window to go
# off the top of the screen if a max height is not set for the
# QWebView. Seems to only happen on windows, but keep it for all
# platforms just in case.
self.identify_widget.comments_view.setMaximumHeight(500)
self.resize(850, 550)
self.finished.connect(self.cleanup)

View File

@ -29,4 +29,4 @@ class SearchResult(object):
self.plugin_author = ''
def __eq__(self, other):
return self.title == other.title and self.author == other.author and self.store_name == other.store_name
return self.title == other.title and self.author == other.author and self.store_name == other.store_name and self.formats == other.formats

View File

@ -41,7 +41,9 @@ class AmazonDEKindleStore(StorePlugin):
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
@ -65,8 +67,8 @@ class AmazonDEKindleStore(StorePlugin):
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//div[@class="title"]/a/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span/text()'))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//span[@class="price"]/text()'))
author = ''.join(data.xpath('.//div[@class="title"]/span[@class="ptBrand"]/text()'))
if author.startswith('von '):

View File

@ -37,7 +37,9 @@ class AmazonESKindleStore(StorePlugin):
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
@ -61,8 +63,8 @@ class AmazonESKindleStore(StorePlugin):
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//div[@class="title"]/a/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span/text()'))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//span[@class="price"]/text()'))
author = unicode(''.join(data.xpath('.//div[@class="title"]/span[@class="ptBrand"]/text()')))
if author.startswith('de '):
author = author[3:]
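All of the Amazon store plugins touched by this commit move to the same markup: the title now sits in an <a class="title"> and the price in a <span class="price">. A self-contained sketch of that extraction against a made-up snippet:

from __future__ import print_function
from lxml import html

snippet = ('<div class="result product">'
           '<a class="title">Example Title</a>'
           '<span class="price">EUR 4,99</span></div>')
doc = html.fromstring(snippet)
for data in doc.xpath('//div[contains(@class, "result") and contains(@class, "product")]'):
    title = ''.join(data.xpath('.//a[@class="title"]/text()'))
    price = ''.join(data.xpath('.//span[@class="price"]/text()'))
    print(title, '-', price)   # Example Title - EUR 4,99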

View File

@ -39,7 +39,7 @@ class AmazonFRKindleStore(StorePlugin):
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon.fr is responding in UTF-8 now
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
@ -64,8 +64,8 @@ class AmazonFRKindleStore(StorePlugin):
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//div[@class="title"]/a/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span/text()'))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//span[@class="price"]/text()'))
author = unicode(''.join(data.xpath('.//div[@class="title"]/span[@class="ptBrand"]/text()')))
if author.startswith('de '):
author = author[3:]

View File

@ -37,7 +37,9 @@ class AmazonITKindleStore(StorePlugin):
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
@ -61,8 +63,8 @@ class AmazonITKindleStore(StorePlugin):
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//div[@class="title"]/a/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span/text()'))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//span[@class="price"]/text()'))
author = unicode(''.join(data.xpath('.//div[@class="title"]/span[@class="ptBrand"]/text()')))
if author.startswith('di '):
author = author[3:]

View File

@ -38,7 +38,8 @@ class AmazonUKKindleStore(StorePlugin):
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
@ -62,8 +63,8 @@ class AmazonUKKindleStore(StorePlugin):
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//div[@class="title"]/a/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span/text()'))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//span[@class="price"]/text()'))
author = ''.join(data.xpath('.//div[@class="title"]/span[@class="ptBrand"]/text()'))
if author.startswith('by '):

View File

@ -62,7 +62,7 @@ class BNStore(BasicStoreConfig, StorePlugin):
title = ''.join(data.xpath('.//p[@class="title"]//span[@class="name"]/text()'))
author = ', '.join(data.xpath('.//ul[@class="contributors"]//li[position()>1]//a/text()'))
price = ''.join(data.xpath('.//table[@class="displayed-formats"]//a[@class="subtle"]/text()'))
price = ''.join(data.xpath('.//table[@class="displayed-formats"]//a[contains(@class, "bn-price")]/text()'))
counter -= 1

View File

@ -7,7 +7,7 @@ __copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
import random
import urllib2
import urllib
from contextlib import closing
from lxml import html
@ -22,7 +22,7 @@ from calibre.gui2.store.search_result import SearchResult
from calibre.gui2.store.web_store_dialog import WebStoreDialog
class DieselEbooksStore(BasicStoreConfig, StorePlugin):
def open(self, parent=None, detail_item=None, external=False):
url = 'http://www.diesel-ebooks.com/'
@ -33,7 +33,7 @@ class DieselEbooksStore(BasicStoreConfig, StorePlugin):
detail_url = None
if detail_item:
detail_url = url + detail_item + aff_id
detail_url = detail_item + aff_id
url = url + aff_id
if external or self.config.get('open_external', False):
@ -45,54 +45,46 @@ class DieselEbooksStore(BasicStoreConfig, StorePlugin):
d.exec_()
def search(self, query, max_results=10, timeout=60):
url = 'http://www.diesel-ebooks.com/index.php?page=seek&id[m]=&id[c]=scope%253Dinventory&id[q]=' + urllib2.quote(query)
url = 'http://www.diesel-ebooks.com/index.php?page=seek&id[m]=&id[c]=scope%253Dinventory&id[q]=' + urllib.quote_plus(query)
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read())
for data in doc.xpath('//div[@class="item clearfix"]'):
data = html.fromstring(html.tostring(data))
for data in doc.xpath('//div[contains(@class, "item")]'):
if counter <= 0:
break
id = ''.join(data.xpath('div[@class="cover"]/a/@href'))
if not id or '/item/' not in id:
continue
a, b, id = id.partition('/item/')
cover_url = ''.join(data.xpath('div[@class="cover"]//img/@src'))
title = ''.join(data.xpath('.//div[@class="content"]//h2/text()'))
author = ''.join(data.xpath('//div[@class="content"]//div[@class="author"]/a/text()'))
title = ''.join(data.xpath('.//div[@class="content"]//h2/a/text()'))
author = ''.join(data.xpath('.//div[@class="content"]/span//a/text()'))
price = ''
price_elem = data.xpath('//td[@class="price"]/text()')
price_elem = data.xpath('.//div[@class="price_fat"]//h1/text()')
if price_elem:
price = price_elem[0]
formats = ', '.join(data.xpath('.//td[@class="format"]/text()'))
formats = ', '.join(data.xpath('.//div[@class="book-info"]//text()')).strip()
a, b, formats = formats.partition('Format:')
drm = SearchResult.DRM_LOCKED
if 'drm free' not in formats.lower():
drm = SearchResult.DRM_UNLOCKED
counter -= 1
s = SearchResult()
s.cover_url = cover_url
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = '/item/' + id.strip()
s.detail_item = id.strip()
s.formats = formats
yield s
s.drm = drm
def get_details(self, search_result, timeout):
url = 'http://www.diesel-ebooks.com/item/'
br = browser()
with closing(br.open(url + search_result.detail_item, timeout=timeout)) as nf:
idata = html.fromstring(nf.read())
if idata.xpath('boolean(//table[@class="format-info"]//tr[contains(th, "DRM") and contains(td, "No")])'):
search_result.drm = SearchResult.DRM_UNLOCKED
else:
search_result.drm = SearchResult.DRM_LOCKED
return True
yield s

View File

@ -60,10 +60,6 @@ class FoylesUKStore(BasicStoreConfig, StorePlugin):
continue
cover_url = ''.join(data.xpath('.//a[@class="Jacket"]/img/@src'))
if cover_url:
cover_url = 'http://www.foyles.co.uk' + cover_url
#print(cover_url)
title = ''.join(data.xpath('.//a[@class="Title"]/text()'))
author = ', '.join(data.xpath('.//span[@class="Author"]/text()'))
price = ''.join(data.xpath('./ul/li[@class="Strong"]/text()'))

View File

@ -68,7 +68,7 @@ class KoboStore(BasicStoreConfig, StorePlugin):
cover_url = ''.join(data.xpath('.//div[@class="SearchImageContainer"]//img[1]/@src'))
title = ''.join(data.xpath('.//div[@class="SCItemHeader"]/h1/a[1]/text()'))
author = ''.join(data.xpath('.//div[@class="SCItemSummary"]/span/a[1]/text()'))
author = ', '.join(data.xpath('.//div[@class="SCItemSummary"]//span//a/text()'))
drm = data.xpath('boolean(.//span[@class="SCAvailibilityFormatsText" and contains(text(), "DRM")])')
counter -= 1

View File

@ -57,7 +57,7 @@ class WaterstonesUKStore(BasicStoreConfig, StorePlugin):
cover_url = ''.join(data.xpath('.//div[@class="image"]/a/img/@src'))
title = ''.join(data.xpath('./div/div/h2/a/text()'))
author = ', '.join(data.xpath('.//p[@class="byAuthor"]/a/text()'))
price = ''.join(data.xpath('.//p[@class="price"]/span[@class="priceStandard"]/text()'))
price = ''.join(data.xpath('.//p[@class="price"]/span[@class="priceRed2"]/text()'))
drm = data.xpath('boolean(.//td[@headers="productFormat" and contains(., "DRM")])')
pdf = data.xpath('boolean(.//td[@headers="productFormat" and contains(., "PDF")])')
epub = data.xpath('boolean(.//td[@headers="productFormat" and contains(., "EPUB")])')

View File

@ -1,118 +0,0 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import, print_function)
__license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
import urllib
from contextlib import closing
from lxml import html
from PyQt4.Qt import QUrl
from calibre import browser, url_slash_cleaner
from calibre.gui2 import open_url
from calibre.gui2.store import StorePlugin
from calibre.gui2.store.basic_config import BasicStoreConfig
from calibre.gui2.store.search_result import SearchResult
from calibre.gui2.store.web_store_dialog import WebStoreDialog
class WizardsTowerBooksStore(BasicStoreConfig, StorePlugin):
url = 'http://www.wizardstowerbooks.com/'
def open(self, parent=None, detail_item=None, external=False):
if detail_item:
detail_item = self.url + detail_item
if external or self.config.get('open_external', False):
open_url(QUrl(url_slash_cleaner(detail_item)))
else:
d = WebStoreDialog(self.gui, self.url, parent, detail_item)
d.setWindowTitle(self.name)
d.set_tags(self.config.get('tags', ''))
d.exec_()
def search(self, query, max_results=10, timeout=60):
url = 'http://www.wizardstowerbooks.com/search.html?for=' + urllib.quote(query)
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read())
if 'search.html' in f.geturl():
for data in doc.xpath('//table[@class="gridp"]//td'):
if counter <= 0:
break
id = ''.join(data.xpath('.//span[@class="prti"]/a/@href'))
id = id.strip()
if not id:
continue
cover_url = ''.join(data.xpath('.//div[@class="prim"]/a/img/@src'))
cover_url = url_slash_cleaner(self.url + cover_url.strip())
price = ''.join(data.xpath('.//font[@class="selling_price"]//text()'))
price = price.strip()
if not price:
continue
title = ''.join(data.xpath('.//span[@class="prti"]/a/b/text()'))
author = ''.join(data.xpath('.//p[@class="last"]/text()'))
a, b, author = author.partition(' by ')
counter -= 1
s = SearchResult()
s.cover_url = cover_url
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = id.strip()
s.drm = SearchResult.DRM_UNLOCKED
yield s
# Exact match brought us to the books detail page.
else:
s = SearchResult()
cover_url = ''.join(doc.xpath('//div[@id="image"]/a/img[@title="Zoom"]/@src')).strip()
s.cover_url = url_slash_cleaner(self.url + cover_url.strip())
s.title = ''.join(doc.xpath('//form[@name="details"]/h1/text()')).strip()
authors = doc.xpath('//p[contains(., "Author:")]//text()')
author_index = None
for i, a in enumerate(authors):
if 'author' in a.lower():
author_index = i + 1
break
if author_index is not None and len(authors) > author_index:
a = authors[author_index]
a = a.replace(u'\xa0', '')
s.author = a.strip()
s.price = ''.join(doc.xpath('//span[@id="price_selling"]//text()')).strip()
s.detail_item = f.geturl().replace(self.url, '').strip()
s.formats = ', '.join(doc.xpath('//select[@id="N1_"]//option//text()'))
s.drm = SearchResult.DRM_UNLOCKED
yield s
def get_details(self, search_result, timeout):
if search_result.formats:
return False
br = browser()
with closing(br.open(url_slash_cleaner(self.url + search_result.detail_item), timeout=timeout)) as nf:
idata = html.fromstring(nf.read())
formats = ', '.join(idata.xpath('//select[@id="N1_"]//option//text()'))
search_result.formats = formats.upper()
return True

View File

@ -80,6 +80,7 @@ class WoblinkStore(BasicStoreConfig, StorePlugin):
s.author = author.strip()
s.price = price + ''
s.detail_item = id.strip()
# MOBI should be sent first,
if 'MOBI' in formats:
t = SearchResult()
@ -91,12 +92,21 @@ class WoblinkStore(BasicStoreConfig, StorePlugin):
t.drm = SearchResult.DRM_UNLOCKED
t.formats = 'MOBI'
formats.remove('MOBI')
counter -= 1
yield t
# and the remaining formats (if any) next
if formats:
s = SearchResult()
s.cover_url = 'http://woblink.com' + cover_url
s.title = title.strip()
s.author = author.strip()
s.price = price + ''
s.detail_item = id.strip()
s.drm = SearchResult.DRM_LOCKED
s.formats = ', '.join(formats)
counter -= 1
yield s

View File

@ -1170,6 +1170,8 @@ class TagsModel(QAbstractItemModel): # {{{
charclass = ''.join(letters_seen)
if k == 'author_sort':
expr = r'%s:"~(^[%s])|(&\s*[%s])"'%(k, charclass, charclass)
elif k == 'series':
expr = r'series_sort:"~^[%s]"'%(charclass)
else:
expr = r'%s:"~^[%s]"'%(k, charclass)
if node_searches[tag_item.tag.state] == 'true':

View File

@ -255,7 +255,10 @@
</widget>
</item>
<item row="3" column="1">
<widget class="QSpinBox" name="max_view_width">
<widget class="QSpinBox" name="max_fs_width">
<property name="toolTip">
<string>Set the maximum width that the book's text and pictures will take when in fullscreen mode. This allows you to read the book text without it becoming too wide.</string>
</property>
<property name="suffix">
<string> px</string>
</property>
@ -270,10 +273,10 @@
<item row="3" column="0">
<widget class="QLabel" name="label_7">
<property name="text">
<string>Maximum &amp;view width:</string>
<string>Maximum text width in &amp;fullscreen:</string>
</property>
<property name="buddy">
<cstring>max_view_width</cstring>
<cstring>max_fs_width</cstring>
</property>
</widget>
</item>
@ -350,7 +353,7 @@
<tabstop>serif_family</tabstop>
<tabstop>sans_family</tabstop>
<tabstop>mono_family</tabstop>
<tabstop>max_view_width</tabstop>
<tabstop>max_fs_width</tabstop>
<tabstop>opt_remember_window_size</tabstop>
<tabstop>buttonBox</tabstop>
</tabstops>

View File

@ -8,11 +8,11 @@ import os, math, re, glob, sys, zipfile
from base64 import b64encode
from functools import partial
from PyQt4.Qt import (QSize, QSizePolicy, QUrl, SIGNAL, Qt, QTimer,
from PyQt4.Qt import (QSize, QSizePolicy, QUrl, SIGNAL, Qt,
QPainter, QPalette, QBrush, QFontDatabase, QDialog,
QColor, QPoint, QImage, QRegion, QVariant, QIcon,
QFont, pyqtSignature, QAction, QByteArray, QMenu,
pyqtSignal, QSwipeGesture)
pyqtSignal, QSwipeGesture, QApplication)
from PyQt4.QtWebKit import QWebPage, QWebView, QWebSettings
from calibre.utils.config import Config, StringConfig
@ -46,8 +46,10 @@ def config(defaults=None):
help=_('Remember last used window size'))
c.add_opt('user_css', default='',
help=_('Set the user CSS stylesheet. This can be used to customize the look of all books.'))
c.add_opt('max_view_width', default=6000,
help=_('Maximum width of the viewer window, in pixels.'))
c.add_opt('max_fs_width', default=800,
help=_("Set the maximum width that the book's text and pictures will take"
" when in fullscreen mode. This allows you to read the book text"
" without it becoming too wide."))
c.add_opt('fit_images', default=True,
help=_('Resize images larger than the viewer window to fit inside it'))
c.add_opt('hyphenate', default=False, help=_('Hyphenate text'))
@ -101,7 +103,7 @@ class ConfigDialog(QDialog, Ui_Dialog):
self.standard_font.setCurrentIndex({'serif':0, 'sans':1, 'mono':2}[opts.standard_font])
self.css.setPlainText(opts.user_css)
self.css.setToolTip(_('Set the user CSS stylesheet. This can be used to customize the look of all books.'))
self.max_view_width.setValue(opts.max_view_width)
self.max_fs_width.setValue(opts.max_fs_width)
with zipfile.ZipFile(P('viewer/hyphenate/patterns.zip',
allow_user_override=False), 'r') as zf:
pats = [x.split('.')[0].replace('-', '_') for x in zf.namelist()]
@ -144,7 +146,7 @@ class ConfigDialog(QDialog, Ui_Dialog):
c.set('user_css', unicode(self.css.toPlainText()))
c.set('remember_window_size', self.opt_remember_window_size.isChecked())
c.set('fit_images', self.opt_fit_images.isChecked())
c.set('max_view_width', int(self.max_view_width.value()))
c.set('max_fs_width', int(self.max_fs_width.value()))
c.set('hyphenate', self.hyphenate.isChecked())
c.set('remember_current_page', self.opt_remember_current_page.isChecked())
c.set('wheel_flips_pages', self.opt_wheel_flips_pages.isChecked())
@ -182,16 +184,16 @@ class Document(QWebPage): # {{{
self.misc_config()
self.after_load()
def __init__(self, shortcuts, parent=None, resize_callback=lambda: None,
debug_javascript=False):
def __init__(self, shortcuts, parent=None, debug_javascript=False):
QWebPage.__init__(self, parent)
self.setObjectName("py_bridge")
self.debug_javascript = debug_javascript
self.resize_callback = resize_callback
self.current_language = None
self.loaded_javascript = False
self.js_loader = JavaScriptLoader(
dynamic_coffeescript=self.debug_javascript)
self.initial_left_margin = self.initial_right_margin = u''
self.in_fullscreen_mode = False
self.setLinkDelegationPolicy(self.DelegateAllLinks)
self.scroll_marks = []
@ -239,6 +241,9 @@ class Document(QWebPage): # {{{
self.enable_page_flip = self.page_flip_duration > 0.1
self.font_magnification_step = opts.font_magnification_step
self.wheel_flips_pages = opts.wheel_flips_pages
screen_width = QApplication.desktop().screenGeometry().width()
# Leave some space for the scrollbar and some border
self.max_fs_width = min(opts.max_fs_width, screen_width-50)
def fit_images(self):
if self.do_fit_images:
@ -252,12 +257,6 @@ class Document(QWebPage): # {{{
if self.loaded_javascript:
return
self.loaded_javascript = True
self.javascript(
'''
window.onresize = function(event) {
window.py_bridge.window_resized();
}
''')
self.loaded_lang = self.js_loader(self.mainFrame().evaluateJavaScript,
self.current_language, self.hyphenate_default_lang)
@ -274,15 +273,35 @@ class Document(QWebPage): # {{{
self.set_bottom_padding(0)
self.fit_images()
self.init_hyphenate()
self.initial_left_margin = unicode(self.javascript(
'document.body.style.marginLeft').toString())
self.initial_right_margin = unicode(self.javascript(
'document.body.style.marginRight').toString())
if self.in_fullscreen_mode:
self.switch_to_fullscreen_mode()
def switch_to_fullscreen_mode(self):
self.in_fullscreen_mode = True
self.javascript('''
var s = document.body.style;
s.maxWidth = "%dpx";
s.marginLeft = "auto";
s.marginRight = "auto";
'''%self.max_fs_width)
def switch_to_window_mode(self):
self.in_fullscreen_mode = False
self.javascript('''
var s = document.body.style;
s.maxWidth = "none";
s.marginLeft = "%s";
s.marginRight = "%s";
'''%(self.initial_left_margin, self.initial_right_margin))
@pyqtSignature("QString")
def debug(self, msg):
prints(msg)
@pyqtSignature('')
def window_resized(self):
self.resize_callback()
def reference_mode(self, enable):
self.javascript(('enter' if enable else 'leave')+'_reference_mode()')
@ -413,7 +432,7 @@ class Document(QWebPage): # {{{
def scroll_fraction(self):
def fget(self):
try:
return float(self.ypos)/(self.height-self.window_height)
return abs(float(self.ypos)/(self.height-self.window_height))
except ZeroDivisionError:
return 0.
def fset(self, val):
@ -485,7 +504,6 @@ class DocumentView(QWebView): # {{{
self.initial_pos = 0.0
self.to_bottom = False
self.document = Document(self.shortcuts, parent=self,
resize_callback=self.viewport_resized,
debug_javascript=debug_javascript)
self.setPage(self.document)
self.manager = None
@ -581,8 +599,8 @@ class DocumentView(QWebView): # {{{
def config(self, parent=None):
self.document.do_config(parent)
if self.manager is not None:
self.manager.set_max_width()
if self.document.in_fullscreen_mode:
self.document.switch_to_fullscreen_mode()
self.setFocus(Qt.OtherFocusReason)
def bookmark(self):
@ -602,6 +620,9 @@ class DocumentView(QWebView): # {{{
menu.insertAction(list(menu.actions())[0], self.search_action)
menu.addSeparator()
menu.addAction(self.goto_location_action)
if self.document.in_fullscreen_mode and self.manager is not None:
menu.addSeparator()
menu.addAction(self.manager.toggle_toolbar_action)
menu.exec_(ev.globalPos())
def lookup(self, *args):
@ -688,6 +709,7 @@ class DocumentView(QWebView): # {{{
if self.manager is not None:
self.manager.load_started()
self.loading_url = QUrl.fromLocalFile(path)
html = re.sub(ur'<\s*title\s*/\s*>', u'', html, flags=re.IGNORECASE)
if has_svg:
self.setContent(QByteArray(html.encode(path.encoding)), mt, QUrl.fromLocalFile(path))
else:
@ -1001,13 +1023,9 @@ class DocumentView(QWebView): # {{{
return handled
def resizeEvent(self, event):
ret = QWebView.resizeEvent(self, event)
QTimer.singleShot(10, self.initialize_scrollbar)
return ret
def viewport_resized(self):
if self.manager is not None:
self.manager.viewport_resized(self.scroll_fraction)
self.manager.viewport_resize_started(event)
return QWebView.resizeEvent(self, event)
def event(self, ev):
if ev.type() == ev.Gesture:

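From outside the viewer module, the new option is read through the same config() helper patched above; a short sketch (the 800px default and the ~50px clamp mirror this hunk, and a running QApplication is assumed):

from PyQt4.Qt import QApplication
from calibre.gui2.viewer.documentview import config

opts = config().parse()                                       # opts.max_fs_width defaults to 800
screen_width = QApplication.desktop().screenGeometry().width()
effective_width = min(opts.max_fs_width, screen_width - 50)   # leave room for scrollbar and border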
View File

@ -5,11 +5,11 @@ import traceback, os, sys, functools, collections, re
from functools import partial
from threading import Thread
from PyQt4.Qt import QApplication, Qt, QIcon, QTimer, SIGNAL, QByteArray, \
QDoubleSpinBox, QLabel, QTextBrowser, \
QPainter, QBrush, QColor, QStandardItemModel, QPalette, \
QStandardItem, QUrl, QRegExpValidator, QRegExp, QLineEdit, \
QToolButton, QMenu, QInputDialog, QAction, QKeySequence
from PyQt4.Qt import (QApplication, Qt, QIcon, QTimer, SIGNAL, QByteArray,
QSize, QDoubleSpinBox, QLabel, QTextBrowser, QPropertyAnimation,
QPainter, QBrush, QColor, QStandardItemModel, QPalette, QStandardItem,
QUrl, QRegExpValidator, QRegExp, QLineEdit, QToolButton, QMenu,
QInputDialog, QAction, QKeySequence)
from calibre.gui2.viewer.main_ui import Ui_EbookViewer
from calibre.gui2.viewer.printing import Printing
@ -55,8 +55,6 @@ class TOC(QStandardItemModel):
self.appendRow(TOCItem(t))
self.setHorizontalHeaderItem(0, QStandardItem(_('Table of Contents')))
class Worker(Thread):
def run(self):
@ -226,6 +224,10 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
self.toc.setVisible(False)
self.action_quit = QAction(self)
self.addAction(self.action_quit)
self.view_resized_timer = QTimer(self)
self.view_resized_timer.timeout.connect(self.viewport_resize_finished)
self.view_resized_timer.setSingleShot(True)
self.resize_in_progress = False
qs = [Qt.CTRL+Qt.Key_Q]
if isosx:
qs += [Qt.CTRL+Qt.Key_W]
@ -266,6 +268,9 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
self.connect(self.action_full_screen, SIGNAL('triggered(bool)'),
self.toggle_fullscreen)
self.action_full_screen.setShortcuts([Qt.Key_F11, Qt.CTRL+Qt.SHIFT+Qt.Key_F])
self.action_full_screen.setToolTip(_('Toggle full screen (%s)') %
_(' or ').join([unicode(x.toString(x.NativeText)) for x in
self.action_full_screen.shortcuts()]))
self.connect(self.action_back, SIGNAL('triggered(bool)'), self.back)
self.connect(self.action_bookmark, SIGNAL('triggered(bool)'), self.bookmark)
self.connect(self.action_forward, SIGNAL('triggered(bool)'), self.forward)
@ -292,6 +297,38 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
self.tool_bar2.setContextMenuPolicy(Qt.PreventContextMenu)
self.tool_bar.widgetForAction(self.action_bookmark).setPopupMode(QToolButton.MenuButtonPopup)
self.action_full_screen.setCheckable(True)
self.full_screen_label = QLabel('''
<center>
<h1>%s</h1>
<h3>%s</h3>
<h3>%s</h3>
</center>
'''%(_('Full screen mode'),
_('Right click to show controls'),
_('Press Esc to quit')),
self)
self.full_screen_label.setVisible(False)
self.full_screen_label.setStyleSheet('''
QLabel {
text-align: center;
background-color: white;
color: black;
border-width: 1px;
border-style: solid;
border-radius: 20px;
}
''')
self.window_mode_changed = None
self.toggle_toolbar_action = QAction(_('Show/hide controls'), self)
self.toggle_toolbar_action.triggered.connect(self.toggle_toolbars)
self.addAction(self.toggle_toolbar_action)
self.full_screen_label_anim = QPropertyAnimation(
self.full_screen_label, 'size')
self.esc_full_screen_action = a = QAction(self)
self.addAction(a)
a.setShortcut(Qt.Key_Escape)
a.setEnabled(False)
a.triggered.connect(self.action_full_screen.trigger)
self.print_menu = QMenu()
self.print_menu.addAction(QIcon(I('print-preview.png')), _('Print Preview'))
@ -299,7 +336,6 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
self.tool_bar.widgetForAction(self.action_print).setPopupMode(QToolButton.MenuButtonPopup)
self.connect(self.action_print, SIGNAL("triggered(bool)"), partial(self.print_book, preview=False))
self.connect(self.print_menu.actions()[0], SIGNAL("triggered(bool)"), partial(self.print_book, preview=True))
self.set_max_width()
ca = self.view.copy_action
ca.setShortcut(QKeySequence.Copy)
self.addAction(ca)
@ -313,6 +349,13 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
w = self.tool_bar.widgetForAction(self.action_open_ebook)
w.setPopupMode(QToolButton.MenuButtonPopup)
for x in ('tool_bar', 'tool_bar2'):
x = getattr(self, x)
for action in x.actions():
# So that the keyboard shortcuts for these actions will
# continue to function even when the toolbars are hidden
self.addAction(action)
self.restore_state()
def set_toc_visible(self, yes):
@ -338,9 +381,18 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
count += 1
def closeEvent(self, e):
if self.isFullScreen():
self.action_full_screen.trigger()
e.ignore()
return
self.save_state()
return MainWindow.closeEvent(self, e)
def toggle_toolbars(self):
for x in ('tool_bar', 'tool_bar2'):
x = getattr(self, x)
x.setVisible(not x.isVisible())
def save_state(self):
state = bytearray(self.saveState(self.STATE_VERSION))
vprefs['viewer_toolbar_state'] = state
@ -382,11 +434,6 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
self._lookup = None
self.dictionary_view.setHtml(html)
def set_max_width(self):
from calibre.gui2.viewer.documentview import config
c = config().parse()
self.frame.setMaximumWidth(c.max_view_width)
def get_remember_current_page_opt(self):
from calibre.gui2.viewer.documentview import config
c = config().parse()
@ -401,6 +448,58 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
else:
self.showFullScreen()
def showFullScreen(self):
self.view.document.page_position.save()
self.window_mode_changed = 'fullscreen'
self.tool_bar.setVisible(False)
self.tool_bar2.setVisible(False)
self._original_frame_margins = (
self.centralwidget.layout().contentsMargins(),
self.frame.layout().contentsMargins())
self.frame.layout().setContentsMargins(0, 0, 0, 0)
self.centralwidget.layout().setContentsMargins(0, 0, 0, 0)
super(EbookViewer, self).showFullScreen()
def show_full_screen_label(self):
f = self.full_screen_label
self.esc_full_screen_action.setEnabled(True)
f.setVisible(True)
height = 200
width = int(0.7*self.view.width())
f.resize(width, height)
f.move((self.view.width() - width)//2, (self.view.height()-height)//2)
a = self.full_screen_label_anim
a.setDuration(500)
a.setStartValue(QSize(width, 0))
a.setEndValue(QSize(width, height))
a.start()
QTimer.singleShot(2750, self.full_screen_label.hide)
self.view.document.switch_to_fullscreen_mode()
def showNormal(self):
self.view.document.page_position.save()
self.window_mode_changed = 'normal'
self.esc_full_screen_action.setEnabled(False)
self.tool_bar.setVisible(True)
self.tool_bar2.setVisible(True)
self.full_screen_label.setVisible(False)
if hasattr(self, '_original_frame_margins'):
om = self._original_frame_margins
self.centralwidget.layout().setContentsMargins(om[0])
self.frame.layout().setContentsMargins(om[1])
super(EbookViewer, self).showNormal()
def handle_window_mode_toggle(self):
if self.window_mode_changed:
fs = self.window_mode_changed == 'fullscreen'
self.window_mode_changed = None
if fs:
self.show_full_screen_label()
else:
self.view.document.switch_to_window_mode()
self.view.document.page_position.restore()
def goto(self, ref):
if ref:
tokens = ref.split('.')
@ -428,6 +527,10 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
def toc_clicked(self, index):
item = self.toc_model.itemFromIndex(index)
if item.abspath is not None:
if not os.path.exists(item.abspath):
return error_dialog(self, _('No such location'),
_('The location pointed to by this item'
' does not exist.'), show=True)
url = QUrl.fromLocalFile(item.abspath)
if item.fragment:
url.setFragment(item.fragment)
@ -595,16 +698,28 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
self.open_progress_indicator(_('Laying out %s')%self.current_title)
self.view.load_path(path, pos=pos)
def viewport_resized(self, frac):
new_page = self.pos.value()
if self.current_page is not None:
try:
frac = float(new_page-self.current_page.start_page)/(self.current_page.pages-1)
except ZeroDivisionError:
frac = 0
self.view.scroll_to(frac, notify=False)
def viewport_resize_started(self, event):
if not self.resize_in_progress:
# First resize, so save the current page position
self.resize_in_progress = True
if not self.window_mode_changed:
# The special handling for window mode changed will already
# have saved page position, so only save it if this is not a
# mode change
self.view.document.page_position.save()
if self.resize_in_progress:
self.view_resized_timer.start(75)
def viewport_resize_finished(self):
# There hasn't been a resize event for some time
# restore the current page position.
self.resize_in_progress = False
if self.window_mode_changed:
# This resize is part of a window mode change, special case it
self.handle_window_mode_toggle()
else:
self.set_page_number(frac)
self.view.document.page_position.restore()
def close_progress_indicator(self):
self.pi.stop()
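
The resize handling above debounces viewport resizes with a single-shot QTimer: every resize event restarts a 75 ms timer, and the saved page position is only restored once no further resize arrives before the timer fires (window-mode changes are special-cased through handle_window_mode_toggle). A minimal standalone sketch of the same pattern, using hypothetical class and slot names rather than calibre's:

from PyQt4.Qt import QApplication, QTimer, QWidget

class DebouncedView(QWidget):
    # Hypothetical example class, not part of calibre
    def __init__(self):
        QWidget.__init__(self)
        self.resize_timer = QTimer(self)
        self.resize_timer.setSingleShot(True)
        self.resize_timer.timeout.connect(self.resize_finished)
        self.resize_in_progress = False

    def resizeEvent(self, ev):
        if not self.resize_in_progress:
            # First resize of a burst: this is where state would be saved
            # (calibre saves the current page position at this point)
            self.resize_in_progress = True
        # Restart the timer on every event; it only fires once events stop
        self.resize_timer.start(75)
        return QWidget.resizeEvent(self, ev)

    def resize_finished(self):
        # No resize event for 75 ms: safe to restore the saved state
        # (calibre restores the saved page position here)
        self.resize_in_progress = False

if __name__ == '__main__':
    app = QApplication([])
    w = DebouncedView()
    w.show()
    app.exec_()

Restarting the timer on every event is what collapses a burst of resizes into a single restore.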

View File

@ -57,12 +57,20 @@ class PagePosition(object):
return ans
def __enter__(self):
self._cpos = self.current_pos
self.save()
def __exit__(self, *args):
self.restore()
def save(self):
self._cpos = self.current_pos
def restore(self):
if self._cpos is None: return
if isinstance(self._cpos, (int, float)):
self.document.scroll_fraction = self._cpos
else:
self.scroll_to_cfi(self._cpos)
self._cpos = None
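
With save() and restore() factored out of __enter__/__exit__, the same PagePosition object can be used either as a context manager or across separate event handlers (as the resize code above does). A self-contained sketch of the pattern, with hypothetical getter/setter callables standing in for the document's scroll position:

class SavedPosition(object):
    # Hypothetical standalone version of the save/restore pattern above
    def __init__(self, getter, setter):
        self.getter, self.setter, self._pos = getter, setter, None
    def save(self):
        self._pos = self.getter()
    def restore(self):
        if self._pos is not None:
            self.setter(self._pos)
            self._pos = None
    def __enter__(self):
        self.save()
    def __exit__(self, *args):
        self.restore()

state = {'pos': 0.42}
pp = SavedPosition(lambda: state['pos'], lambda v: state.__setitem__('pos', v))
with pp:                 # position saved on entry...
    state['pos'] = 0.0   # ...clobbered by some relayout...
print state['pos']       # ...and restored to 0.42 on exit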

View File

@ -172,11 +172,14 @@ def force_to_bool(val):
class CacheRow(list): # {{{
def __init__(self, db, composites, val):
def __init__(self, db, composites, val, series_col, series_sort_col):
self.db = db
self._composites = composites
list.__init__(self, val)
self._must_do = len(composites) > 0
self._series_col = series_col
self._series_sort_col = series_sort_col
self._series_sort = None
def __getitem__(self, col):
if self._must_do:
@ -191,12 +194,19 @@ class CacheRow(list): # {{{
elif col in self._composites:
is_comp = True
if is_comp:
id = list.__getitem__(self, 0)
id_ = list.__getitem__(self, 0)
self._must_do = False
mi = self.db.get_metadata(id, index_is_id=True,
mi = self.db.get_metadata(id_, index_is_id=True,
get_user_categories=False)
for c in self._composites:
self[c] = mi.get(self._composites[c])
if col == self._series_sort_col and self._series_sort is None:
if self[self._series_col]:
self._series_sort = title_sort(self[self._series_col])
self[self._series_sort_col] = self._series_sort
else:
self._series_sort = ''
self[self._series_sort_col] = ''
return list.__getitem__(self, col)
def __getslice__(self, i, j):
@ -226,6 +236,8 @@ class ResultCache(SearchQueryParser): # {{{
for key in field_metadata:
if field_metadata[key]['datatype'] == 'composite':
self.composites[field_metadata[key]['rec_index']] = key
self.series_col = field_metadata['series']['rec_index']
self.series_sort_col = field_metadata['series_sort']['rec_index']
self._data = []
self._map = self._map_filtered = []
self.first_sort = True
@ -918,9 +930,11 @@ class ResultCache(SearchQueryParser): # {{{
for id in ids:
try:
self._data[id] = CacheRow(db, self.composites,
db.conn.get('SELECT * from meta2 WHERE id=?', (id,))[0])
db.conn.get('SELECT * from meta2 WHERE id=?', (id,))[0],
self.series_col, self.series_sort_col)
self._data[id].append(db.book_on_device_string(id))
self._data[id].append(self.marked_ids_dict.get(id, None))
self._data[id].append(None)
except IndexError:
return None
try:
@ -935,9 +949,11 @@ class ResultCache(SearchQueryParser): # {{{
self._data.extend(repeat(None, max(ids)-len(self._data)+2))
for id in ids:
self._data[id] = CacheRow(db, self.composites,
db.conn.get('SELECT * from meta2 WHERE id=?', (id,))[0])
db.conn.get('SELECT * from meta2 WHERE id=?', (id,))[0],
self.series_col, self.series_sort_col)
self._data[id].append(db.book_on_device_string(id))
self._data[id].append(self.marked_ids_dict.get(id, None))
self._data[id].append(None) # Series sort column
self._map[0:0] = ids
self._map_filtered[0:0] = ids
@ -962,11 +978,13 @@ class ResultCache(SearchQueryParser): # {{{
temp = db.conn.get('SELECT * FROM meta2')
self._data = list(itertools.repeat(None, temp[-1][0]+2)) if temp else []
for r in temp:
self._data[r[0]] = CacheRow(db, self.composites, r)
self._data[r[0]] = CacheRow(db, self.composites, r,
self.series_col, self.series_sort_col)
for item in self._data:
if item is not None:
item.append(db.book_on_device_string(item[0]))
item.append(None)
# Temp mark and series_sort columns
item.extend((None, None))
marked_col = self.FIELD_MAP['marked']
for id_,val in self.marked_ids_dict.iteritems():
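
CacheRow above fills the new series_sort column lazily: the extra slot is appended as None and only computed from title_sort(series) the first time the column is read. A minimal standalone sketch of that lazy-column idea, with hypothetical names independent of calibre's cache:

class LazyRow(list):
    # Hypothetical example: column `lazy_col` is derived from `src_col`
    # the first time it is accessed.
    def __init__(self, values, src_col, lazy_col, compute):
        list.__init__(self, values)
        self.src_col, self.lazy_col, self.compute = src_col, lazy_col, compute

    def __getitem__(self, col):
        if col == self.lazy_col and list.__getitem__(self, col) is None:
            self[col] = self.compute(list.__getitem__(self, self.src_col))
        return list.__getitem__(self, col)

row = LazyRow(['A Series', None], 0, 1, lambda s: s.lower())
print row[1]  # 'a series', computed on first access

Deferring the computation means rows whose series sort value is never read never pay for it.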

View File

@ -47,8 +47,8 @@ class CheckLibrary(object):
self.is_case_sensitive = db.is_case_sensitive
self.all_authors = frozenset([x[1] for x in db.all_authors()])
self.all_ids = frozenset([id for id in db.all_ids()])
self.all_dbpaths = frozenset(self.dbpath(id) for id in self.all_ids)
self.all_ids = frozenset([id_ for id_ in db.all_ids()])
self.all_dbpaths = frozenset(self.dbpath(id_) for id_ in self.all_ids)
self.all_lc_dbpaths = frozenset([f.lower() for f in self.all_dbpaths])
self.db_id_regexp = re.compile(r'^.* \((\d+)\)$')
@ -73,8 +73,8 @@ class CheckLibrary(object):
self.failed_folders = []
def dbpath(self, id):
return self.db.path(id, index_is_id=True)
def dbpath(self, id_):
return self.db.path(id_, index_is_id=True)
@property
def errors_occurred(self):
@ -116,21 +116,21 @@ class CheckLibrary(object):
self.invalid_titles.append((auth_dir, db_path, 0))
continue
id = m.group(1)
# Third check: the id must be in the DB and the paths must match
id_ = m.group(1)
# Third check: the id_ must be in the DB and the paths must match
if self.is_case_sensitive:
if int(id) not in self.all_ids or \
if int(id_) not in self.all_ids or \
db_path not in self.all_dbpaths:
self.extra_titles.append((title_dir, db_path, 0))
continue
else:
if int(id) not in self.all_ids or \
if int(id_) not in self.all_ids or \
db_path.lower() not in self.all_lc_dbpaths:
self.extra_titles.append((title_dir, db_path, 0))
continue
# Record the book to check its formats
self.book_dirs.append((db_path, title_dir, id))
self.book_dirs.append((db_path, title_dir, id_))
found_titles = True
# Fourth check: author directories that contain no titles
@ -145,6 +145,21 @@ class CheckLibrary(object):
# Sort-of check: exception processing directory
self.failed_folders.append((title_path, traceback.format_exc(), []))
# Check for formats and covers in db for book dirs that are gone
for id_ in self.all_ids:
path = self.dbpath(id_)
if not os.path.exists(os.path.join(lib, path)):
title_dir = os.path.basename(path)
book_formats = frozenset([x for x in
self.db.format_files(id_, index_is_id=True)])
for fmt in book_formats:
self.missing_formats.append((title_dir,
os.path.join(path, fmt[0]+'.'+fmt[1].lower()), id_))
if self.db.has_cover(id_):
self.missing_covers.append((title_dir,
os.path.join(path, 'cover.jpg'), id_))
def is_ebook_file(self, filename):
ext = os.path.splitext(filename)[1]
if not ext:
@ -226,8 +241,8 @@ class CheckLibrary(object):
if self.db.has_cover(book_id):
if 'cover.jpg' not in filenames:
self.missing_covers.append((title_dir,
os.path.join(db_path, title_dir, 'cover.jpg'), book_id))
os.path.join(db_path, 'cover.jpg'), book_id))
else:
if 'cover.jpg' in filenames:
self.extra_covers.append((title_dir,
os.path.join(db_path, title_dir, 'cover.jpg'), book_id))
os.path.join(db_path, 'cover.jpg'), book_id))
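
The new loop above reports formats and covers that the database still records for book directories that no longer exist on disk. A minimal sketch of that check in isolation, assuming a db object exposing the same path(), format_files() and has_cover() calls used above:

import os

def find_missing_files(lib, db, all_ids):
    # Hypothetical helper mirroring the check above
    missing = []
    for id_ in all_ids:
        path = db.path(id_, index_is_id=True)
        if not os.path.exists(os.path.join(lib, path)):
            for name, ext in db.format_files(id_, index_is_id=True):
                missing.append(os.path.join(path, name + '.' + ext.lower()))
            if db.has_cover(id_):
                missing.append(os.path.join(path, 'cover.jpg'))
    return missing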

View File

@ -204,7 +204,8 @@ class DevNull(object):
pass
NULL = DevNull()
def do_add(db, paths, one_book_per_directory, recurse, add_duplicates):
def do_add(db, paths, one_book_per_directory, recurse, add_duplicates, otitle,
oauthors, oisbn, otags, oseries, oseries_index):
orig = sys.stdout
#sys.stdout = NULL
try:
@ -231,6 +232,11 @@ def do_add(db, paths, one_book_per_directory, recurse, add_duplicates):
mi.title = os.path.splitext(os.path.basename(book))[0]
if not mi.authors:
mi.authors = [_('Unknown')]
for x in ('otitle', 'oauthors', 'oisbn', 'otags', 'oseries'):
val = locals()[x]
if val: setattr(mi, x[1:], val)
if oseries:
mi.series_index = oseries_index
formats.append(format)
metadata.append(mi)
@ -302,39 +308,56 @@ the directory related options below.
parser.add_option('-e', '--empty', action='store_true', default=False,
help=_('Add an empty book (a book with no formats)'))
parser.add_option('-t', '--title', default=None,
help=_('Set the title of the added empty book'))
help=_('Set the title of the added book(s)'))
parser.add_option('-a', '--authors', default=None,
help=_('Set the authors of the added empty book'))
help=_('Set the authors of the added book(s)'))
parser.add_option('-i', '--isbn', default=None,
help=_('Set the ISBN of the added empty book'))
help=_('Set the ISBN of the added book(s)'))
parser.add_option('-T', '--tags', default=None,
help=_('Set the tags of the added book(s)'))
parser.add_option('-s', '--series', default=None,
help=_('Set the series of the added book(s)'))
parser.add_option('-S', '--series-index', default=1.0, type=float,
help=_('Set the series number of the added book(s)'))
return parser
def do_add_empty(db, title, authors, isbn):
from calibre.ebooks.metadata import MetaInformation, string_to_authors
def do_add_empty(db, title, authors, isbn, tags, series, series_index):
from calibre.ebooks.metadata import MetaInformation
mi = MetaInformation(None)
if title is not None:
mi.title = title
if authors:
mi.authors = string_to_authors(authors)
mi.authors = authors
if isbn:
mi.isbn = isbn
if tags:
mi.tags = tags
if series:
mi.series, mi.series_index = series, series_index
db.import_book(mi, [])
write_dirtied(db)
send_message()
def command_add(args, dbpath):
from calibre.ebooks.metadata import string_to_authors
parser = add_option_parser()
opts, args = parser.parse_args(sys.argv[:1] + args)
aut = string_to_authors(opts.authors) if opts.authors else []
tags = [x.strip() for x in opts.tags.split(',')] if opts.tags else []
if opts.empty:
do_add_empty(get_db(dbpath, opts), opts.title, opts.authors, opts.isbn)
do_add_empty(get_db(dbpath, opts), opts.title, aut, opts.isbn, tags,
opts.series, opts.series_index)
return 0
if len(args) < 2:
parser.print_help()
print
print >>sys.stderr, _('You must specify at least one file to add')
return 1
do_add(get_db(dbpath, opts), args[1:], opts.one_book_per_directory, opts.recurse, opts.duplicates)
do_add(get_db(dbpath, opts), args[1:], opts.one_book_per_directory,
opts.recurse, opts.duplicates, opts.title, aut, opts.isbn,
tags, opts.series, opts.series_index)
return 0
def do_remove(db, ids):
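
With these options, --title, --authors, --isbn, --tags, --series and --series-index now apply to every added book rather than only to empty ones. A short sketch of how such option values end up on a MetaInformation object, using only names that appear in the diff (the literal values are made up for illustration):

from calibre.ebooks.metadata import MetaInformation, string_to_authors

mi = MetaInformation(None)
mi.title = 'Example Title'                                  # --title
mi.authors = string_to_authors('A. One & B. Two')           # --authors
mi.isbn = '9780000000000'                                   # --isbn
mi.tags = [x.strip() for x in 'fiction, test'.split(',')]   # --tags
mi.series, mi.series_index = 'Example Series', 2.0          # --series, --series-index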

View File

@ -434,6 +434,8 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
self.field_metadata.set_field_record_index('ondevice', base, prefer_custom=False)
self.FIELD_MAP['marked'] = base = base+1
self.field_metadata.set_field_record_index('marked', base, prefer_custom=False)
self.FIELD_MAP['series_sort'] = base = base+1
self.field_metadata.set_field_record_index('series_sort', base, prefer_custom=False)
script = '''
DROP VIEW IF EXISTS meta2;

View File

@ -327,6 +327,16 @@ class FieldMetadata(dict):
'is_custom':False,
'is_category':False,
'is_csp': False}),
('series_sort', {'table':None,
'column':None,
'datatype':'text',
'is_multiple':{},
'kind':'field',
'name':_('Series Sort'),
'search_terms':['series_sort'],
'is_custom':False,
'is_category':False,
'is_csp': False}),
('sort', {'table':None,
'column':None,
'datatype':'text',

View File

@ -298,6 +298,7 @@ The following functions are available in addition to those described in single-f
* ``or(value, value, ...)`` -- returns the string "1" if any value is not empty, otherwise returns the empty string. This function works well with test or first_non_empty. You can have as many values as you want.
* ``print(a, b, ...)`` -- prints the arguments to standard output. Unless you start calibre from the command line (``calibre-debug -g``), the output will go to a black hole.
* ``raw_field(name)`` -- returns the metadata field named by name without applying any formatting.
* ``series_sort()`` -- returns the series sort value.
* ``strcat(a, b, ...)`` -- can take any number of arguments. Returns a string formed by concatenating all the arguments.
* ``strcat_max(max, string1, prefix2, string2, ...)`` -- Returns a string formed by concatenating the arguments. The returned value is initialized to string1. `Prefix, string` pairs are added to the end of the value as long as the resulting string length is less than `max`. String1 is returned even if string1 is longer than max. You can pass as many `prefix, string` pairs as you wish.
* ``strcmp(x, y, lt, eq, gt)`` -- does a case-insensitive comparison of x and y as strings. Returns ``lt`` if x < y. Returns ``eq`` if x == y. Otherwise returns ``gt``.
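
A minimal template sketch using the new series_sort() function together with functions from the list above, assuming calibre's general program mode syntax:

program: strcat(series_sort(), ': ', raw_field('title'))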

(18 additional file diffs were suppressed because they are too large; some changed files are not shown because too many files changed in this merge.)