Commit 78c50c3d4c by GRiker, 2013-04-12 09:08:45 -06:00
163 changed files with 34168 additions and 22698 deletions

View File

@ -1,4 +1,4 @@
# vim:fileencoding=utf-8:ts=2:sw=2:sta:et:sts=2:ai
# Each release can have new features and bug fixes. Each of which
# must have a title and can optionally have linked tickets and a description.
# In addition they can have a type field which defaults to minor, but should be major
@ -20,6 +20,66 @@
# new recipes:
#   - title:
- version: 0.9.27
date: 2013-04-12
new features:
- title: "Metadata download: Add two new sources for covers: Google Image Search and bigbooksearch.com."
description: "To enable them go to Preferences->Metadata download and enable the 'Google Image' and 'Big Book Search' sources. Google Images is useful for finding larger covers as well as alternate versions of the cover. Big Book Search searches for alternate covers from amazon.com. It can occasionally find nicer covers than the direct Amazon source. Note that both these sources download multiple covers for a single book. Some of these covers can be wrong (i.e. they may be of a different book or not covers at all, so you should inspect the results and manually pick the best match). When bulk downloading, these sources are only used if the other sources find no covers."
type: major
- title: "Content server: Allow specifying a restriction to use for the server when embedding it as a WSGI app."
tickets: [1167951]
- title: "Get Books: Add a plugin for the Koobe Polish book store"
- title: "calibredb add_format: Add an option to not replace existing formats. Also pep8 compliance."
- title: "Allow restoring of the ORIGINAL_XXX format by right-clicking it in the book details panel"
bug fixes:
- title: "AZW3 Input: Do not fail to identify JPEG images with 8BIM headers created with Adobe Photoshop."
tickets: [1167985]
- title: "Amazon metadata download: Ignore Spanish edition entries when searching for a book on amazon.com"
- title: "TXT Input: When converting a txt file with a Byte Order Mark, remove the Byte Order Mark before further processing as it can cause the first line of the text to be mis-interpreted."
- title: "Get Books: Fix searching for current book/title/author by right clicking the get books icon"
- title: "Get Books: Update nexto, gutenberg, and virtualo store plugins for website changes"
- title: "Amazon metadata download: When downloading from amazon.co.jp handle the 'Black curtain redirect' for adult titles."
tickets: [1165628]
- title: "When extracting zip files do not allow maliciously created zip files to overwrite other files on the system"
- title: "RTF Input: Handle RTF files with invalid border style specifications"
tickets: [1021270]
improved recipes:
- The Escapist
- San Francisco Chronicle
- The Onion
- Fronda
- Tom's Hardware
- New Yorker
- Financial Times UK
- Business Week Magazine
- Victoria Times
- tvxs
- The Independent
new recipes:
- title: Economia
author: Manish Bhattarai
- title: Universe Today
author: seird
- title: The Galaxy's Edge
author: Krittika Goyal
- version: 0.9.26
  date: 2013-04-05
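The zip-extraction fix in the bug list above is the standard defence against "zip slip" path traversal. A minimal sketch of the idea in Python (illustrative only; calibre's actual fix lives in its own zip-handling code and differs in detail):

    import os
    import zipfile

    def safe_extract(zf, dest):
        # Resolve the destination once; every member must stay inside it.
        dest = os.path.abspath(dest)
        for name in zf.namelist():
            target = os.path.abspath(os.path.join(dest, name))
            # Reject members like '../../etc/passwd' that escape the destination.
            if not (target == dest or target.startswith(dest + os.sep)):
                raise ValueError('Blocked path traversal in zip member: %r' % name)
        zf.extractall(dest)

    # Usage:
    # with zipfile.ZipFile('book.zip') as zf:
    #     safe_extract(zf, 'out')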

View File

@ -436,8 +436,8 @@ generate a Table of Contents in the converted ebook, based on the actual content
.. note:: Using these options can be a little challenging to get exactly right.
   If you prefer creating/editing the Table of Contents by hand, convert to
   the EPUB or AZW3 formats and select the checkbox at the bottom of the Table
   of Contents section of the conversion dialog that says
   :guilabel:`Manually fine-tune the Table of Contents after conversion`.
   This will launch the ToC Editor tool after the conversion. It allows you to
   create entries in the Table of Contents by simply clicking the place in the

View File

@ -647,12 +647,17 @@ computers. Run |app| on a single computer and access it via the Content Server
or a Remote Desktop solution.

If you must share the actual library, use a file syncing tool like
DropBox or rsync or Microsoft SkyDrive instead of a networked drive. If you are
using a file-syncing tool it is **essential** that you make sure that both
|app| and the file syncing tool do not try to access the |app| library at the
same time. In other words, **do not** run the file syncing tool and |app| at
the same time.

Even with these tools there is danger of data corruption/loss, so only do this
if you are willing to live with that risk. In particular, be aware that
**Google Drive** is incompatible with |app|; if you put your |app| library in
Google Drive, **you will suffer data loss**. See `this thread
<http://www.mobileread.com/forums/showthread.php?t=205581>`_ for details.

Content From The Web
---------------------
@ -797,6 +802,12 @@ Downloading from the Internet can sometimes result in a corrupted download. If t
* Try temporarily disabling your antivirus program (Microsoft Security Essentials, or Kaspersky or Norton or McAfee or whatever). This is most likely the culprit if the upgrade process is hanging in the middle.
* Try rebooting your computer and running a registry cleaner like `Wise registry cleaner <http://www.wisecleaner.com>`_.
* Try downloading the installer with an alternate browser. For example if you are using Internet Explorer, try using Firefox or Chrome instead.
* If you get an error about a missing DLL on Windows, then most likely the
  permissions on your temporary folder are incorrect. Go to the folder
  :file:`C:\\Users\\USERNAME\\AppData\\Local` in Windows Explorer, then
  right click on the :file:`Temp` folder, select :guilabel:`Properties` and go to
  the :guilabel:`Security` tab. Make sure that your user account has full control
  for this folder.
If you still cannot get the installer to work and you are on Windows, you can use the `calibre portable install <http://calibre-ebook.com/download_portable>`_, which does not need an installer (it is just a zip file).

View File

@ -91,7 +91,11 @@ First, we have to create a WSGI *adapter* for the calibre content server. Here i
    # Path to the calibre library to be served
    # The server process must have write permission for all files/dirs
    # in this directory or BAD things will happen
    path_to_library='/home/kovid/documents/demo library',

    # The virtual library (restriction) to be used when serving this
    # library.
    virtual_library=None
)

del create_wsgi_app
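Assembled, the adapter this documentation builds looks roughly like the sketch below. The create_wsgi_app import path is an assumption based on calibre's content-server docs of this era; the virtual_library keyword is the new 0.9.27 addition shown in this hunk.

    # wsgi_adapter.py -- minimal sketch of the calibre content server WSGI adapter.
    # Assumption: calibre 0.9.x exposes create_wsgi_app at this import path.
    from calibre.library.server.main import create_wsgi_app

    application = create_wsgi_app(
        # The server needs write access to everything under this directory.
        path_to_library='/home/kovid/documents/demo library',
        # Serve only this virtual library (restriction); None serves everything.
        virtual_library=None,
    )

    # Delete the factory so only `application` is exported to the WSGI
    # container (e.g. mod_wsgi or uWSGI).
    del create_wsgi_app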

View File

@ -1,3 +1,4 @@
import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from collections import OrderedDict
@ -39,7 +40,7 @@ class BusinessWeekMagazine(BasicNewsRecipe):
            title=self.tag_to_string(div.a).strip()
            url=div.a['href']
            soup0 = self.index_to_soup(url)
            urlprint=soup0.find('a', attrs={'href':re.compile('.*printer.*')})['href']
            articles.append({'title':title, 'url':urlprint, 'description':'', 'date':''})
@ -56,7 +57,7 @@ class BusinessWeekMagazine(BasicNewsRecipe):
            title=self.tag_to_string(div.a).strip()
            url=div.a['href']
            soup0 = self.index_to_soup(url)
            urlprint=soup0.find('a', attrs={'href':re.compile('.*printer.*')})['href']
            articles.append({'title':title, 'url':urlprint, 'description':desc, 'date':''})

        if articles:
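Both hunks swap a brittle class-name lookup for a regex match on the link target itself, which survives site redesigns better. A standalone illustration of the pattern (BeautifulSoup 3 semantics, as bundled with calibre at the time; the HTML is invented for the example):

    import re
    from calibre.ebooks.BeautifulSoup import BeautifulSoup  # BS3, bundled with calibre

    html = '<li class="share"><a href="/articles/12345/printer-friendly">Print</a></li>'
    soup = BeautifulSoup(html)
    # find() accepts a compiled regex as an attribute value, so any anchor whose
    # href mentions 'printer' matches, regardless of surrounding class names.
    link = soup.find('a', attrs={'href': re.compile('.*printer.*')})
    print(link['href'])  # -> /articles/12345/printer-friendly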

recipes/economia.recipe (new file, +17)
View File

@ -0,0 +1,17 @@
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1314326622(BasicNewsRecipe):
title = u'Economia'
__author__ = 'Manish Bhattarai'
description = 'Economia - Intelligence & Insight for ICAEW Members'
language = 'en_GB'
oldest_article = 7
max_articles_per_feed = 25
masthead_url = 'http://economia.icaew.com/~/media/Images/Design%20Images/Economia_Red_website.ashx'
cover_url = 'http://economia.icaew.com/~/media/Images/Design%20Images/Economia_Red_website.ashx'
no_stylesheets = True
remove_empty_feeds = True
remove_tags_before = dict(id='content')
remove_tags_after = dict(id='stars-wrapper')
remove_tags = [dict(attrs={'class':['floatR', 'sharethis', 'rating clearfix']})]
    feeds = [
        (u'News', u'http://feedity.com/icaew-com/VlNTVFRa.rss'),
        (u'Business', u'http://feedity.com/icaew-com/VlNTVFtS.rss'),
        (u'People', u'http://feedity.com/icaew-com/VlNTVFtX.rss'),
        (u'Opinion', u'http://feedity.com/icaew-com/VlNTVFtW.rss'),
        (u'Finance', u'http://feedity.com/icaew-com/VlNTVFtV.rss')
    ]

View File

@ -110,10 +110,12 @@ class FinancialTimes(BasicNewsRecipe):
        soup = self.index_to_soup(self.INDEX)
        #dates= self.tag_to_string(soup.find('div', attrs={'class':'btm-links'}).find('div'))
        #self.timefmt = ' [%s]'%dates
        section_title = 'Untitled'
        for column in soup.findAll('div', attrs = {'class':'feedBoxes clearfix'}):
            for section in column.findAll('div', attrs = {'class':'feedBox'}):
                sectiontitle=self.tag_to_string(section.find('h4'))
                if '...' not in sectiontitle: section_title=sectiontitle
                for article in section.ul.findAll('li'):
                    articles = []
                    title=self.tag_to_string(article.a)

View File

@ -6,6 +6,7 @@ __copyright__ = u'2010-2013, Tomasz Dlugosz <tomek3d@gmail.com>'
fronda.pl
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
from datetime import timedelta, date
@ -23,6 +24,7 @@ class Fronda(BasicNewsRecipe):
    extra_css = '''
        h1 {font-size:150%}
        .body {text-align:left;}
        div#featured-image {font-style:italic; font-size:70%}
    '''
    earliest_date = date.today() - timedelta(days=oldest_article)
@ -55,7 +57,10 @@ class Fronda(BasicNewsRecipe):
        articles = {}
        for url, genName in genres:
            try:
                soup = self.index_to_soup('http://www.fronda.pl/c/'+ url)
            except:
                continue
            articles[genName] = []
            for item in soup.findAll('li'):
                article_h = item.find('h2')
@ -77,16 +82,15 @@ class Fronda(BasicNewsRecipe):
    ]

    remove_tags = [
        dict(name='div', attrs={'class':['related-articles','button right','pagination','related-articles content']}),
        dict(name='h3', attrs={'class':'block-header article comments'}),
        dict(name='ul', attrs={'class':['comment-list','category','tag-list']}),
        dict(name='p', attrs={'id':'comments-disclaimer'}),
        dict(name='div', attrs={'style':'text-align: left; margin-bottom: 15px;'}),
        dict(name='div', attrs={'style':'text-align: left; margin-top: 15px; margin-bottom: 30px;'}),
        dict(name='div', attrs={'id':'comment-form'}),
        dict(name='span', attrs={'class':'separator'})
    ]

    preprocess_regexps = [
        (re.compile(r'komentarzy: .*?</h6>', re.IGNORECASE | re.DOTALL | re.M ), lambda match: '</h6>')]

recipes/galaxys_edge.recipe (new file, +108)
View File

@ -0,0 +1,108 @@
from __future__ import with_statement
__license__ = 'GPL 3'
__copyright__ = '2009, Kovid Goyal <kovid@kovidgoyal.net>'
from calibre.web.feeds.news import BasicNewsRecipe
class GalaxyEdge(BasicNewsRecipe):
title = u'The Galaxy\'s Edge'
language = 'en'
oldest_article = 7
__author__ = 'Krittika Goyal'
no_stylesheets = True
auto_cleanup = True
#keep_only_tags = [dict(id='content')]
#remove_tags = [dict(attrs={'class':['article-links', 'breadcr']}),
#dict(id=['email-section', 'right-column', 'printfooter', 'topover',
#'slidebox', 'th_footer'])]
extra_css = '.photo-caption { font-size: smaller }'
def parse_index(self):
soup = self.index_to_soup('http://www.galaxysedge.com/')
main = soup.find('table', attrs={'width':'911'})
toc = main.find('td', attrs={'width':'225'})
current_section = None
current_articles = []
feeds = []
c = 0
for x in toc.findAll(['p']):
c = c+1
if c == 5:
if current_articles and current_section:
feeds.append((current_section, current_articles))
edwo = x.find('a')
current_section = self.tag_to_string(edwo)
current_articles = []
self.log('\tFound section:', current_section)
title = self.tag_to_string(edwo)
url = edwo.get('href', True)
url = 'http://www.galaxysedge.com/'+url
print(title)
print(c)
if not url or not title:
continue
self.log('\t\tFound article:', title)
self.log('\t\t\t', url)
current_articles.append({'title': title, 'url':url,
'description':'', 'date':''})
elif c>5:
current_section = self.tag_to_string(x.find('b'))
current_articles = []
self.log('\tFound section:', current_section)
for y in x.findAll('a'):
title = self.tag_to_string(y)
url = y.get('href', True)
url = 'http://www.galaxysedge.com/'+url
print(title)
if not url or not title:
continue
self.log('\t\tFound article:', title)
self.log('\t\t\t', url)
current_articles.append({'title': title, 'url':url,
'description':'', 'date':''})
if current_articles and current_section:
feeds.append((current_section, current_articles))
return feeds
#def preprocess_raw_html(self, raw, url):
#return raw.replace('<body><p>', '<p>').replace('</p></body>', '</p>')
#def postprocess_html(self, soup, first_fetch):
#for t in soup.findAll(['table', 'tr', 'td','center']):
#t.name = 'div'
#return soup
#def parse_index(self):
#today = time.strftime('%Y-%m-%d')
#soup = self.index_to_soup(
#'http://www.thehindu.com/todays-paper/tp-index/?date=' + today)
#div = soup.find(id='left-column')
#feeds = []
#current_section = None
#current_articles = []
#for x in div.findAll(['h3', 'div']):
#if current_section and x.get('class', '') == 'tpaper':
#a = x.find('a', href=True)
#if a is not None:
#current_articles.append({'url':a['href']+'?css=print',
#'title':self.tag_to_string(a), 'date': '',
#'description':''})
#if x.name == 'h3':
#if current_section and current_articles:
#feeds.append((current_section, current_articles))
#current_section = self.tag_to_string(x)
#current_articles = []
#return feeds

Binary file not shown (new image, 905 B).

View File

@ -41,6 +41,7 @@ class TheIndependentNew(BasicNewsRecipe):
    publication_type = 'newspaper'
    masthead_url = 'http://www.independent.co.uk/independent.co.uk/editorial/logo/independent_Masthead.png'
    encoding = 'utf-8'
    compress_news_images = True

    remove_tags = [
        dict(attrs={'id' : ['RelatedArtTag','renderBiography']}),
        dict(attrs={'class' : ['autoplay','openBiogPopup']}),
@ -343,7 +344,7 @@ class TheIndependentNew(BasicNewsRecipe):
            if 'class' in subtitle_div:
                clazz = subtitle_div['class'] + ' '
            clazz = clazz + 'subtitle'
            subtitle_div['class'] = clazz

        #find broken images and remove captions
        items_to_extract = []

View File

@ -1,64 +1,44 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
__copyright__ = '2008-2013, Darko Miletic <darko.miletic at gmail.com>'
'''
newyorker.com
'''
'''
www.canada.com
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup

class NewYorker(BasicNewsRecipe):
title = 'The New Yorker'
__author__ = 'Darko Miletic'
description = 'The best of US journalism'
oldest_article = 15
language = 'en'
max_articles_per_feed = 100
no_stylesheets = True
use_embedded_content = False
publisher = 'Conde Nast Publications'
category = 'news, politics, USA'
encoding = 'cp1252'
publication_type = 'magazine'
masthead_url = 'http://www.newyorker.com/css/i/hed/logo.gif'
extra_css = """
body {font-family: "Times New Roman",Times,serif}
.articleauthor{color: #9F9F9F;
font-family: Arial, sans-serif;
font-size: small;
text-transform: uppercase}
.rubric,.dd,h6#credit{color: #CD0021;
font-family: Arial, sans-serif;
font-size: small;
text-transform: uppercase}
.descender:first-letter{display: inline; font-size: xx-large; font-weight: bold}
.dd,h6#credit{color: gray}
.c{display: block}
.caption,h2#articleintro{font-style: italic}
.caption{font-size: small}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
    title = u'New Yorker Magazine'
    newyorker_prefix = 'http://m.newyorker.com'
    description = u'Content from the New Yorker website'
    fp_tag = 'CAN_TC'

    masthead_url = 'http://www.newyorker.com/images/elements/print/newyorker_printlogo.gif'

    compress_news_images = True
    compress_news_images_auto_size = 8
scale_news_images_to_device = False
scale_news_images = (768, 1024)
url_list = []
language = 'en'
__author__ = 'Nick Redding'
no_stylesheets = True
timefmt = ' [%b %d]'
encoding = 'utf-8'
extra_css = '''
.byline { font-size:xx-small; font-weight: bold;}
h3 { margin-bottom: 6px; }
.caption { font-size: xx-small; font-style: italic; font-weight: normal; }
'''
keep_only_tags = [dict(name='div', attrs={'id':re.compile('pagebody')})]
remove_tags = [{'class':'socialUtils'},{'class':'entry-keywords'}]
    def get_cover_url(self):
        cover_url = "http://www.newyorker.com/images/covers/1925/1925_02_21_p233.jpg"
@ -68,13 +48,233 @@ class NewYorker(BasicNewsRecipe):
            cover_url = 'http://www.newyorker.com' + cover_item.div.img['src'].strip()
        return cover_url
    def fixChars(self,string):
        # Replace lsquo (\x91)
        fixed = re.sub("\x91","‘",string)
        # Replace rsquo (\x92)
        fixed = re.sub("\x92","’",fixed)
        # Replace ldquo (\x93)
        fixed = re.sub("\x93","“",fixed)
        # Replace rdquo (\x94)
        fixed = re.sub("\x94","”",fixed)
        # Replace ndash (\x96)
        fixed = re.sub("\x96","–",fixed)
        # Replace mdash (\x97)
        fixed = re.sub("\x97","—",fixed)
        fixed = re.sub("&#x2019;","’",fixed)
        return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
            # Replace '&amp;' with '&'
            massaged = re.sub("&amp;","&", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
shortparagraph = ""
## try:
if len(article.text_summary.strip()) == 0:
articlebodies = soup.findAll('div',attrs={'class':'entry-content'})
if articlebodies:
for articlebody in articlebodies:
if articlebody:
paras = articlebody.findAll('p')
for p in paras:
refparagraph = self.massageNCXText(self.tag_to_string(p,use_alt=False)).strip()
#account for blank paragraphs and short paragraphs by appending them to longer ones
if len(refparagraph) > 0:
if len(refparagraph) > 70: #approximately one line of text
newpara = shortparagraph + refparagraph
article.summary = article.text_summary = newpara.strip()
return
else:
shortparagraph = refparagraph + " "
if shortparagraph.strip().find(" ") == -1 and not shortparagraph.strip().endswith(":"):
shortparagraph = shortparagraph + "- "
else:
article.summary = article.text_summary = self.massageNCXText(article.text_summary)
## except:
## self.log("Error creating article descriptions")
## return
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
        return soup
def preprocess_html(self,soup):
dateline = soup.find('div','published')
byline = soup.find('div','byline')
title = soup.find('h1','entry-title')
if title is None:
return self.strip_anchors(soup)
if byline is None:
title.append(dateline)
return self.strip_anchors(soup)
byline.append(dateline)
return self.strip_anchors(soup)
def load_global_nav(self,soup):
seclist = []
ul = soup.find('ul',attrs={'id':re.compile('global-nav-menu')})
if ul is not None:
for li in ul.findAll('li'):
if li.a is not None:
securl = li.a['href']
if securl != '/' and securl != '/magazine' and securl.startswith('/'):
seclist.append((self.tag_to_string(li.a),self.newyorker_prefix+securl))
return seclist
def exclude_url(self,url):
if url in self.url_list:
return True
if not url.endswith('html'):
return True
if 'goings-on-about-town-app' in url:
return True
if 'something-to-be-thankful-for' in url:
return True
if '/shouts/' in url:
return True
if 'out-loud' in url:
return True
if '/rss/' in url:
return True
if '/video-' in url:
return True
self.url_list.append(url)
return False
def load_index_page(self,soup):
article_list = []
for div in soup.findAll('div',attrs={'class':re.compile('^rotator')}):
h2 = div.h2
if h2 is not None:
a = h2.a
if a is not None:
url = a['href']
if not self.exclude_url(url):
if url.startswith('/'):
url = self.newyorker_prefix+url
byline = h2.span
if byline is not None:
author = self.tag_to_string(byline)
if author.startswith('by '):
                            author = author.replace('by ','')
byline.extract()
else:
author = ''
if h2.br is not None:
h2.br.replaceWith(' ')
title = self.tag_to_string(h2)
desc = div.find(attrs={'class':['rotator-ad-body','feature-blurb-text']})
if desc is not None:
description = self.tag_to_string(desc)
else:
description = ''
article_list.append(dict(title=title,url=url,date='',description=description,author=author,content=''))
ul = div.find('ul','feature-blurb-links')
if ul is not None:
for li in ul.findAll('li'):
a = li.a
if a is not None:
url = a['href']
if not self.exclude_url(url):
if url.startswith('/'):
url = self.newyorker_prefix+url
if a.br is not None:
a.br.replaceWith(' ')
title = '>>'+self.tag_to_string(a)
article_list.append(dict(title=title,url=url,date='',description='',author='',content=''))
for h3 in soup.findAll('h3','header'):
a = h3.a
if a is not None:
url = a['href']
if not self.exclude_url(url):
if url.startswith('/'):
url = self.newyorker_prefix+url
byline = h3.span
if byline is not None:
author = self.tag_to_string(byline)
if author.startswith('by '):
author = author.replace('by ','')
byline.extract()
else:
author = ''
if h3.br is not None:
h3.br.replaceWith(' ')
title = self.tag_to_string(h3).strip()
article_list.append(dict(title=title,url=url,date='',description='',author=author,content=''))
return article_list
def load_global_section(self,securl):
article_list = []
try:
soup = self.index_to_soup(securl)
except:
return article_list
if '/blogs/' not in securl:
return self.load_index_page(soup)
for div in soup.findAll('div',attrs={'id':re.compile('^entry')}):
h3 = div.h3
if h3 is not None:
a = h3.a
if a is not None:
url = a['href']
if not self.exclude_url(url):
if url.startswith('/'):
url = self.newyorker_prefix+url
if h3.br is not None:
h3.br.replaceWith(' ')
title = self.tag_to_string(h3)
article_list.append(dict(title=title,url=url,date='',description='',author='',content=''))
return article_list
def filter_ans(self, ans) :
total_article_count = 0
idx = 0
idx_max = len(ans)-1
while idx <= idx_max:
if True: #self.verbose
self.log("Section %s: %d articles" % (ans[idx][0], len(ans[idx][1])) )
for article in ans[idx][1]:
total_article_count += 1
if True: #self.verbose
self.log("\t%-40.40s... \t%-60.60s..." % (article['title'].encode('cp1252','replace'),
article['url'].replace('http://m.newyorker.com','').encode('cp1252','replace')))
idx = idx+1
self.log( "Queued %d articles" % total_article_count )
return ans
def parse_index(self):
ans = []
try:
soup = self.index_to_soup(self.newyorker_prefix)
except:
return ans
seclist = self.load_global_nav(soup)
ans.append(('Front Page',self.load_index_page(soup)))
for (sectitle,securl) in seclist:
ans.append((sectitle,self.load_global_section(securl)))
return self.filter_ans(ans)

View File

@ -7,7 +7,6 @@ sfgate.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
import re

class SanFranciscoChronicle(BasicNewsRecipe):
    title = u'San Francisco Chronicle'
@ -19,16 +18,7 @@ class SanFranciscoChronicle(BasicNewsRecipe):
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = False
auto_cleanup = True
remove_tags_before = {'id':'printheader'}
remove_tags = [
dict(name='div',attrs={'id':'printheader'})
,dict(name='a', attrs={'href':re.compile('http://ads\.pheedo\.com.*')})
,dict(name='div',attrs={'id':'footer'})
]
    extra_css = '''
    h1{font-family :Arial,Helvetica,sans-serif; font-size:large;}
@ -43,33 +33,13 @@ class SanFranciscoChronicle(BasicNewsRecipe):
    '''

    feeds = [
        (u'Bay Area News', u'http://www.sfgate.com/bayarea/feed/Bay-Area-News-429.php'),
(u'City Insider', u'http://www.sfgate.com/default/feed/City-Insider-Blog-573.php'),
(u'Crime Scene', u'http://www.sfgate.com/rss/feed/Crime-Scene-Blog-599.php'),
(u'Education News', u'http://www.sfgate.com/education/feed/Education-News-from-SFGate-430.php'),
(u'National News', u'http://www.sfgate.com/rss/feed/National-News-RSS-Feed-435.php'),
(u'Weird News', u'http://www.sfgate.com/weird/feed/Weird-News-RSS-Feed-433.php'),
(u'World News', u'http://www.sfgate.com/rss/feed/World-News-From-SFGate-432.php'),
    ]
def print_version(self,url):
url= url +"&type=printable"
return url
def get_article_url(self, article):
print str(article['title_detail']['value'])
url = article.get('guid',None)
url = "http://www.sfgate.com/cgi-bin/article.cgi?f="+url
if "Presented By:" in str(article['title_detail']['value']):
url = ''
return url

View File

@ -1,8 +1,11 @@
#!/usr/bin/env python

__license__ = 'GPL v3'
__author__ = 'Lorenzo Vigentini and Tom Surace'
__copyright__ = '2009, Lorenzo Vigentini <l.vigentini at gmail.com>, 2013 Tom Surace <tekhedd@byteheaven.net>'
description = 'The Escapist Magazine - v1.3 (April 2013)'
#
# Based on 'the Escapist Magazine - v1.02 (09, January 2010)'
'''
http://www.escapistmagazine.com/
@ -11,12 +14,11 @@ http://www.escapistmagazine.com/
from calibre.web.feeds.news import BasicNewsRecipe

class al(BasicNewsRecipe):
    author = 'Lorenzo Vigentini and Tom Surace'
    description = 'The Escapist Magazine'
    cover_url = 'http://cdn.themis-media.com/themes/escapistmagazine/default/images/logo.png'
    title = u'The Escapist Magazine'
    publisher = 'Themis Media'
    category = 'Video games news, lifestyle, gaming culture'
    language = 'en'
@ -36,18 +38,19 @@ class al(BasicNewsRecipe):
    ]

    def print_version(self,url):
        # Expect article url in the format:
        # http://www.escapistmagazine.com/news/view/123198-article-name?utm_source=rss&utm_medium=rss&utm_campaign=news
        #
        baseURL='http://www.escapistmagazine.com'
        segments = url.split('/')
        subPath= '/'+ segments[3] + '/'

        # The article number is the "number" that starts the name
        articleNumber = segments[len(segments)-1];  # the "article name"
        articleNumber = articleNumber.split('-')[0];  # keep part before hyphen

        fullUrl = baseURL + subPath + 'print/' + articleNumber
        return fullUrl

    keep_only_tags = [
        dict(name='div', attrs={'id':'article'})

View File

@ -1,5 +1,5 @@
__license__ = 'GPL v3'
__copyright__ = '2009-2013, Darko Miletic <darko.miletic at gmail.com>'

'''
theonion.com
@ -10,7 +10,7 @@ from calibre.web.feeds.news import BasicNewsRecipe
class TheOnion(BasicNewsRecipe):
    title = 'The Onion'
    __author__ = 'Darko Miletic'
    description = "The Onion, America's Finest News Source, is an award-winning publication covering world, national, and * local issues. It is updated daily online and distributed weekly in select American cities."
    oldest_article = 2
    max_articles_per_feed = 100
    publisher = 'Onion, Inc.'
@ -20,7 +20,8 @@ class TheOnion(BasicNewsRecipe):
    use_embedded_content = False
    encoding = 'utf-8'
    publication_type = 'newsportal'
    needs_subscription = 'optional'
    masthead_url = 'http://www.theonion.com/static/onion/img/logo_1x.png'
    extra_css = """
        body{font-family: Helvetica,Arial,sans-serif}
        .section_title{color: gray; text-transform: uppercase}
@ -37,18 +38,12 @@ class TheOnion(BasicNewsRecipe):
        , 'language' : language
    }

    keep_only_tags = [dict(attrs={'class':'full-article'})]
    remove_attributes = ['lang','rel']
    remove_tags = [
        dict(name=['object','link','iframe','base','meta'])
        ,dict(attrs={'class':lambda x: x and 'share-tools' in x.split()})
    ]

    feeds = [
@ -56,6 +51,17 @@ class TheOnion(BasicNewsRecipe):
        ,(u'Sports' , u'http://feeds.theonion.com/theonion/sports' )
    ]
def get_browser(self):
br = BasicNewsRecipe.get_browser(self)
br.open('http://www.theonion.com/')
if self.username is not None and self.password is not None:
br.open('https://ui.ppjol.com/login/onion/u/j_spring_security_check')
br.select_form(name='f')
br['j_username'] = self.username
br['j_password'] = self.password
br.submit()
return br
    def get_article_url(self, article):
        artl = BasicNewsRecipe.get_article_url(self, article)
        if artl.startswith('http://www.theonion.com/audio/'):
@ -79,4 +85,8 @@ class TheOnion(BasicNewsRecipe):
            else:
                str = self.tag_to_string(item)
                item.replaceWith(str)
for item in soup.findAll('img'):
if item.has_key('data-src'):
item['src'] = item['data-src']
        return soup

View File

@ -1,7 +1,5 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008-2013, Darko Miletic <darko.miletic at gmail.com>'
'''
tomshardware.com/us
'''
@ -16,22 +14,20 @@ class Tomshardware(BasicNewsRecipe):
publisher = "Tom's Hardware" publisher = "Tom's Hardware"
category = 'news, IT, hardware, USA' category = 'news, IT, hardware, USA'
no_stylesheets = True no_stylesheets = True
needs_subscription = True needs_subscription = 'optional'
language = 'en' language = 'en'
INDEX = 'http://www.tomshardware.com' INDEX = 'http://www.tomshardware.com'
LOGIN = INDEX + '/membres/' LOGIN = INDEX + '/membres/'
remove_javascript = True remove_javascript = True
use_embedded_content= False use_embedded_content= False
html2lrf_options = [ conversion_options = {
'--comment', description 'comment' : description
, '--category', category , 'tags' : category
, '--publisher', publisher , 'publisher' : publisher
] , 'language' : language
}
html2epub_options = 'publisher="' + publisher + '"\ncomments="' + description + '"\ntags="' + category + '"'
    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        br.open(self.INDEX+'/us/')
@ -50,8 +46,8 @@ class Tomshardware(BasicNewsRecipe):
    ]

    feeds = [
        (u'Reviews', u'http://www.tomshardware.com/feeds/rss2/tom-s-hardware-us,18-2.xml')
        ,(u'News' , u'http://www.tomshardware.com/feeds/rss2/tom-s-hardware-us,18-1.xml')
    ]

    def print_version(self, url):
View File

@ -1,5 +1,6 @@
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
import re
from calibre.web.feeds.recipes import BasicNewsRecipe

class TVXS(BasicNewsRecipe):
@ -8,19 +9,30 @@ class TVXS(BasicNewsRecipe):
    description = 'News from Greece'
    max_articles_per_feed = 100
    oldest_article = 3
    simultaneous_downloads = 1
    publisher = 'TVXS'
    category = 'news, sport, greece'
    language = 'el'
    encoding = None
    use_embedded_content = False
    remove_empty_feeds = True
    conversion_options = {'smarten_punctuation': True}
    no_stylesheets = True
    publication_type = 'newspaper'
    remove_tags_before = dict(name='h1',attrs={'class':'print-title'})
    remove_tags_after = dict(name='div',attrs={'class':'field field-type-relevant-content field-field-relevant-articles'})
    remove_tags = [dict(name='div',attrs={'class':'field field-type-relevant-content field-field-relevant-articles'}),
                   dict(name='div',attrs={'class':'field field-type-filefield field-field-image-gallery'}),
                   dict(name='div',attrs={'class':'filefield-file'})]
    remove_attributes = ['border', 'cellspacing', 'align', 'cellpadding', 'colspan', 'valign', 'vspace', 'hspace', 'alt', 'width', 'height']
extra_css = 'body { font-family: verdana, helvetica, sans-serif; } \
table { width: 100%; } \
td img { display: block; margin: 5px auto; } \
ul { padding-top: 10px; } \
ol { padding-top: 10px; } \
li { padding-top: 5px; padding-bottom: 5px; } \
h1 { text-align: center; font-size: 125%; font-weight: bold; } \
h2, h3, h4, h5, h6 { text-align: center; font-size: 100%; font-weight: bold; }'
preprocess_regexps = [(re.compile(r'<br[ ]*/>', re.IGNORECASE), lambda m: ''), (re.compile(r'<br[ ]*clear.*/>', re.IGNORECASE), lambda m: '')]
    feeds = [(u'Ελλάδα', 'http://tvxs.gr/feeds/2/feed.xml'),
             (u'Κόσμος', 'http://tvxs.gr/feeds/5/feed.xml'),
@ -35,17 +47,10 @@ class TVXS(BasicNewsRecipe):
             (u'Ιστορία', 'http://tvxs.gr/feeds/1573/feed.xml'),
             (u'Χιούμορ', 'http://tvxs.gr/feeds/692/feed.xml')]

    def print_version(self, url):
        br = self.get_browser()
        response = br.open(url)
        data = response.read()

        pos_1 = data.find('<a href="/print/')
        if pos_1 == -1:
@ -57,5 +62,5 @@ class TVXS(BasicNewsRecipe):
        pos_1 += len('<a href="')
        new_url = data[pos_1:pos_2]

        print_url = "http://tvxs.gr" + new_url
        return print_url

View File

@ -0,0 +1,17 @@
from calibre.web.feeds.news import BasicNewsRecipe
class UniverseToday(BasicNewsRecipe):
title = u'Universe Today'
language = 'en'
description = u'Space and astronomy news.'
__author__ = 'seird'
publisher = u'universetoday.com'
category = 'science, astronomy, news, rss'
oldest_article = 7
max_articles_per_feed = 40
auto_cleanup = True
no_stylesheets = True
use_embedded_content = False
remove_empty_feeds = True
feeds = [(u'Universe Today', u'http://feeds.feedburner.com/universetoday/pYdq')]

View File

@ -6,17 +6,62 @@ __license__ = 'GPL v3'
www.canada.com
'''

import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag, BeautifulStoneSoup

class TimesColonist(BasicNewsRecipe):
# Customization -- remove sections you don't want.
# If your e-reader is an e-ink Kindle and your output profile is
# set properly this recipe will not include images because the
# resulting file is too large. If you have one of these and want
# images you can set kindle_omit_images = False
# and remove sections (typically the e-ink Kindles will
# work with about a dozen of these, but your mileage may vary).
kindle_omit_images = True
section_list = [
('','Web Front Page'),
('news/','News Headlines'),
('news/b-c/','BC News'),
('news/national/','National News'),
('news/world/','World News'),
('opinion/','Opinion'),
('opinion/letters/','Letters'),
('business/','Business'),
('business/money/','Money'),
('business/technology/','Technology'),
('business/working/','Working'),
('sports/','Sports'),
('sports/hockey/','Hockey'),
('sports/football/','Football'),
('sports/basketball/','Basketball'),
('sports/golf/','Golf'),
('entertainment/','entertainment'),
('entertainment/go/','Go!'),
('entertainment/music/','Music'),
('entertainment/books/','Books'),
('entertainment/Movies/','Movies'),
('entertainment/television/','Television'),
('life/','Life'),
('life/health/','Health'),
('life/travel/','Travel'),
('life/driving/','Driving'),
('life/homes/','Homes'),
('life/food-drink/','Food & Drink')
]
    title = u'Victoria Times Colonist'
    url_prefix = 'http://www.timescolonist.com'
    description = u'News from Victoria, BC'
    fp_tag = 'CAN_TC'
    masthead_url = 'http://www.timescolonist.com/gmg/img/global/logoTimesColonist.png'

    url_list = []
    language = 'en_CA'
    __author__ = 'Nick Redding'
@ -29,15 +74,21 @@ class TimesColonist(BasicNewsRecipe):
    .caption { font-size: xx-small; font-style: italic; font-weight: normal; }
    '''
    keep_only_tags = [dict(name='div', attrs={'class':re.compile('main.content')})]
remove_tags = [{'class':'comments'},
{'id':'photocredit'},
dict(name='div', attrs={'class':re.compile('top.controls')}),
dict(name='div', attrs={'class':re.compile('social')}),
dict(name='div', attrs={'class':re.compile('tools')}),
dict(name='div', attrs={'class':re.compile('bottom.tools')}),
dict(name='div', attrs={'class':re.compile('window')}),
dict(name='div', attrs={'class':re.compile('related.news.element')})]
def __init__(self, options, log, progress_reporter):
self.remove_tags = [{'class':'comments'},
{'id':'photocredit'},
dict(name='div', attrs={'class':re.compile('top.controls')}),
dict(name='div', attrs={'class':re.compile('^comments')}),
dict(name='div', attrs={'class':re.compile('social')}),
dict(name='div', attrs={'class':re.compile('tools')}),
dict(name='div', attrs={'class':re.compile('bottom.tools')}),
dict(name='div', attrs={'class':re.compile('window')}),
dict(name='div', attrs={'class':re.compile('related.news.element')})]
print("PROFILE NAME = "+options.output_profile.short_name)
if self.kindle_omit_images and options.output_profile.short_name in ['kindle', 'kindle_dx', 'kindle_pw']:
self.remove_tags.append(dict(name='div', attrs={'class':re.compile('image-container')}))
BasicNewsRecipe.__init__(self, options, log, progress_reporter)
def get_cover_url(self): def get_cover_url(self):
from datetime import timedelta, date from datetime import timedelta, date
@ -122,7 +173,6 @@ class TimesColonist(BasicNewsRecipe):
    def preprocess_html(self,soup):
        byline = soup.find('p',attrs={'class':re.compile('ancillary')})
        if byline is not None:
            authstr = self.tag_to_string(byline,False)
            authstr = re.sub('/ *Times Colonist','/',authstr, flags=re.IGNORECASE)
            authstr = re.sub('BY */','',authstr, flags=re.IGNORECASE)
@ -149,9 +199,10 @@ class TimesColonist(BasicNewsRecipe):
            atag = htag.a
            if atag is not None:
                url = atag['href']
                url = url.strip()
                # print("Checking >>"+url+'<<\n\r')
                if url.startswith('/'):
                    url = self.url_prefix+url
                if url in self.url_list:
                    return
                self.url_list.append(url)
@ -171,10 +222,10 @@ class TimesColonist(BasicNewsRecipe):
            if dtag is not None:
                description = self.tag_to_string(dtag,False)
            article_list.append(dict(title=title,url=url,date='',description=description,author='',content=''))
            print(sectitle+title+": description = "+description+" URL="+url+'\n\r')

    def add_section_index(self,ans,securl,sectitle):
        print("Add section url="+self.url_prefix+'/'+securl+'\n\r')
        try:
            soup = self.index_to_soup(self.url_prefix+'/'+securl)
        except:
@ -193,33 +244,7 @@ class TimesColonist(BasicNewsRecipe):
    def parse_index(self):
        ans = []
        for (url,title) in self.section_list:
            ans = self.add_section_index(ans,url,title)
        return ans

View File

@ -1,6 +1,3 @@
" Project wide builtins
let $PYFLAKES_BUILTINS = "_,dynamic_property,__,P,I,lopen,icu_lower,icu_upper,icu_title,ngettext"
" Include directories for C++ modules " Include directories for C++ modules
let g:syntastic_cpp_include_dirs = [ let g:syntastic_cpp_include_dirs = [
\'/usr/include/python2.7', \'/usr/include/python2.7',

setup.cfg (new file, +4)
View File

@ -0,0 +1,4 @@
[flake8]
max-line-length = 160
builtins = _,dynamic_property,__,P,I,lopen,icu_lower,icu_upper,icu_title,ngettext
ignore = E12,E22,E231,E301,E302,E304,E401,W391
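For context, the setup/check.py hunk later in this commit consumes this configuration by shelling out to flake8. A rough standalone equivalent (the file path is illustrative):

    # Sketch: driving flake8 as a subprocess, mirroring the setup/check.py change.
    import subprocess

    p = subprocess.Popen(['flake8', '--ignore=E,W', 'src/calibre/somefile.py'])
    if p.wait() != 0:
        raise SystemExit('flake8 reported problems')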

View File

@ -22,40 +22,12 @@ class Message:
        self.filename, self.lineno, self.msg = filename, lineno, msg

    def __str__(self):
        return '%s:%s: %s' % (self.filename, self.lineno, self.msg)
def check_for_python_errors(code_string, filename):
import _ast
# First, compile into an AST and handle syntax errors.
try:
tree = compile(code_string, filename, "exec", _ast.PyCF_ONLY_AST)
except (SyntaxError, IndentationError) as value:
msg = value.args[0]
(lineno, offset, text) = value.lineno, value.offset, value.text
# If there's an encoding problem with the file, the text is None.
if text is None:
# Avoid using msg, since for the only known case, it contains a
# bogus message that claims the encoding the file declared was
# unknown.
msg = "%s: problem decoding source" % filename
return [Message(filename, lineno, msg)]
else:
checker = __import__('pyflakes.checker').checker
# Okay, it's syntactically valid. Now check it.
w = checker.Checker(tree, filename)
w.messages.sort(lambda a, b: cmp(a.lineno, b.lineno))
return [Message(x.filename, x.lineno, x.message%x.message_args) for x in
w.messages]
class Check(Command):

    description = 'Check for errors in the calibre source code'
BUILTINS = ['_', '__', 'dynamic_property', 'I', 'P', 'lopen', 'icu_lower',
'icu_upper', 'icu_title', 'ngettext']
    CACHE = '.check-cache.pickle'

    def get_files(self, cache):
@ -65,10 +37,10 @@ class Check(Command):
                mtime = os.stat(y).st_mtime
                if cache.get(y, 0) == mtime:
                    continue
                if (f.endswith('.py') and f not in (
                        'feedparser.py', 'pyparsing.py', 'markdown.py') and
                        'prs500/driver.py' not in y):
                    yield y, mtime
                if f.endswith('.coffee'):
                    yield y, mtime
@ -79,25 +51,22 @@ class Check(Command):
            if f.endswith('.recipe') and cache.get(f, 0) != mtime:
                yield f, mtime

    def run(self, opts):
        cache = {}
        if os.path.exists(self.CACHE):
            cache = cPickle.load(open(self.CACHE, 'rb'))
builtins = list(set_builtins(self.BUILTINS))
        for f, mtime in self.get_files(cache):
            self.info('\tChecking', f)
            errors = False
            ext = os.path.splitext(f)[1]
            if ext in {'.py', '.recipe'}:
                p = subprocess.Popen(['flake8', '--ignore=E,W', f])
                if p.wait() != 0:
                    errors = True
            else:
                from calibre.utils.serve_coffee import check_coffeescript
                try:
                    check_coffeescript(f)
                except:
                    errors = True
            if errors:
@ -106,8 +75,6 @@ class Check(Command):
                    self.j(self.SRC, '../session.vim'), '-f', f])
                raise SystemExit(1)
            cache[f] = mtime
for x in builtins:
delattr(__builtin__, x)
        cPickle.dump(cache, open(self.CACHE, 'wb'), -1)
        wn_path = os.path.expanduser('~/work/servers/src/calibre_servers/main')
        if os.path.exists(wn_path):

File diff suppressed because it is too large.

View File

@ -4,7 +4,7 @@ __license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en'
__appname__ = u'calibre'
numeric_version = (0, 9, 27)
__version__ = u'.'.join(map(unicode, numeric_version))
__author__ = u"Kovid Goyal <kovid@kovidgoyal.net>"

View File

@ -757,9 +757,10 @@ from calibre.ebooks.metadata.sources.isbndb import ISBNDB
from calibre.ebooks.metadata.sources.overdrive import OverDrive
from calibre.ebooks.metadata.sources.douban import Douban
from calibre.ebooks.metadata.sources.ozon import Ozon
from calibre.ebooks.metadata.sources.google_images import GoogleImages
from calibre.ebooks.metadata.sources.big_book_search import BigBookSearch

plugins += [GoogleBooks, GoogleImages, Amazon, Edelweiss, OpenLibrary, ISBNDB, OverDrive, Douban, Ozon, BigBookSearch]

# }}}
@ -1467,6 +1468,17 @@ class StoreKoboStore(StoreBase):
    formats = ['EPUB']
    affiliate = True
class StoreKoobeStore(StoreBase):
name = 'Koobe'
author = u'Tomasz Długosz'
description = u'Księgarnia internetowa oferuje ebooki (książki elektroniczne) w postaci plików epub, mobi i pdf.'
actual_plugin = 'calibre.gui2.store.stores.koobe_plugin:KoobeStore'
drm_free_only = True
headquarters = 'PL'
formats = ['EPUB', 'MOBI', 'PDF']
affiliate = True
class StoreLegimiStore(StoreBase): class StoreLegimiStore(StoreBase):
name = 'Legimi' name = 'Legimi'
author = u'Tomasz Długosz' author = u'Tomasz Długosz'
@ -1649,6 +1661,7 @@ class StoreWoblinkStore(StoreBase):
headquarters = 'PL' headquarters = 'PL'
formats = ['EPUB', 'MOBI', 'PDF', 'WOBLINK'] formats = ['EPUB', 'MOBI', 'PDF', 'WOBLINK']
affiliate = True
class XinXiiStore(StoreBase): class XinXiiStore(StoreBase):
name = 'XinXii' name = 'XinXii'
@ -1686,6 +1699,7 @@ plugins += [
StoreGoogleBooksStore, StoreGoogleBooksStore,
StoreGutenbergStore, StoreGutenbergStore,
StoreKoboStore, StoreKoboStore,
StoreKoobeStore,
StoreLegimiStore, StoreLegimiStore,
StoreLibreDEStore, StoreLibreDEStore,
StoreLitResStore, StoreLitResStore,

View File

@ -91,7 +91,7 @@ def restore_plugin_state_to_default(plugin_or_name):
config['enabled_plugins'] = ep config['enabled_plugins'] = ep
default_disabled_plugins = set([ default_disabled_plugins = set([
'Overdrive', 'Douban Books', 'OZON.ru', 'Edelweiss', 'Google Images', 'Overdrive', 'Douban Books', 'OZON.ru', 'Edelweiss', 'Google Images', 'Big Book Search',
]) ])
def is_disabled(plugin): def is_disabled(plugin):

View File

@ -68,4 +68,5 @@ Various things that require other things before they can be migrated:
libraries/switching/on calibre startup. libraries/switching/on calibre startup.
3. From refresh in the legacy interface: Remember to flush the composite 3. From refresh in the legacy interface: Remember to flush the composite
column template cache. column template cache.
4. Replace the metadatabackup thread with the new implementation when using the new backend.
''' '''

View File

@ -41,8 +41,7 @@ Differences in semantics from pysqlite:
''' '''
class DynamicFilter(object): # {{{
class DynamicFilter(object): # {{{
'No longer used, present for legacy compatibility' 'No longer used, present for legacy compatibility'
@ -57,7 +56,7 @@ class DynamicFilter(object): # {{{
self.ids = frozenset(ids) self.ids = frozenset(ids)
# }}} # }}}
class DBPrefs(dict): # {{{ class DBPrefs(dict): # {{{
'Store preferences as key:value pairs in the db' 'Store preferences as key:value pairs in the db'
@ -114,9 +113,10 @@ class DBPrefs(dict): # {{{
return default return default
def set_namespaced(self, namespace, key, val): def set_namespaced(self, namespace, key, val):
if u':' in key: raise KeyError('Colons are not allowed in keys') if u':' in key:
if u':' in namespace: raise KeyError('Colons are not allowed in' raise KeyError('Colons are not allowed in keys')
' the namespace') if u':' in namespace:
raise KeyError('Colons are not allowed in the namespace')
key = u'namespaced:%s:%s'%(namespace, key) key = u'namespaced:%s:%s'%(namespace, key)
self[key] = val self[key] = val
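
As the KeyErrors above suggest, colons are reserved because a namespaced pref is stored under a single colon-joined key; a tiny illustration (namespace and key names are made up):

    namespace, key = 'news', 'schedule'                  # hypothetical names
    stored_key = u'namespaced:%s:%s' % (namespace, key)
    print(stored_key)                                    # namespaced:news:schedule
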
@ -170,7 +170,8 @@ def pynocase(one, two, encoding='utf-8'):
return cmp(one.lower(), two.lower()) return cmp(one.lower(), two.lower())
def _author_to_author_sort(x): def _author_to_author_sort(x):
if not x: return '' if not x:
return ''
return author_to_author_sort(x.replace('|', ',')) return author_to_author_sort(x.replace('|', ','))
def icu_collator(s1, s2): def icu_collator(s1, s2):
@ -239,9 +240,9 @@ def AumSortedConcatenate():
# }}} # }}}
class Connection(apsw.Connection): # {{{ class Connection(apsw.Connection): # {{{
BUSY_TIMEOUT = 2000 # milliseconds BUSY_TIMEOUT = 2000 # milliseconds
def __init__(self, path): def __init__(self, path):
apsw.Connection.__init__(self, path) apsw.Connection.__init__(self, path)
@ -257,7 +258,7 @@ class Connection(apsw.Connection): # {{{
self.createscalarfunction('title_sort', title_sort, 1) self.createscalarfunction('title_sort', title_sort, 1)
self.createscalarfunction('author_to_author_sort', self.createscalarfunction('author_to_author_sort',
_author_to_author_sort, 1) _author_to_author_sort, 1)
self.createscalarfunction('uuid4', lambda : str(uuid.uuid4()), self.createscalarfunction('uuid4', lambda: str(uuid.uuid4()),
0) 0)
# Dummy functions for dynamically created filters # Dummy functions for dynamically created filters
@ -380,7 +381,7 @@ class DB(object):
self.initialize_custom_columns() self.initialize_custom_columns()
self.initialize_tables() self.initialize_tables()
def initialize_prefs(self, default_prefs): # {{{ def initialize_prefs(self, default_prefs): # {{{
self.prefs = DBPrefs(self) self.prefs = DBPrefs(self)
if default_prefs is not None and not self._exists: if default_prefs is not None and not self._exists:
@ -493,7 +494,7 @@ class DB(object):
self.prefs.set('user_categories', user_cats) self.prefs.set('user_categories', user_cats)
# }}} # }}}
def initialize_custom_columns(self): # {{{ def initialize_custom_columns(self): # {{{
with self.conn: with self.conn:
# Delete previously marked custom columns # Delete previously marked custom columns
for record in self.conn.get( for record in self.conn.get(
@ -634,11 +635,11 @@ class DB(object):
self.custom_data_adapters = { self.custom_data_adapters = {
'float': adapt_number, 'float': adapt_number,
'int': adapt_number, 'int': adapt_number,
'rating':lambda x,d : x if x is None else min(10., max(0., float(x))), 'rating':lambda x,d: x if x is None else min(10., max(0., float(x))),
'bool': adapt_bool, 'bool': adapt_bool,
'comments': lambda x,d: adapt_text(x, {'is_multiple':False}), 'comments': lambda x,d: adapt_text(x, {'is_multiple':False}),
'datetime' : adapt_datetime, 'datetime': adapt_datetime,
'text':adapt_text, 'text':adapt_text,
'series':adapt_text, 'series':adapt_text,
'enumeration': adapt_enum 'enumeration': adapt_enum
@ -661,7 +662,7 @@ class DB(object):
# }}} # }}}
def initialize_tables(self): # {{{ def initialize_tables(self): # {{{
tables = self.tables = {} tables = self.tables = {}
for col in ('title', 'sort', 'author_sort', 'series_index', 'comments', for col in ('title', 'sort', 'author_sort', 'series_index', 'comments',
'timestamp', 'pubdate', 'uuid', 'path', 'cover', 'timestamp', 'pubdate', 'uuid', 'path', 'cover',
@ -866,8 +867,8 @@ class DB(object):
Read all data from the db into the python in-memory tables Read all data from the db into the python in-memory tables
''' '''
with self.conn: # Use a single transaction, to ensure nothing modifies with self.conn: # Use a single transaction, to ensure nothing modifies
# the db while we are reading # the db while we are reading
for table in self.tables.itervalues(): for table in self.tables.itervalues():
try: try:
table.read(self) table.read(self)
@ -885,7 +886,7 @@ class DB(object):
return fmt_path return fmt_path
try: try:
candidates = glob.glob(os.path.join(path, '*'+fmt)) candidates = glob.glob(os.path.join(path, '*'+fmt))
except: # If path contains strange characters this throws an exc except: # If path contains strange characters this throws an exc
candidates = [] candidates = []
if fmt and candidates and os.path.exists(candidates[0]): if fmt and candidates and os.path.exists(candidates[0]):
shutil.copyfile(candidates[0], fmt_path) shutil.copyfile(candidates[0], fmt_path)
@ -954,7 +955,7 @@ class DB(object):
if path != dest: if path != dest:
os.rename(path, dest) os.rename(path, dest)
except: except:
pass # Nothing too catastrophic happened, the cases mismatch, that's all pass # Nothing too catastrophic happened, the cases mismatch, that's all
else: else:
windows_atomic_move.copy_path_to(path, dest) windows_atomic_move.copy_path_to(path, dest)
else: else:
@ -970,7 +971,7 @@ class DB(object):
try: try:
os.rename(path, dest) os.rename(path, dest)
except: except:
pass # Nothing too catastrophic happened, the cases mismatch, that's all pass # Nothing too catastrophic happened, the cases mismatch, that's all
else: else:
if use_hardlink: if use_hardlink:
try: try:
@ -1021,7 +1022,7 @@ class DB(object):
if not os.path.exists(tpath): if not os.path.exists(tpath):
os.makedirs(tpath) os.makedirs(tpath)
if source_ok: # Migrate existing files if source_ok: # Migrate existing files
dest = os.path.join(tpath, 'cover.jpg') dest = os.path.join(tpath, 'cover.jpg')
self.copy_cover_to(current_path, dest, self.copy_cover_to(current_path, dest,
windows_atomic_move=wam, use_hardlink=True) windows_atomic_move=wam, use_hardlink=True)
@ -1064,8 +1065,18 @@ class DB(object):
os.rename(os.path.join(curpath, oldseg), os.rename(os.path.join(curpath, oldseg),
os.path.join(curpath, newseg)) os.path.join(curpath, newseg))
except: except:
break # Fail silently since nothing catastrophic has happened break # Fail silently since nothing catastrophic has happened
curpath = os.path.join(curpath, newseg) curpath = os.path.join(curpath, newseg)
def write_backup(self, path, raw):
path = os.path.abspath(os.path.join(self.library_path, path, 'metadata.opf'))
with lopen(path, 'wb') as f:
f.write(raw)
def read_backup(self, path):
path = os.path.abspath(os.path.join(self.library_path, path, 'metadata.opf'))
with lopen(path, 'rb') as f:
return f.read()
# }}} # }}}
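
The two new backend helpers simply read and write a metadata.opf file inside the book's folder; a rough sketch of the path they operate on (library and book paths are illustrative):

    import os

    library_path = '/home/user/Calibre Library'     # illustrative
    book_path = 'Author Name/Book Title (123)'      # the per-book 'path' value
    print(os.path.abspath(os.path.join(library_path, book_path, 'metadata.opf')))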

src/calibre/db/backup.py (new file, 115 lines)
View File

@ -0,0 +1,115 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8
from __future__ import (unicode_literals, division, absolute_import,
                        print_function)

__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'

import weakref, traceback
from threading import Thread, Event

from calibre import prints
from calibre.ebooks.metadata.opf2 import metadata_to_opf


class Abort(Exception):
    pass


class MetadataBackup(Thread):

    '''
    Continuously back up changed metadata into OPF files
    in the book directory. This class runs in its own
    thread.
    '''

    def __init__(self, db, interval=2, scheduling_interval=0.1):
        Thread.__init__(self)
        self.daemon = True
        self._db = weakref.ref(db)
        self.stop_running = Event()
        self.interval = interval
        self.scheduling_interval = scheduling_interval

    @property
    def db(self):
        ans = self._db()
        if ans is None:
            raise Abort()
        return ans

    def stop(self):
        self.stop_running.set()

    def wait(self, interval):
        if self.stop_running.wait(interval):
            raise Abort()

    def run(self):
        while not self.stop_running.is_set():
            try:
                self.wait(self.interval)
                self.do_one()
            except Abort:
                break

    def do_one(self):
        try:
            book_id = self.db.get_a_dirtied_book()
            if book_id is None:
                return
        except Abort:
            raise
        except:
            # Happens during interpreter shutdown
            return

        self.wait(0)

        try:
            mi, sequence = self.db.get_metadata_for_dump(book_id)
        except:
            prints('Failed to get backup metadata for id:', book_id, 'once')
            traceback.print_exc()
            self.wait(self.interval)
            try:
                mi, sequence = self.db.get_metadata_for_dump(book_id)
            except:
                prints('Failed to get backup metadata for id:', book_id, 'again, giving up')
                traceback.print_exc()
                return

        if mi is None:
            # No metadata to dump: the book is being created or was deleted
            self.db.clear_dirtied(book_id, sequence)
            return

        # Give the GUI thread a chance to do something. Python threads don't
        # have priorities, so this thread would naturally keep the processor
        # until some scheduling event happens. The wait makes such an event happen.
        self.wait(self.scheduling_interval)

        try:
            raw = metadata_to_opf(mi)
        except:
            prints('Failed to convert to opf for id:', book_id)
            traceback.print_exc()
            return

        self.wait(self.scheduling_interval)

        try:
            self.db.write_backup(book_id, raw)
        except:
            prints('Failed to write backup metadata for id:', book_id, 'once')
            self.wait(self.interval)
            try:
                self.db.write_backup(book_id, raw)
            except:
                prints('Failed to write backup metadata for id:', book_id, 'again, giving up')
                return

        self.db.clear_dirtied(book_id, sequence)

    def break_cycles(self):
        # Legacy compatibility
        pass
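
A minimal usage sketch for the new thread, mirroring the test_backup test added below; `cache` stands for an initialized calibre.db.cache.Cache instance:

    from calibre.db.backup import MetadataBackup

    mb = MetadataBackup(cache, interval=2, scheduling_interval=0.1)  # cache: assumed Cache
    mb.start()    # OPF backups for dirtied books are now written in the background
    # ... metadata edits happen here, marking books dirty ...
    mb.stop()     # signal the thread to exit
    mb.join()     # wait for it to finish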

View File

@ -7,7 +7,7 @@ __license__ = 'GPL v3'
__copyright__ = '2011, Kovid Goyal <kovid@kovidgoyal.net>' __copyright__ = '2011, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'
import os, traceback import os, traceback, random
from io import BytesIO from io import BytesIO
from collections import defaultdict from collections import defaultdict
from functools import wraps, partial from functools import wraps, partial
@ -15,7 +15,7 @@ from functools import wraps, partial
from calibre.constants import iswindows from calibre.constants import iswindows
from calibre.db import SPOOL_SIZE from calibre.db import SPOOL_SIZE
from calibre.db.categories import get_categories from calibre.db.categories import get_categories
from calibre.db.locking import create_locks, RecordLock from calibre.db.locking import create_locks
from calibre.db.errors import NoSuchFormat from calibre.db.errors import NoSuchFormat
from calibre.db.fields import create_field from calibre.db.fields import create_field
from calibre.db.search import Search from calibre.db.search import Search
@ -23,9 +23,10 @@ from calibre.db.tables import VirtualTable
from calibre.db.write import get_series_values from calibre.db.write import get_series_values
from calibre.db.lazy import FormatMetadata, FormatsList from calibre.db.lazy import FormatMetadata, FormatsList
from calibre.ebooks.metadata.book.base import Metadata from calibre.ebooks.metadata.book.base import Metadata
from calibre.ebooks.metadata.opf2 import metadata_to_opf
from calibre.ptempfile import (base_dir, PersistentTemporaryFile, from calibre.ptempfile import (base_dir, PersistentTemporaryFile,
SpooledTemporaryFile) SpooledTemporaryFile)
from calibre.utils.date import now from calibre.utils.date import now as nowf
from calibre.utils.icu import sort_key from calibre.utils.icu import sort_key
def api(f): def api(f):
@ -57,9 +58,10 @@ class Cache(object):
self.fields = {} self.fields = {}
self.composites = set() self.composites = set()
self.read_lock, self.write_lock = create_locks() self.read_lock, self.write_lock = create_locks()
self.record_lock = RecordLock(self.read_lock)
self.format_metadata_cache = defaultdict(dict) self.format_metadata_cache = defaultdict(dict)
self.formatter_template_cache = {} self.formatter_template_cache = {}
self.dirtied_cache = {}
self.dirtied_sequence = 0
self._search_api = Search(self.field_metadata.get_search_terms()) self._search_api = Search(self.field_metadata.get_search_terms())
# Implement locking for all simple read/write API methods # Implement locking for all simple read/write API methods
@ -78,17 +80,18 @@ class Cache(object):
self.initialize_dynamic() self.initialize_dynamic()
@write_api
def initialize_dynamic(self): def initialize_dynamic(self):
# Reconstruct the user categories, putting them into field_metadata # Reconstruct the user categories, putting them into field_metadata
# Assumption is that someone else will fix them if they change. # Assumption is that someone else will fix them if they change.
self.field_metadata.remove_dynamic_categories() self.field_metadata.remove_dynamic_categories()
for user_cat in sorted(self.pref('user_categories', {}).iterkeys(), key=sort_key): for user_cat in sorted(self._pref('user_categories', {}).iterkeys(), key=sort_key):
cat_name = '@' + user_cat # add the '@' to avoid name collision cat_name = '@' + user_cat # add the '@' to avoid name collision
self.field_metadata.add_user_category(label=cat_name, name=user_cat) self.field_metadata.add_user_category(label=cat_name, name=user_cat)
# add grouped search term user categories # add grouped search term user categories
muc = frozenset(self.pref('grouped_search_make_user_categories', [])) muc = frozenset(self._pref('grouped_search_make_user_categories', []))
for cat in sorted(self.pref('grouped_search_terms', {}).iterkeys(), key=sort_key): for cat in sorted(self._pref('grouped_search_terms', {}).iterkeys(), key=sort_key):
if cat in muc: if cat in muc:
# There is a chance that these can be duplicates of an existing # There is a chance that these can be duplicates of an existing
# user category. Print the exception and continue. # user category. Print the exception and continue.
@ -102,15 +105,20 @@ class Cache(object):
# self.field_metadata.add_search_category(label='search', name=_('Searches')) # self.field_metadata.add_search_category(label='search', name=_('Searches'))
self.field_metadata.add_grouped_search_terms( self.field_metadata.add_grouped_search_terms(
self.pref('grouped_search_terms', {})) self._pref('grouped_search_terms', {}))
self._search_api.change_locations(self.field_metadata.get_search_terms()) self._search_api.change_locations(self.field_metadata.get_search_terms())
self.dirtied_cache = {x:i for i, (x,) in enumerate(
self.backend.conn.execute('SELECT book FROM metadata_dirtied'))}
if self.dirtied_cache:
self.dirtied_sequence = max(self.dirtied_cache.itervalues())+1
@property @property
def field_metadata(self): def field_metadata(self):
return self.backend.field_metadata return self.backend.field_metadata
def _get_metadata(self, book_id, get_user_categories=True): # {{{ def _get_metadata(self, book_id, get_user_categories=True): # {{{
mi = Metadata(None, template_cache=self.formatter_template_cache) mi = Metadata(None, template_cache=self.formatter_template_cache)
author_ids = self._field_ids_for('authors', book_id) author_ids = self._field_ids_for('authors', book_id)
aut_list = [self._author_data(i) for i in author_ids] aut_list = [self._author_data(i) for i in author_ids]
@ -131,7 +139,7 @@ class Cache(object):
mi.author_link_map = aul mi.author_link_map = aul
mi.comments = self._field_for('comments', book_id) mi.comments = self._field_for('comments', book_id)
mi.publisher = self._field_for('publisher', book_id) mi.publisher = self._field_for('publisher', book_id)
n = now() n = nowf()
mi.timestamp = self._field_for('timestamp', book_id, default_value=n) mi.timestamp = self._field_for('timestamp', book_id, default_value=n)
mi.pubdate = self._field_for('pubdate', book_id, default_value=n) mi.pubdate = self._field_for('pubdate', book_id, default_value=n)
mi.uuid = self._field_for('uuid', book_id, mi.uuid = self._field_for('uuid', book_id,
@ -395,16 +403,19 @@ class Cache(object):
''' '''
if as_file: if as_file:
ret = SpooledTemporaryFile(SPOOL_SIZE) ret = SpooledTemporaryFile(SPOOL_SIZE)
if not self.copy_cover_to(book_id, ret): return if not self.copy_cover_to(book_id, ret):
return
ret.seek(0) ret.seek(0)
elif as_path: elif as_path:
pt = PersistentTemporaryFile('_dbcover.jpg') pt = PersistentTemporaryFile('_dbcover.jpg')
with pt: with pt:
if not self.copy_cover_to(book_id, pt): return if not self.copy_cover_to(book_id, pt):
return
ret = pt.name ret = pt.name
else: else:
buf = BytesIO() buf = BytesIO()
if not self.copy_cover_to(book_id, buf): return if not self.copy_cover_to(book_id, buf):
return
ret = buf.getvalue() ret = buf.getvalue()
if as_image: if as_image:
from PyQt4.Qt import QImage from PyQt4.Qt import QImage
@ -413,7 +424,7 @@ class Cache(object):
ret = i ret = i
return ret return ret
@api @read_api
def copy_cover_to(self, book_id, dest, use_hardlink=False): def copy_cover_to(self, book_id, dest, use_hardlink=False):
''' '''
Copy the cover to the file like object ``dest``. Returns False Copy the cover to the file like object ``dest``. Returns False
@ -422,17 +433,15 @@ class Cache(object):
copied to it iff the path is different from the current path (taking copied to it iff the path is different from the current path (taking
case sensitivity into account). case sensitivity into account).
''' '''
with self.read_lock: try:
try: path = self._field_for('path', book_id).replace('/', os.sep)
path = self._field_for('path', book_id).replace('/', os.sep) except AttributeError:
except: return False
return False
with self.record_lock.lock(book_id): return self.backend.copy_cover_to(path, dest,
return self.backend.copy_cover_to(path, dest,
use_hardlink=use_hardlink) use_hardlink=use_hardlink)
@api @read_api
def copy_format_to(self, book_id, fmt, dest, use_hardlink=False): def copy_format_to(self, book_id, fmt, dest, use_hardlink=False):
''' '''
Copy the format ``fmt`` to the file like object ``dest``. If the Copy the format ``fmt`` to the file like object ``dest``. If the
@ -441,15 +450,13 @@ class Cache(object):
the path is different from the current path (taking case sensitivity the path is different from the current path (taking case sensitivity
into account). into account).
''' '''
with self.read_lock: try:
try: name = self.fields['formats'].format_fname(book_id, fmt)
name = self.fields['formats'].format_fname(book_id, fmt) path = self._field_for('path', book_id).replace('/', os.sep)
path = self._field_for('path', book_id).replace('/', os.sep) except (KeyError, AttributeError):
except: raise NoSuchFormat('Record %d has no %s file'%(book_id, fmt))
raise NoSuchFormat('Record %d has no %s file'%(book_id, fmt))
with self.record_lock.lock(book_id): return self.backend.copy_format_to(book_id, fmt, name, path, dest,
return self.backend.copy_format_to(book_id, fmt, name, path, dest,
use_hardlink=use_hardlink) use_hardlink=use_hardlink)
@read_api @read_api
@ -520,16 +527,16 @@ class Cache(object):
this means that repeated calls yield the same this means that repeated calls yield the same
temp file (which is re-created each time) temp file (which is re-created each time)
''' '''
with self.read_lock: ext = ('.'+fmt.lower()) if fmt else ''
ext = ('.'+fmt.lower()) if fmt else ''
try:
fname = self.fields['formats'].format_fname(book_id, fmt)
except:
return None
fname += ext
if as_path: if as_path:
if preserve_filename: if preserve_filename:
with self.read_lock:
try:
fname = self.fields['formats'].format_fname(book_id, fmt)
except:
return None
fname += ext
bd = base_dir() bd = base_dir()
d = os.path.join(bd, 'format_abspath') d = os.path.join(bd, 'format_abspath')
try: try:
@ -537,36 +544,40 @@ class Cache(object):
except: except:
pass pass
ret = os.path.join(d, fname) ret = os.path.join(d, fname)
with self.record_lock.lock(book_id): try:
try: self.copy_format_to(book_id, fmt, ret)
self.copy_format_to(book_id, fmt, ret) except NoSuchFormat:
except NoSuchFormat: return None
return None
else: else:
with PersistentTemporaryFile(ext) as pt, self.record_lock.lock(book_id): with PersistentTemporaryFile(ext) as pt:
try: try:
self.copy_format_to(book_id, fmt, pt) self.copy_format_to(book_id, fmt, pt)
except NoSuchFormat: except NoSuchFormat:
return None return None
ret = pt.name ret = pt.name
elif as_file: elif as_file:
ret = SpooledTemporaryFile(SPOOL_SIZE) with self.read_lock:
with self.record_lock.lock(book_id):
try: try:
self.copy_format_to(book_id, fmt, ret) fname = self.fields['formats'].format_fname(book_id, fmt)
except NoSuchFormat: except:
return None return None
fname += ext
ret = SpooledTemporaryFile(SPOOL_SIZE)
try:
self.copy_format_to(book_id, fmt, ret)
except NoSuchFormat:
return None
ret.seek(0) ret.seek(0)
# Various bits of code try to use the name as the default # Various bits of code try to use the name as the default
# title when reading metadata, so set it # title when reading metadata, so set it
ret.name = fname ret.name = fname
else: else:
buf = BytesIO() buf = BytesIO()
with self.record_lock.lock(book_id): try:
try: self.copy_format_to(book_id, fmt, buf)
self.copy_format_to(book_id, fmt, buf) except NoSuchFormat:
except NoSuchFormat: return None
return None
ret = buf.getvalue() ret = buf.getvalue()
@ -620,6 +631,30 @@ class Cache(object):
return get_categories(self, sort=sort, book_ids=book_ids, return get_categories(self, sort=sort, book_ids=book_ids,
icon_map=icon_map) icon_map=icon_map)
@write_api
def update_last_modified(self, book_ids, now=None):
if now is None:
now = nowf()
if book_ids:
f = self.fields['last_modified']
f.writer.set_books({book_id:now for book_id in book_ids}, self.backend)
@write_api
def mark_as_dirty(self, book_ids):
self._update_last_modified(book_ids)
already_dirtied = set(self.dirtied_cache).intersection(book_ids)
new_dirtied = book_ids - already_dirtied
already_dirtied = {book_id:self.dirtied_sequence+i for i, book_id in enumerate(already_dirtied)}
if already_dirtied:
self.dirtied_sequence = max(already_dirtied.itervalues()) + 1
self.dirtied_cache.update(already_dirtied)
if new_dirtied:
self.backend.conn.executemany('INSERT OR IGNORE INTO metadata_dirtied (book) VALUES (?)',
((x,) for x in new_dirtied))
new_dirtied = {book_id:self.dirtied_sequence+i for i, book_id in enumerate(new_dirtied)}
self.dirtied_sequence = max(new_dirtied.itervalues()) + 1
self.dirtied_cache.update(new_dirtied)
@write_api @write_api
def set_field(self, name, book_id_to_val_map, allow_case_change=True): def set_field(self, name, book_id_to_val_map, allow_case_change=True):
f = self.fields[name] f = self.fields[name]
@ -637,7 +672,7 @@ class Cache(object):
else: else:
v = sid = None v = sid = None
if name.startswith('#') and sid is None: if name.startswith('#') and sid is None:
sid = 1.0 # The value will be set to 1.0 in the db table sid = 1.0 # The value will be set to 1.0 in the db table
bimap[k] = v bimap[k] = v
if sid is not None: if sid is not None:
simap[k] = sid simap[k] = sid
@ -657,7 +692,7 @@ class Cache(object):
if dirtied and update_path: if dirtied and update_path:
self._update_path(dirtied, mark_as_dirtied=False) self._update_path(dirtied, mark_as_dirtied=False)
# TODO: Mark these as dirtied so that the opf is regenerated self._mark_as_dirty(dirtied)
return dirtied return dirtied
@ -668,13 +703,115 @@ class Cache(object):
author = self._field_for('authors', book_id, default_value=(_('Unknown'),))[0] author = self._field_for('authors', book_id, default_value=(_('Unknown'),))[0]
self.backend.update_path(book_id, title, author, self.fields['path'], self.fields['formats']) self.backend.update_path(book_id, title, author, self.fields['path'], self.fields['formats'])
if mark_as_dirtied: if mark_as_dirtied:
self._mark_as_dirty(book_ids)
@read_api
def get_a_dirtied_book(self):
if self.dirtied_cache:
return random.choice(tuple(self.dirtied_cache.iterkeys()))
return None
@read_api
def get_metadata_for_dump(self, book_id):
mi = None
# get the current sequence number for this book to pass back to the
# backup thread. This will avoid double calls in the case where the
# thread has not done the work between the put and the get_metadata
sequence = self.dirtied_cache.get(book_id, None)
if sequence is not None:
try:
# While a book is being created, the path is empty. Don't bother to
# try to write the opf, because it will go to the wrong folder.
if self._field_for('path', book_id):
mi = self._get_metadata(book_id)
# Always set cover to cover.jpg. Even if cover doesn't exist,
# no harm done. This way no need to call dirtied when
# cover is set/removed
mi.cover = 'cover.jpg'
except:
# This almost certainly means that the book has been deleted while
# the backup operation sat in the queue.
pass pass
# TODO: Mark these books as dirtied so that metadata.opf is return mi, sequence
# re-created
@write_api
def clear_dirtied(self, book_id, sequence):
'''
Clear the dirtied indicator for the books. This is used when fetching
metadata, creating an OPF, and writing a file are separated into steps.
The last step is clearing the indicator.
'''
dc_sequence = self.dirtied_cache.get(book_id, None)
if dc_sequence is None or sequence is None or dc_sequence == sequence:
self.backend.conn.execute('DELETE FROM metadata_dirtied WHERE book=?',
(book_id,))
self.dirtied_cache.pop(book_id, None)
@write_api
def write_backup(self, book_id, raw):
try:
path = self._field_for('path', book_id).replace('/', os.sep)
except:
return
self.backend.write_backup(path, raw)
@read_api
def dirty_queue_length(self):
return len(self.dirtied_cache)
@read_api
def read_backup(self, book_id):
''' Return the OPF metadata backup for the book as a bytestring or None
if no such backup exists. '''
try:
path = self._field_for('path', book_id).replace('/', os.sep)
except:
return
try:
return self.backend.read_backup(path)
except EnvironmentError:
return None
@write_api
def dump_metadata(self, book_ids=None, remove_from_dirtied=True,
callback=None):
'''
Write metadata for each record to an individual OPF file. If callback
is not None, it is called once at the start with the number of book_ids
being processed, and then once for every book_id with arguments (book_id,
mi, ok).
'''
if book_ids is None:
book_ids = set(self.dirtied_cache)
if callback is not None:
callback(len(book_ids), True, False)
for book_id in book_ids:
if self._field_for('path', book_id) is None:
if callback is not None:
callback(book_id, None, False)
continue
mi, sequence = self._get_metadata_for_dump(book_id)
if mi is None:
if callback is not None:
callback(book_id, mi, False)
continue
try:
raw = metadata_to_opf(mi)
self._write_backup(book_id, raw)
if remove_from_dirtied:
self._clear_dirtied(book_id, sequence)
except:
pass
if callback is not None:
callback(book_id, mi, True)
# }}} # }}}
class SortKey(object): # {{{ class SortKey(object): # {{{
def __init__(self, fields, sort_keys, book_id): def __init__(self, fields, sort_keys, book_id):
self.orders = tuple(1 if f[1] else -1 for f in fields) self.orders = tuple(1 if f[1] else -1 for f in fields)
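
The dirtied bookkeeping threaded through Cache above boils down to a sequence-numbered set of book ids; a simplified in-memory model (the real implementation also persists entries to the metadata_dirtied table):

    # simplified model of Cache's dirtied tracking
    dirtied_cache = {}     # book_id -> sequence number at the time it was dirtied
    dirtied_sequence = 0

    def mark_as_dirty(book_id):
        global dirtied_sequence
        dirtied_cache[book_id] = dirtied_sequence
        dirtied_sequence += 1

    def clear_dirtied(book_id, sequence):
        # only clear if the book was not re-dirtied after `sequence` was
        # handed to the backup thread by get_metadata_for_dump()
        if dirtied_cache.get(book_id) == sequence:
            dirtied_cache.pop(book_id, None)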

View File

@ -18,7 +18,7 @@ from calibre.utils.config_base import tweaks
from calibre.utils.icu import sort_key from calibre.utils.icu import sort_key
from calibre.utils.search_query_parser import saved_searches from calibre.utils.search_query_parser import saved_searches
CATEGORY_SORTS = ('name', 'popularity', 'rating') # This has to be a tuple not a set CATEGORY_SORTS = ('name', 'popularity', 'rating') # This has to be a tuple not a set
class Tag(object): class Tag(object):
@ -218,7 +218,7 @@ def get_categories(dbcache, sort='name', book_ids=None, icon_map=None):
else: else:
items.append(taglist[label][n]) items.append(taglist[label][n])
# else: do nothing, to not include nodes w zero counts # else: do nothing, to not include nodes w zero counts
cat_name = '@' + user_cat # add the '@' to avoid name collision cat_name = '@' + user_cat # add the '@' to avoid name collision
# Not a problem if we accumulate entries in the icon map # Not a problem if we accumulate entries in the icon map
if icon_map is not None: if icon_map is not None:
icon_map[cat_name] = icon_map['user:'] icon_map[cat_name] = icon_map['user:']

View File

@ -31,7 +31,7 @@ class Field(object):
self.table_type = self.table.table_type self.table_type = self.table.table_type
self._sort_key = (sort_key if dt in ('text', 'series', 'enumeration') else lambda x: x) self._sort_key = (sort_key if dt in ('text', 'series', 'enumeration') else lambda x: x)
self._default_sort_key = '' self._default_sort_key = ''
if dt in { 'int', 'float', 'rating' }: if dt in {'int', 'float', 'rating'}:
self._default_sort_key = 0 self._default_sort_key = 0
elif dt == 'bool': elif dt == 'bool':
self._default_sort_key = None self._default_sort_key = None
@ -138,7 +138,7 @@ class OneToOneField(Field):
return self.table.book_col_map.iterkeys() return self.table.book_col_map.iterkeys()
def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids): def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids):
return {id_ : self._sort_key(self.table.book_col_map.get(id_, return {id_: self._sort_key(self.table.book_col_map.get(id_,
self._default_sort_key)) for id_ in all_book_ids} self._default_sort_key)) for id_ in all_book_ids}
def iter_searchable_values(self, get_metadata, candidates, default_value=None): def iter_searchable_values(self, get_metadata, candidates, default_value=None):
@ -183,7 +183,7 @@ class CompositeField(OneToOneField):
return ans return ans
def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids): def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids):
return {id_ : sort_key(self.get_value_with_cache(id_, get_metadata)) for id_ in return {id_: sort_key(self.get_value_with_cache(id_, get_metadata)) for id_ in
all_book_ids} all_book_ids}
def iter_searchable_values(self, get_metadata, candidates, default_value=None): def iter_searchable_values(self, get_metadata, candidates, default_value=None):
@ -245,7 +245,7 @@ class OnDeviceField(OneToOneField):
return iter(()) return iter(())
def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids): def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids):
return {id_ : self.for_book(id_) for id_ in return {id_: self.for_book(id_) for id_ in
all_book_ids} all_book_ids}
def iter_searchable_values(self, get_metadata, candidates, default_value=None): def iter_searchable_values(self, get_metadata, candidates, default_value=None):
@ -280,12 +280,12 @@ class ManyToOneField(Field):
return self.table.id_map.iterkeys() return self.table.id_map.iterkeys()
def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids): def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids):
ans = {id_ : self.table.book_col_map.get(id_, None) ans = {id_: self.table.book_col_map.get(id_, None)
for id_ in all_book_ids} for id_ in all_book_ids}
sk_map = {cid : (self._default_sort_key if cid is None else sk_map = {cid: (self._default_sort_key if cid is None else
self._sort_key(self.table.id_map[cid])) self._sort_key(self.table.id_map[cid]))
for cid in ans.itervalues()} for cid in ans.itervalues()}
return {id_ : sk_map[cid] for id_, cid in ans.iteritems()} return {id_: sk_map[cid] for id_, cid in ans.iteritems()}
def iter_searchable_values(self, get_metadata, candidates, default_value=None): def iter_searchable_values(self, get_metadata, candidates, default_value=None):
cbm = self.table.col_book_map cbm = self.table.col_book_map
@ -327,14 +327,14 @@ class ManyToManyField(Field):
return self.table.id_map.iterkeys() return self.table.id_map.iterkeys()
def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids): def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids):
ans = {id_ : self.table.book_col_map.get(id_, ()) ans = {id_: self.table.book_col_map.get(id_, ())
for id_ in all_book_ids} for id_ in all_book_ids}
all_cids = set() all_cids = set()
for cids in ans.itervalues(): for cids in ans.itervalues():
all_cids = all_cids.union(set(cids)) all_cids = all_cids.union(set(cids))
sk_map = {cid : self._sort_key(self.table.id_map[cid]) sk_map = {cid: self._sort_key(self.table.id_map[cid])
for cid in all_cids} for cid in all_cids}
return {id_ : (tuple(sk_map[cid] for cid in cids) if cids else return {id_: (tuple(sk_map[cid] for cid in cids) if cids else
(self._default_sort_key,)) (self._default_sort_key,))
for id_, cids in ans.iteritems()} for id_, cids in ans.iteritems()}
@ -369,9 +369,9 @@ class IdentifiersField(ManyToManyField):
def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids): def sort_keys_for_books(self, get_metadata, lang_map, all_book_ids):
'Sort by identifier keys' 'Sort by identifier keys'
ans = {id_ : self.table.book_col_map.get(id_, ()) ans = {id_: self.table.book_col_map.get(id_, ())
for id_ in all_book_ids} for id_ in all_book_ids}
return {id_ : (tuple(sorted(cids.iterkeys())) if cids else return {id_: (tuple(sorted(cids.iterkeys())) if cids else
(self._default_sort_key,)) (self._default_sort_key,))
for id_, cids in ans.iteritems()} for id_, cids in ans.iteritems()}
@ -397,9 +397,9 @@ class AuthorsField(ManyToManyField):
def author_data(self, author_id): def author_data(self, author_id):
return { return {
'name' : self.table.id_map[author_id], 'name': self.table.id_map[author_id],
'sort' : self.table.asort_map[author_id], 'sort': self.table.asort_map[author_id],
'link' : self.table.alink_map[author_id], 'link': self.table.alink_map[author_id],
} }
def category_sort_value(self, item_id, book_ids, lang_map): def category_sort_value(self, item_id, book_ids, lang_map):
@ -505,9 +505,9 @@ class TagsField(ManyToManyField):
def create_field(name, table): def create_field(name, table):
cls = { cls = {
ONE_ONE : OneToOneField, ONE_ONE: OneToOneField,
MANY_ONE : ManyToOneField, MANY_ONE: ManyToOneField,
MANY_MANY : ManyToManyField, MANY_MANY: ManyToManyField,
}[table.table_type] }[table.table_type]
if name == 'authors': if name == 'authors':
cls = AuthorsField cls = AuthorsField

View File

@ -39,7 +39,7 @@ def create_locks():
l = SHLock() l = SHLock()
return RWLockWrapper(l), RWLockWrapper(l, is_shared=False) return RWLockWrapper(l), RWLockWrapper(l, is_shared=False)
class SHLock(object): # {{{ class SHLock(object): # {{{
''' '''
Shareable lock class. Used to implement the Multiple readers-single writer Shareable lock class. Used to implement the Multiple readers-single writer
paradigm. As best as I can tell, neither writer nor reader starvation paradigm. As best as I can tell, neither writer nor reader starvation
@ -191,7 +191,7 @@ class SHLock(object): # {{{
try: try:
return self._free_waiters.pop() return self._free_waiters.pop()
except IndexError: except IndexError:
return Condition(self._lock)#, verbose=True) return Condition(self._lock)
def _return_waiter(self, waiter): def _return_waiter(self, waiter):
self._free_waiters.append(waiter) self._free_waiters.append(waiter)

View File

@ -172,7 +172,6 @@ class SchemaUpgrade(object):
''' '''
) )
def upgrade_version_6(self): def upgrade_version_6(self):
'Show authors in order' 'Show authors in order'
self.conn.execute(''' self.conn.execute('''
@ -337,7 +336,7 @@ class SchemaUpgrade(object):
FROM {tn}; FROM {tn};
'''.format(tn=table_name, cn=column_name, '''.format(tn=table_name, cn=column_name,
vcn=view_column_name, scn= sort_column_name)) vcn=view_column_name, scn=sort_column_name))
self.conn.execute(script) self.conn.execute(script)
def create_cust_tag_browser_view(table_name, link_table_name): def create_cust_tag_browser_view(table_name, link_table_name):

View File

@ -64,7 +64,7 @@ def _match(query, value, matchkind, use_primary_find_in_search=True):
else: else:
internal_match_ok = False internal_match_ok = False
for t in value: for t in value:
try: ### ignore regexp exceptions, required because search-ahead tries before typing is finished try: # ignore regexp exceptions, required because search-ahead tries before typing is finished
t = icu_lower(t) t = icu_lower(t)
if (matchkind == EQUALS_MATCH): if (matchkind == EQUALS_MATCH):
if internal_match_ok: if internal_match_ok:
@ -95,20 +95,20 @@ def _match(query, value, matchkind, use_primary_find_in_search=True):
return False return False
# }}} # }}}
class DateSearch(object): # {{{ class DateSearch(object): # {{{
def __init__(self): def __init__(self):
self.operators = { self.operators = {
'=' : (1, self.eq), '=': (1, self.eq),
'!=' : (2, self.ne), '!=': (2, self.ne),
'>' : (1, self.gt), '>': (1, self.gt),
'>=' : (2, self.ge), '>=': (2, self.ge),
'<' : (1, self.lt), '<': (1, self.lt),
'<=' : (2, self.le), '<=': (2, self.le),
} }
self.local_today = { '_today', 'today', icu_lower(_('today')) } self.local_today = {'_today', 'today', icu_lower(_('today'))}
self.local_yesterday = { '_yesterday', 'yesterday', icu_lower(_('yesterday')) } self.local_yesterday = {'_yesterday', 'yesterday', icu_lower(_('yesterday'))}
self.local_thismonth = { '_thismonth', 'thismonth', icu_lower(_('thismonth')) } self.local_thismonth = {'_thismonth', 'thismonth', icu_lower(_('thismonth'))}
self.daysago_pat = re.compile(r'(%s|daysago|_daysago)$'%_('daysago')) self.daysago_pat = re.compile(r'(%s|daysago|_daysago)$'%_('daysago'))
def eq(self, dbdate, query, field_count): def eq(self, dbdate, query, field_count):
@ -216,16 +216,16 @@ class DateSearch(object): # {{{
return matches return matches
# }}} # }}}
class NumericSearch(object): # {{{ class NumericSearch(object): # {{{
def __init__(self): def __init__(self):
self.operators = { self.operators = {
'=':( 1, lambda r, q: r == q ), '=':(1, lambda r, q: r == q),
'>':( 1, lambda r, q: r is not None and r > q ), '>':(1, lambda r, q: r is not None and r > q),
'<':( 1, lambda r, q: r is not None and r < q ), '<':(1, lambda r, q: r is not None and r < q),
'!=':( 2, lambda r, q: r != q ), '!=':(2, lambda r, q: r != q),
'>=':( 2, lambda r, q: r is not None and r >= q ), '>=':(2, lambda r, q: r is not None and r >= q),
'<=':( 2, lambda r, q: r is not None and r <= q ) '<=':(2, lambda r, q: r is not None and r <= q)
} }
def __call__(self, query, field_iter, location, datatype, candidates, is_many=False): def __call__(self, query, field_iter, location, datatype, candidates, is_many=False):
@ -267,7 +267,7 @@ class NumericSearch(object): # {{{
p, relop = self.operators['='] p, relop = self.operators['=']
cast = int cast = int
if dt == 'rating': if dt == 'rating':
cast = lambda x: 0 if x is None else int(x) cast = lambda x: 0 if x is None else int(x)
adjust = lambda x: x/2 adjust = lambda x: x/2
elif dt in ('float', 'composite'): elif dt in ('float', 'composite'):
@ -303,7 +303,7 @@ class NumericSearch(object): # {{{
# }}} # }}}
class BooleanSearch(object): # {{{ class BooleanSearch(object): # {{{
def __init__(self): def __init__(self):
self.local_no = icu_lower(_('no')) self.local_no = icu_lower(_('no'))
@ -324,27 +324,27 @@ class BooleanSearch(object): # {{{
for val, book_ids in field_iter(): for val, book_ids in field_iter():
val = force_to_bool(val) val = force_to_bool(val)
if not bools_are_tristate: if not bools_are_tristate:
if val is None or not val: # item is None or set to false if val is None or not val: # item is None or set to false
if query in { self.local_no, self.local_unchecked, 'no', '_no', 'false' }: if query in {self.local_no, self.local_unchecked, 'no', '_no', 'false'}:
matches |= book_ids matches |= book_ids
else: # item is explicitly set to true else: # item is explicitly set to true
if query in { self.local_yes, self.local_checked, 'yes', '_yes', 'true' }: if query in {self.local_yes, self.local_checked, 'yes', '_yes', 'true'}:
matches |= book_ids matches |= book_ids
else: else:
if val is None: if val is None:
if query in { self.local_empty, self.local_blank, 'empty', '_empty', 'false' }: if query in {self.local_empty, self.local_blank, 'empty', '_empty', 'false'}:
matches |= book_ids matches |= book_ids
elif not val: # is not None and false elif not val: # is not None and false
if query in { self.local_no, self.local_unchecked, 'no', '_no', 'true' }: if query in {self.local_no, self.local_unchecked, 'no', '_no', 'true'}:
matches |= book_ids matches |= book_ids
else: # item is not None and true else: # item is not None and true
if query in { self.local_yes, self.local_checked, 'yes', '_yes', 'true' }: if query in {self.local_yes, self.local_checked, 'yes', '_yes', 'true'}:
matches |= book_ids matches |= book_ids
return matches return matches
# }}} # }}}
class KeyPairSearch(object): # {{{ class KeyPairSearch(object): # {{{
def __call__(self, query, field_iter, candidates, use_primary_find): def __call__(self, query, field_iter, candidates, use_primary_find):
matches = set() matches = set()
@ -547,11 +547,12 @@ class Parser(SearchQueryParser):
field_metadata = {} field_metadata = {}
for x, fm in self.field_metadata.iteritems(): for x, fm in self.field_metadata.iteritems():
if x.startswith('@'): continue if x.startswith('@'):
continue
if fm['search_terms'] and x != 'series_sort': if fm['search_terms'] and x != 'series_sort':
all_locs.add(x) all_locs.add(x)
field_metadata[x] = fm field_metadata[x] = fm
if fm['datatype'] in { 'composite', 'text', 'comments', 'series', 'enumeration' }: if fm['datatype'] in {'composite', 'text', 'comments', 'series', 'enumeration'}:
text_fields.add(x) text_fields.add(x)
locations = all_locs if location == 'all' else {location} locations = all_locs if location == 'all' else {location}
@ -687,8 +688,8 @@ class Search(object):
dbcache, all_book_ids, dbcache.pref('grouped_search_terms'), dbcache, all_book_ids, dbcache.pref('grouped_search_terms'),
self.date_search, self.num_search, self.bool_search, self.date_search, self.num_search, self.bool_search,
self.keypair_search, self.keypair_search,
prefs[ 'limit_search_columns' ], prefs['limit_search_columns'],
prefs[ 'limit_search_columns_to' ], self.all_search_locations, prefs['limit_search_columns_to'], self.all_search_locations,
virtual_fields) virtual_fields)
try: try:

View File

@ -82,7 +82,7 @@ class OneToOneTable(Table):
self.metadata['column'], self.metadata['table'])): self.metadata['column'], self.metadata['table'])):
self.book_col_map[row[0]] = self.unserialize(row[1]) self.book_col_map[row[0]] = self.unserialize(row[1])
class PathTable(OneToOneTable): class PathTable(OneToOneTable):
def set_path(self, book_id, path, db): def set_path(self, book_id, path, db):
self.book_col_map[book_id] = path self.book_col_map[book_id] = path

View File

@ -9,15 +9,32 @@ __docformat__ = 'restructuredtext en'
import unittest, os, argparse import unittest, os, argparse
try:
import init_calibre # noqa
except ImportError:
pass
def find_tests(): def find_tests():
return unittest.defaultTestLoader.discover(os.path.dirname(os.path.abspath(__file__)), pattern='*.py') return unittest.defaultTestLoader.discover(os.path.dirname(os.path.abspath(__file__)), pattern='*.py')
if __name__ == '__main__': if __name__ == '__main__':
parser = argparse.ArgumentParser() parser = argparse.ArgumentParser()
parser.add_argument('name', nargs='?', default=None, help='The name of the test to run, e.g. writing.WritingTest.many_many_basic') parser.add_argument('name', nargs='?', default=None,
help='The name of the test to run, e.g. writing.WritingTest.many_many_basic or .many_many_basic for a shortcut')
args = parser.parse_args() args = parser.parse_args()
if args.name: if args.name and args.name.startswith('.'):
unittest.TextTestRunner(verbosity=4).run(unittest.defaultTestLoader.loadTestsFromName(args.name)) tests = find_tests()
ans = None
try:
for suite in tests:
for test in suite._tests:
for s in test:
if s._testMethodName == args.name[1:]:
tests = s
raise StopIteration()
except StopIteration:
pass
else: else:
unittest.TextTestRunner(verbosity=4).run(find_tests()) tests = unittest.defaultTestLoader.loadTestsFromName(args.name) if args.name else find_tests()
unittest.TextTestRunner(verbosity=4).run(tests)
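
With this change a single test can be invoked either by its full dotted name or by the new leading-dot shortcut, which searches all discovered suites for a matching method name, e.g. (paths assume running from a calibre source checkout):

    python src/calibre/db/tests/main.py writing.WritingTest.many_many_basic
    python src/calibre/db/tests/main.py .many_many_basic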

View File

@ -8,6 +8,7 @@ __copyright__ = '2011, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'
import datetime import datetime
from io import BytesIO
from calibre.utils.date import utc_tz from calibre.utils.date import utc_tz
from calibre.db.tests.base import BaseTest from calibre.db.tests.base import BaseTest
@ -205,6 +206,9 @@ class ReadingTest(BaseTest):
else: else:
self.assertEqual(cdata, cache.cover(book_id, as_path=True), self.assertEqual(cdata, cache.cover(book_id, as_path=True),
'Reading of null cover as path failed') 'Reading of null cover as path failed')
buf = BytesIO()
self.assertFalse(cache.copy_cover_to(99999, buf), 'copy_cover_to() did not return False for non-existent book_id')
self.assertFalse(cache.copy_cover_to(3, buf), 'copy_cover_to() did not return False for non-existent cover')
# }}} # }}}
@ -305,6 +309,7 @@ class ReadingTest(BaseTest):
def test_get_formats(self): # {{{ def test_get_formats(self): # {{{
'Test reading ebook formats using the format() method' 'Test reading ebook formats using the format() method'
from calibre.library.database2 import LibraryDatabase2 from calibre.library.database2 import LibraryDatabase2
from calibre.db.cache import NoSuchFormat
old = LibraryDatabase2(self.library_path) old = LibraryDatabase2(self.library_path)
ids = old.all_ids() ids = old.all_ids()
lf = {i:set(old.formats(i, index_is_id=True).split(',')) if old.formats( lf = {i:set(old.formats(i, index_is_id=True).split(',')) if old.formats(
@ -332,6 +337,9 @@ class ReadingTest(BaseTest):
self.assertEqual(old, f.read(), self.assertEqual(old, f.read(),
'Failed to read format as path') 'Failed to read format as path')
buf = BytesIO()
self.assertRaises(NoSuchFormat, cache.copy_format_to, 99999, 'X', buf, 'copy_format_to() failed to raise an exception for non-existent book')
self.assertRaises(NoSuchFormat, cache.copy_format_to, 1, 'X', buf, 'copy_format_to() failed to raise an exception for non-existent format')
# }}} # }}}
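
The new assertions pin down the error contract of the two copy APIs: copy_cover_to() returns False when there is nothing to copy, while copy_format_to() raises NoSuchFormat. A sketch, assuming `cache` is an initialized Cache and `book_id` an existing id:

    from io import BytesIO
    from calibre.db.errors import NoSuchFormat

    buf = BytesIO()
    if not cache.copy_cover_to(book_id, buf):    # cache, book_id: assumed to exist
        print('no cover for', book_id)
    try:
        cache.copy_format_to(book_id, 'EPUB', buf)
    except NoSuchFormat:
        print('no EPUB for', book_id)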

View File

@ -9,6 +9,7 @@ __docformat__ = 'restructuredtext en'
from collections import namedtuple from collections import namedtuple
from functools import partial from functools import partial
from io import BytesIO
from calibre.ebooks.metadata import author_to_author_sort from calibre.ebooks.metadata import author_to_author_sort
from calibre.utils.date import UNDEFINED_DATE from calibre.utils.date import UNDEFINED_DATE
@ -16,6 +17,7 @@ from calibre.db.tests.base import BaseTest
class WritingTest(BaseTest): class WritingTest(BaseTest):
# Utils {{{
def create_getter(self, name, getter=None): def create_getter(self, name, getter=None):
if getter is None: if getter is None:
if name.endswith('_index'): if name.endswith('_index'):
@ -35,7 +37,7 @@ class WritingTest(BaseTest):
ans = lambda db:partial(getattr(db, setter), commit=True) ans = lambda db:partial(getattr(db, setter), commit=True)
return ans return ans
def create_test(self, name, vals, getter=None, setter=None ): def create_test(self, name, vals, getter=None, setter=None):
T = namedtuple('Test', 'name vals getter setter') T = namedtuple('Test', 'name vals getter setter')
return T(name, vals, self.create_getter(name, getter), return T(name, vals, self.create_getter(name, getter),
self.create_setter(name, setter)) self.create_setter(name, setter))
@ -70,8 +72,9 @@ class WritingTest(BaseTest):
'Failed setting for %s, sqlite value not the same: %r != %r'%( 'Failed setting for %s, sqlite value not the same: %r != %r'%(
test.name, old_sqlite_res, sqlite_res)) test.name, old_sqlite_res, sqlite_res))
del db del db
# }}}
def test_one_one(self): # {{{ def test_one_one(self): # {{{
'Test setting of values in one-one fields' 'Test setting of values in one-one fields'
tests = [self.create_test('#yesno', (True, False, 'true', 'false', None))] tests = [self.create_test('#yesno', (True, False, 'true', 'false', None))]
for name, getter, setter in ( for name, getter, setter in (
@ -112,7 +115,7 @@ class WritingTest(BaseTest):
self.run_tests(tests) self.run_tests(tests)
# }}} # }}}
def test_many_one_basic(self): # {{{ def test_many_one_basic(self): # {{{
'Test the different code paths for writing to a many-one field' 'Test the different code paths for writing to a many-one field'
cl = self.cloned_library cl = self.cloned_library
cache = self.init_cache(cl) cache = self.init_cache(cl)
@ -199,7 +202,7 @@ class WritingTest(BaseTest):
# }}} # }}}
def test_many_many_basic(self): # {{{ def test_many_many_basic(self): # {{{
'Test the different code paths for writing to a many-many field' 'Test the different code paths for writing to a many-many field'
cl = self.cloned_library cl = self.cloned_library
cache = self.init_cache(cl) cache = self.init_cache(cl)
@ -289,6 +292,67 @@ class WritingTest(BaseTest):
ae(c.field_for('sort', 1), 'Moose, The') ae(c.field_for('sort', 1), 'Moose, The')
ae(c.field_for('sort', 2), 'Cat') ae(c.field_for('sort', 2), 'Cat')
# }}} # }}}
def test_dirtied(self): # {{{
'Test the setting of the dirtied flag and the last_modified column'
cl = self.cloned_library
cache = self.init_cache(cl)
ae, af, sf = self.assertEqual, self.assertFalse, cache.set_field
# First empty dirtied
cache.dump_metadata()
af(cache.dirtied_cache)
af(self.init_cache(cl).dirtied_cache)
prev = cache.field_for('last_modified', 3)
import calibre.db.cache as c
from datetime import timedelta
utime = prev+timedelta(days=1)
onowf = c.nowf
c.nowf = lambda: utime
try:
ae(sf('title', {3:'xxx'}), set([3]))
self.assertTrue(3 in cache.dirtied_cache)
ae(cache.field_for('last_modified', 3), utime)
cache.dump_metadata()
raw = cache.read_backup(3)
from calibre.ebooks.metadata.opf2 import OPF
opf = OPF(BytesIO(raw))
ae(opf.title, 'xxx')
finally:
c.nowf = onowf
# }}}
def test_backup(self): # {{{
'Test the automatic backup of changed metadata'
cl = self.cloned_library
cache = self.init_cache(cl)
ae, af, sf, ff = self.assertEqual, self.assertFalse, cache.set_field, cache.field_for
# First empty dirtied
cache.dump_metadata()
af(cache.dirtied_cache)
from calibre.db.backup import MetadataBackup
interval = 0.01
mb = MetadataBackup(cache, interval=interval, scheduling_interval=0)
mb.start()
try:
ae(sf('title', {1:'title1', 2:'title2', 3:'title3'}), {1,2,3})
ae(sf('authors', {1:'author1 & author2', 2:'author1 & author2', 3:'author1 & author2'}), {1,2,3})
count = 6
while cache.dirty_queue_length() and count > 0:
mb.join(interval)
count -= 1
af(cache.dirty_queue_length())
finally:
mb.stop()
mb.join(interval)
af(mb.is_alive())
from calibre.ebooks.metadata.opf2 import OPF
for book_id in (1, 2, 3):
raw = cache.read_backup(book_id)
opf = OPF(BytesIO(raw))
ae(opf.title, 'title%d'%book_id)
ae(opf.authors, ['author1', 'author2'])
# }}}

View File

@ -60,10 +60,10 @@ class View(object):
else: else:
try: try:
self._field_getters[idx] = { self._field_getters[idx] = {
'id' : self._get_id, 'id': self._get_id,
'au_map' : self.get_author_data, 'au_map': self.get_author_data,
'ondevice': self.get_ondevice, 'ondevice': self.get_ondevice,
'marked' : self.get_marked, 'marked': self.get_marked,
}[col] }[col]
except KeyError: except KeyError:
self._field_getters[idx] = partial(self.get, col) self._field_getters[idx] = partial(self.get, col)

View File

@ -417,7 +417,7 @@ def many_many(book_id_val_map, db, field, allow_case_change, *args):
# }}} # }}}
def identifiers(book_id_val_map, db, field, *args): # {{{ def identifiers(book_id_val_map, db, field, *args): # {{{
table = field.table table = field.table
updates = set() updates = set()
for book_id, identifiers in book_id_val_map.iteritems(): for book_id, identifiers in book_id_val_map.iteritems():

View File

@ -97,6 +97,12 @@ class TXTInput(InputFormatPlugin):
if not ienc: if not ienc:
ienc = 'utf-8' ienc = 'utf-8'
log.debug('No input encoding specified and could not auto detect using %s' % ienc) log.debug('No input encoding specified and could not auto detect using %s' % ienc)
# Remove BOM from start of txt as its presence can confuse markdown
import codecs
for bom in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE, codecs.BOM_UTF8, codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE):
if txt.startswith(bom):
txt = txt[len(bom):]
break
txt = txt.decode(ienc, 'replace') txt = txt.decode(ienc, 'replace')
# Replace entities # Replace entities

View File

@ -24,7 +24,7 @@ from calibre import prints, guess_type
from calibre.utils.cleantext import clean_ascii_chars from calibre.utils.cleantext import clean_ascii_chars
from calibre.utils.config import tweaks from calibre.utils.config import tweaks
class Resource(object): # {{{ class Resource(object): # {{{
''' '''
Represents a resource (usually a file on the filesystem or a URL pointing Represents a resource (usually a file on the filesystem or a URL pointing
to the web). Such resources are commonly referred to in OPF files. to the web). Such resources are commonly referred to in OPF files.
@ -68,7 +68,6 @@ class Resource(object): # {{{
self.path = os.path.abspath(os.path.join(basedir, pc.replace('/', os.sep))) self.path = os.path.abspath(os.path.join(basedir, pc.replace('/', os.sep)))
self.fragment = url[-1] self.fragment = url[-1]
def href(self, basedir=None): def href(self, basedir=None):
''' '''
Return a URL pointing to this resource. If it is a file on the filesystem Return a URL pointing to this resource. If it is a file on the filesystem
@ -90,7 +89,7 @@ class Resource(object): # {{{
return ''+frag return ''+frag
try: try:
rpath = os.path.relpath(self.path, basedir) rpath = os.path.relpath(self.path, basedir)
except ValueError: # On windows path and basedir could be on different drives except ValueError: # On windows path and basedir could be on different drives
rpath = self.path rpath = self.path
if isinstance(rpath, unicode): if isinstance(rpath, unicode):
rpath = rpath.encode('utf-8') rpath = rpath.encode('utf-8')
@ -107,7 +106,7 @@ class Resource(object): # {{{
# }}} # }}}
class ResourceCollection(object): # {{{ class ResourceCollection(object): # {{{
def __init__(self): def __init__(self):
self._resources = [] self._resources = []
@ -160,7 +159,7 @@ class ResourceCollection(object): # {{{
# }}} # }}}
class ManifestItem(Resource): # {{{ class ManifestItem(Resource): # {{{
@staticmethod @staticmethod
def from_opf_manifest_item(item, basedir): def from_opf_manifest_item(item, basedir):
@ -180,7 +179,6 @@ class ManifestItem(Resource): # {{{
self.mime_type = val self.mime_type = val
return property(fget=fget, fset=fset) return property(fget=fget, fset=fset)
def __unicode__(self): def __unicode__(self):
return u'<item id="%s" href="%s" media-type="%s" />'%(self.id, self.href(), self.media_type) return u'<item id="%s" href="%s" media-type="%s" />'%(self.id, self.href(), self.media_type)
@ -190,7 +188,6 @@ class ManifestItem(Resource): # {{{
def __repr__(self): def __repr__(self):
return unicode(self) return unicode(self)
def __getitem__(self, index): def __getitem__(self, index):
if index == 0: if index == 0:
return self.href() return self.href()
@ -200,7 +197,7 @@ class ManifestItem(Resource): # {{{
# }}} # }}}
class Manifest(ResourceCollection): # {{{ class Manifest(ResourceCollection): # {{{
@staticmethod @staticmethod
def from_opf_manifest_element(items, dir): def from_opf_manifest_element(items, dir):
@ -245,7 +242,6 @@ class Manifest(ResourceCollection): # {{{
ResourceCollection.__init__(self) ResourceCollection.__init__(self)
self.next_id = 1 self.next_id = 1
def item(self, id): def item(self, id):
for i in self: for i in self:
if i.id == id: if i.id == id:
@ -269,7 +265,7 @@ class Manifest(ResourceCollection): # {{{
# }}} # }}}
class Spine(ResourceCollection): # {{{ class Spine(ResourceCollection): # {{{
class Item(Resource): class Item(Resource):
@ -309,13 +305,10 @@ class Spine(ResourceCollection): # {{{
continue continue
return s return s
def __init__(self, manifest): def __init__(self, manifest):
ResourceCollection.__init__(self) ResourceCollection.__init__(self)
self.manifest = manifest self.manifest = manifest
def replace(self, start, end, ids): def replace(self, start, end, ids):
''' '''
Replace the items between start (inclusive) and end (not inclusive) with Replace the items between start (inclusive) and end (not inclusive) with
@ -345,7 +338,7 @@ class Spine(ResourceCollection): # {{{
# }}} # }}}
class Guide(ResourceCollection): # {{{ class Guide(ResourceCollection): # {{{
class Reference(Resource): class Reference(Resource):
@ -363,7 +356,6 @@ class Guide(ResourceCollection): # {{{
ans += 'title="%s" '%self.title ans += 'title="%s" '%self.title
return ans + '/>' return ans + '/>'
@staticmethod @staticmethod
def from_opf_guide(references, base_dir=os.getcwdu()): def from_opf_guide(references, base_dir=os.getcwdu()):
coll = Guide() coll = Guide()
@@ -484,14 +476,14 @@ def dump_dict(cats):
     return json.dumps(object_to_unicode(cats), ensure_ascii=False,
                       skipkeys=True)

 class OPF(object):  # {{{
     MIMETYPE = 'application/oebps-package+xml'
     PARSER = etree.XMLParser(recover=True)
     NAMESPACES = {
-        None : "http://www.idpf.org/2007/opf",
-        'dc' : "http://purl.org/dc/elements/1.1/",
-        'opf' : "http://www.idpf.org/2007/opf",
+        None: "http://www.idpf.org/2007/opf",
+        'dc': "http://purl.org/dc/elements/1.1/",
+        'opf': "http://www.idpf.org/2007/opf",
     }
     META = '{%s}meta' % NAMESPACES['opf']
     xpn = NAMESPACES.copy()
@@ -501,9 +493,10 @@ class OPF(object):  # {{{
     CONTENT = XPath('self::*[re:match(name(), "meta$", "i")]/@content')
     TEXT = XPath('string()')
     metadata_path = XPath('descendant::*[re:match(name(), "metadata", "i")]')
-    metadata_elem_path = XPath('descendant::*[re:match(name(), concat($name, "$"), "i") or (re:match(name(), "meta$", "i") and re:match(@name, concat("^calibre:", $name, "$"), "i"))]')
+    metadata_elem_path = XPath(
+        'descendant::*[re:match(name(), concat($name, "$"), "i") or (re:match(name(), "meta$", "i") '
+        'and re:match(@name, concat("^calibre:", $name, "$"), "i"))]')
     title_path = XPath('descendant::*[re:match(name(), "title", "i")]')
     authors_path = XPath('descendant::*[re:match(name(), "creator", "i") and (@role="aut" or @opf:role="aut" or (not(@role) and not(@opf:role)))]')
     bkp_path = XPath('descendant::*[re:match(name(), "contributor", "i") and (@role="bkp" or @opf:role="bkp")]')
@@ -640,7 +633,8 @@ class OPF(object):  # {{{
             if 'toc' in item.href().lower():
                 toc = item.path

-        if toc is None: return
+        if toc is None:
+            return
         self.toc = TOC(base_path=self.base_dir)
         is_ncx = getattr(self, 'manifest', None) is not None and \
                 self.manifest.type_for_id(toc) is not None and \
@ -976,7 +970,6 @@ class OPF(object): # {{{
return property(fget=fget, fset=fset) return property(fget=fget, fset=fset)
@dynamic_property @dynamic_property
def language(self): def language(self):
@ -990,7 +983,6 @@ class OPF(object): # {{{
return property(fget=fget, fset=fset) return property(fget=fget, fset=fset)
@dynamic_property @dynamic_property
def languages(self): def languages(self):
@ -1015,7 +1007,6 @@ class OPF(object): # {{{
return property(fget=fget, fset=fset) return property(fget=fget, fset=fset)
@dynamic_property @dynamic_property
def book_producer(self): def book_producer(self):
@ -1196,7 +1187,6 @@ class OPFCreator(Metadata):
if self.cover: if self.cover:
self.guide.set_cover(self.cover) self.guide.set_cover(self.cover)
def create_manifest(self, entries): def create_manifest(self, entries):
''' '''
Create <manifest> Create <manifest>
@@ -1615,9 +1605,9 @@ def test_user_metadata():
     from cStringIO import StringIO
     mi = Metadata('Test title', ['test author1', 'test author2'])
     um = {
-        '#myseries': { '#value#': u'test series\xe4', 'datatype':'text',
+        '#myseries': {'#value#': u'test series\xe4', 'datatype':'text',
                        'is_multiple': None, 'name': u'My Series'},
-        '#myseries_index': { '#value#': 2.45, 'datatype': 'float',
+        '#myseries_index': {'#value#': 2.45, 'datatype': 'float',
                              'is_multiple': None},
         '#mytags': {'#value#':['t1','t2','t3'], 'datatype':'text',
                     'is_multiple': '|', 'name': u'My Tags'}

View File

@ -21,7 +21,7 @@ from calibre.ebooks.metadata.book.base import Metadata
from calibre.utils.date import parse_only_date from calibre.utils.date import parse_only_date
from calibre.utils.localization import canonicalize_lang from calibre.utils.localization import canonicalize_lang
class Worker(Thread): # Get details {{{ class Worker(Thread): # Get details {{{
    '''
    Get book details from amazon's book page in a separate thread
@@ -43,12 +43,12 @@ class Worker(Thread):  # Get details {{{
     months = {
         'de': {
-            1 : ['jän'],
-            2 : ['februar'],
-            3 : ['märz'],
-            5 : ['mai'],
-            6 : ['juni'],
-            7 : ['juli'],
+            1: ['jän'],
+            2: ['februar'],
+            3: ['märz'],
+            5: ['mai'],
+            6: ['juni'],
+            7: ['juli'],
             10: ['okt'],
             12: ['dez']
         },
@@ -132,7 +132,7 @@ class Worker(Thread):  # Get details {{{
                 text()="Détails sur le produit" or \
                 text()="Detalles del producto" or \
                 text()="Detalhes do produto" or \
-                text()="登録情報"]/../div[@class="content"]
+                starts-with(text(), "登録情報")]/../div[@class="content"]
             '''
         # Editor: is for Spanish
         self.publisher_xpath = '''
@@ -235,6 +235,12 @@ class Worker(Thread):  # Get details {{{
             msg = 'Failed to parse amazon details page: %r'%self.url
             self.log.exception(msg)
             return
+        if self.domain == 'jp':
+            for a in root.xpath('//a[@href]'):
+                if 'black-curtain-redirect.html' in a.get('href'):
+                    self.url = 'http://amazon.co.jp'+a.get('href')
+                    self.log('Black curtain redirect found, following')
+                    return self.get_details()

         errmsg = root.xpath('//*[@id="errorMessage"]')
         if errmsg:
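The "black curtain" handling retries get_details() after rewriting self.url to the age-gate's redirect target. The same follow-once pattern in isolation (an illustrative sketch, not calibre code; br is assumed to be a mechanize-style browser as used by these plugins):

    import lxml.html

    def fetch_following_black_curtain(br, url, depth=0):
        # Parse the page; if it is the adult-content interstitial, follow the
        # embedded redirect link once and parse the real product page instead.
        root = lxml.html.fromstring(br.open(url).read())
        for a in root.xpath('//a[@href]'):
            href = a.get('href')
            if 'black-curtain-redirect.html' in href and depth < 1:
                return fetch_following_black_curtain(br, 'http://amazon.co.jp' + href, depth + 1)
        return root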
@@ -252,8 +258,8 @@ class Worker(Thread):  # Get details {{{
             self.log.exception('Error parsing asin for url: %r'%self.url)
             asin = None
         if self.testing:
-            import tempfile
-            with tempfile.NamedTemporaryFile(prefix=asin + '_',
+            import tempfile, uuid
+            with tempfile.NamedTemporaryFile(prefix=(asin or str(uuid.uuid4()))+ '_',
                     suffix='.html', delete=False) as f:
                 f.write(raw)
             print ('Downloaded html for', asin, 'saved in', f.name)
@ -270,7 +276,6 @@ class Worker(Thread): # Get details {{{
self.log.exception('Error parsing authors for url: %r'%self.url) self.log.exception('Error parsing authors for url: %r'%self.url)
authors = [] authors = []
if not title or not authors or not asin: if not title or not authors or not asin:
self.log.error('Could not find title/authors/asin for %r'%self.url) self.log.error('Could not find title/authors/asin for %r'%self.url)
self.log.error('ASIN: %r Title: %r Authors: %r'%(asin, title, self.log.error('ASIN: %r Title: %r Authors: %r'%(asin, title,
@ -425,7 +430,6 @@ class Worker(Thread): # Get details {{{
desc = re.sub(r'(?s)<!--.*?-->', '', desc) desc = re.sub(r'(?s)<!--.*?-->', '', desc)
return sanitize_comments_html(desc) return sanitize_comments_html(desc)
def parse_comments(self, root): def parse_comments(self, root):
ans = '' ans = ''
desc = root.xpath('//div[@id="ps-content"]/div[@class="content"]') desc = root.xpath('//div[@id="ps-content"]/div[@class="content"]')
@@ -499,7 +503,7 @@ class Worker(Thread):  # Get details {{{
     def parse_language(self, pd):
         for x in reversed(pd.xpath(self.language_xpath)):
             if x.tail:
-                raw = x.tail.strip()
+                raw = x.tail.strip().partition(',')[0].strip()
                 ans = self.lang_map.get(raw, None)
                 if ans:
                     return ans
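The added partition() call keeps only the first entry when the details page lists several languages, for example:

    raw = ' Englisch, Deutsch '.strip().partition(',')[0].strip()
    # raw == 'Englisch' -- only the first listed language is looked up in lang_map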
@@ -522,13 +526,13 @@ class Amazon(Source):
     AMAZON_DOMAINS = {
         'com': _('US'),
-        'fr' : _('France'),
-        'de' : _('Germany'),
-        'uk' : _('UK'),
-        'it' : _('Italy'),
-        'jp' : _('Japan'),
-        'es' : _('Spain'),
-        'br' : _('Brazil'),
+        'fr': _('France'),
+        'de': _('Germany'),
+        'uk': _('UK'),
+        'it': _('Italy'),
+        'jp': _('Japan'),
+        'es': _('Spain'),
+        'br': _('Brazil'),
     }

     options = (
return domain, val return domain, val
return None, None return None, None
def get_book_url(self, identifiers): # {{{ def get_book_url(self, identifiers): # {{{
domain, asin = self.get_domain_and_asin(identifiers) domain, asin = self.get_domain_and_asin(identifiers)
if domain and asin: if domain and asin:
url = None url = None
@@ -631,8 +635,7 @@ class Amazon(Source):
         mi.tags = list(map(fixcase, mi.tags))
         mi.isbn = check_isbn(mi.isbn)
-
     def create_query(self, log, title=None, authors=None, identifiers={},  # {{{
             domain=None):
         if domain is None:
             domain = self.domain
@@ -642,8 +645,8 @@ class Amazon(Source):
             domain = idomain

         # See the amazon detailed search page to get all options
-        q = { 'search-alias' : 'aps',
-              'unfiltered' : '1',
+        q = {'search-alias': 'aps',
+             'unfiltered': '1',
              }

         if domain == 'com':
@ -698,7 +701,7 @@ class Amazon(Source):
# }}} # }}}
def get_cached_cover_url(self, identifiers): # {{{ def get_cached_cover_url(self, identifiers): # {{{
url = None url = None
domain, asin = self.get_domain_and_asin(identifiers) domain, asin = self.get_domain_and_asin(identifiers)
if asin is None: if asin is None:
@@ -711,14 +714,17 @@ class Amazon(Source):
         return url
     # }}}

     def parse_results_page(self, root):  # {{{
         from lxml.html import tostring

         matches = []

         def title_ok(title):
             title = title.lower()
-            for x in ('bulk pack', '[audiobook]', '[audio cd]'):
+            bad = ['bulk pack', '[audiobook]', '[audio cd]']
+            if self.domain == 'com':
+                bad.append('(spanish edition)')
+            for x in bad:
                 if x in title:
                     return False
             return True
@@ -745,13 +751,12 @@ class Amazon(Source):
                     matches.append(a.get('href'))
                 break
         # Keep only the top 5 matches as the matches are sorted by relevance by
         # Amazon so lower matches are not likely to be very relevant
         return matches[:5]
     # }}}

     def identify(self, log, result_queue, abort, title=None, authors=None,  # {{{
             identifiers={}, timeout=30):
''' '''
Note this method will retry without identifiers automatically if no Note this method will retry without identifiers automatically if no
@ -789,7 +794,6 @@ class Amazon(Source):
log.exception(msg) log.exception(msg)
return as_unicode(msg) return as_unicode(msg)
raw = clean_ascii_chars(xml_to_unicode(raw, raw = clean_ascii_chars(xml_to_unicode(raw,
strip_encoding_pats=True, resolve_entities=True)[0]) strip_encoding_pats=True, resolve_entities=True)[0])
@ -819,7 +823,6 @@ class Amazon(Source):
# The error is almost always a not found error # The error is almost always a not found error
found = False found = False
if found: if found:
matches = self.parse_results_page(root) matches = self.parse_results_page(root)
@ -857,7 +860,7 @@ class Amazon(Source):
return None return None
# }}} # }}}
def download_cover(self, log, result_queue, abort, # {{{ def download_cover(self, log, result_queue, abort, # {{{
title=None, authors=None, identifiers={}, timeout=30, get_best_cover=False): title=None, authors=None, identifiers={}, timeout=30, get_best_cover=False):
cached_url = self.get_cached_cover_url(identifiers) cached_url = self.get_cached_cover_url(identifiers)
if cached_url is None: if cached_url is None:
@@ -894,39 +897,44 @@ class Amazon(Source):
             log.exception('Failed to download cover from:', cached_url)
     # }}}

 if __name__ == '__main__':  # tests {{{
     # To run these tests use: calibre-debug -e
     # src/calibre/ebooks/metadata/sources/amazon.py
     from calibre.ebooks.metadata.sources.test import (test_identify_plugin,
             isbn_test, title_test, authors_test, comments_test, series_test)
     com_tests = [  # {{{
+            (   # Has a spanish edition
+                {'title':'11/22/63'},
+                [title_test('11/22/63: A Novel', exact=True), authors_test(['Stephen King']),]
+            ),
             (   # + in title and uses id="main-image" for cover
                 {'title':'C++ Concurrency in Action'},
                 [title_test('C++ Concurrency in Action: Practical Multithreading',
                     exact=True),
                 ]
             ),
             (   # Series
                 {'identifiers':{'amazon':'0756407117'}},
-                [title_test(
-                "Throne of the Crescent Moon"
-                , exact=True), series_test('Crescent Moon Kingdoms', 1),
+                [title_test(
+                    "Throne of the Crescent Moon",
+                    exact=True), series_test('Crescent Moon Kingdoms', 1),
                  comments_test('Makhslood'),
                 ]
             ),
             (   # Different comments markup, using Book Description section
                 {'identifiers':{'amazon':'0982514506'}},
-                [title_test(
-                "Griffin's Destiny: Book Three: The Griffin's Daughter Trilogy"
-                , exact=True),
+                [title_test(
+                    "Griffin's Destiny: Book Three: The Griffin's Daughter Trilogy",
+                    exact=True),
                  comments_test('Jelena'), comments_test('Leslie'),
                 ]
             ),
             (   # # in title
                 {'title':'Expert C# 2008 Business Objects',
                  'authors':['Lhotka']},
                 [title_test('Expert C# 2008 Business Objects', exact=True),
@ -942,13 +950,13 @@ if __name__ == '__main__': # tests {{{
), ),
( # Sophisticated comment formatting ( # Sophisticated comment formatting
{'identifiers':{'isbn': '9781416580829'}}, {'identifiers':{'isbn': '9781416580829'}},
[title_test('Angels & Demons - Movie Tie-In: A Novel', [title_test('Angels & Demons - Movie Tie-In: A Novel',
exact=True), authors_test(['Dan Brown'])] exact=True), authors_test(['Dan Brown'])]
), ),
( # No specific problems ( # No specific problems
{'identifiers':{'isbn': '0743273567'}}, {'identifiers':{'isbn': '0743273567'}},
[title_test('The great gatsby', exact=True), [title_test('The great gatsby', exact=True),
authors_test(['F. Scott Fitzgerald'])] authors_test(['F. Scott Fitzgerald'])]
@ -961,9 +969,9 @@ if __name__ == '__main__': # tests {{{
), ),
] # }}} ] # }}}
de_tests = [ # {{{ de_tests = [ # {{{
( (
{'identifiers':{'isbn': '3548283519'}}, {'identifiers':{'isbn': '3548283519'}},
[title_test('Wer Wind Sät: Der Fünfte Fall Für Bodenstein Und Kirchhoff', [title_test('Wer Wind Sät: Der Fünfte Fall Für Bodenstein Und Kirchhoff',
@ -971,9 +979,9 @@ if __name__ == '__main__': # tests {{{
] ]
), ),
] # }}} ] # }}}
it_tests = [ # {{{ it_tests = [ # {{{
( (
{'identifiers':{'isbn': '8838922195'}}, {'identifiers':{'isbn': '8838922195'}},
[title_test('La briscola in cinque', [title_test('La briscola in cinque',
@ -981,9 +989,9 @@ if __name__ == '__main__': # tests {{{
] ]
), ),
] # }}} ] # }}}
fr_tests = [ # {{{ fr_tests = [ # {{{
( (
{'identifiers':{'isbn': '2221116798'}}, {'identifiers':{'isbn': '2221116798'}},
[title_test('L\'étrange voyage de Monsieur Daldry', [title_test('L\'étrange voyage de Monsieur Daldry',
@ -991,9 +999,9 @@ if __name__ == '__main__': # tests {{{
] ]
), ),
] # }}} ] # }}}
es_tests = [ # {{{ es_tests = [ # {{{
( (
{'identifiers':{'isbn': '8483460831'}}, {'identifiers':{'isbn': '8483460831'}},
[title_test('Tiempos Interesantes', [title_test('Tiempos Interesantes',
@@ -1001,23 +1009,28 @@ if __name__ == '__main__':  # tests {{{
             ]
         ),
     ]  # }}}

     jp_tests = [  # {{{
+            (   # Adult filtering test
+                {'identifiers':{'isbn':'4799500066'}},
+                [title_test(u' '),]
+            ),
             (   # isbn -> title, authors
-                {'identifiers':{'isbn': '9784101302720' }},
+                {'identifiers':{'isbn': '9784101302720'}},
                 [title_test(u'精霊の守り人',
                     exact=True), authors_test([u'上橋 菜穂子'])
                 ]
             ),
             (   # title, authors -> isbn (will use Shift_JIS encoding in query.)
                 {'title': u'考えない練習',
                  'authors': [u'小池 龍之介']},
                 [isbn_test('9784093881067'), ]
             ),
     ]  # }}}
br_tests = [ # {{{ br_tests = [ # {{{
( (
{'title':'Guerra dos Tronos'}, {'title':'Guerra dos Tronos'},
[title_test('A Guerra dos Tronos - As Crônicas de Gelo e Fogo', [title_test('A Guerra dos Tronos - As Crônicas de Gelo e Fogo',
@ -1025,7 +1038,7 @@ if __name__ == '__main__': # tests {{{
] ]
), ),
] # }}} ] # }}}
def do_test(domain, start=0, stop=None): def do_test(domain, start=0, stop=None):
tests = globals().get(domain+'_tests') tests = globals().get(domain+'_tests')

View File

@@ -31,7 +31,7 @@ msprefs.defaults['find_first_edition_date'] = False
 # Google covers are often poor quality (scans/errors) but they have high
 # resolution, so they trump covers from better sources. So make sure they
 # are only used if no other covers are found.
-msprefs.defaults['cover_priorities'] = {'Google':2, 'Google Images':2}
+msprefs.defaults['cover_priorities'] = {'Google':2, 'Google Images':2, 'Big Book Search':2}

 def create_log(ostream=None):
     from calibre.utils.logging import ThreadSafeLog, FileStream
@@ -429,6 +429,40 @@ class Source(Plugin):
         mi.tags = list(map(fixcase, mi.tags))
         mi.isbn = check_isbn(mi.isbn)
def download_multiple_covers(self, title, authors, urls, get_best_cover, timeout, result_queue, abort, log, prefs_name='max_covers'):
if not urls:
log('No images found for, title: %r and authors: %r'%(title, authors))
return
from threading import Thread
import time
if prefs_name:
urls = urls[:self.prefs[prefs_name]]
if get_best_cover:
urls = urls[:1]
log('Downloading %d covers'%len(urls))
workers = [Thread(target=self.download_image, args=(u, timeout, log, result_queue)) for u in urls]
for w in workers:
w.daemon = True
w.start()
alive = True
start_time = time.time()
while alive and not abort.is_set() and time.time() - start_time < timeout:
alive = False
for w in workers:
if w.is_alive():
alive = True
break
abort.wait(0.1)
def download_image(self, url, timeout, log, result_queue):
try:
ans = self.browser.open_novisit(url, timeout=timeout).read()
result_queue.put((self, ans))
log('Downloaded cover from: %s'%url)
except Exception:
self.log.exception('Failed to download cover from: %r'%url)
     # }}}

     # Metadata API {{{
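The new download_multiple_covers() gives every cover-capable plugin a shared downloader: the URL list is truncated to the plugin's max_covers preference (or to a single URL when get_best_cover is set) and fetched in daemon threads until the timeout or the abort event fires. A hypothetical minimal plugin built on it (find_cover_urls is an invented placeholder for the plugin-specific search, not a real API):

    from calibre.ebooks.metadata.sources.base import Source, Option

    class ExampleCovers(Source):  # illustrative sketch only, not part of this commit
        name = 'Example Covers'
        capabilities = frozenset(['cover'])
        can_get_multiple_covers = True
        options = (Option('max_covers', 'number', 5, _('Maximum number of covers to get'),
                          _('The maximum number of covers to process from the search result')),)

        def download_cover(self, log, result_queue, abort, title=None, authors=None,
                           identifiers={}, timeout=30, get_best_cover=False):
            urls = self.find_cover_urls(title, authors)  # invented helper
            self.download_multiple_covers(title, authors, urls, get_best_cover,
                                          timeout, result_queue, abort, log)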

View File

@ -0,0 +1,58 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
from calibre.ebooks.metadata.sources.base import Source, Option
def get_urls(br, tokens):
from urllib import quote_plus
from mechanize import Request
from lxml import html
escaped = [quote_plus(x.encode('utf-8')) for x in tokens if x and x.strip()]
q = b'+'.join(escaped)
url = 'http://bigbooksearch.com/books/'+q
br.open(url).read()
req = Request('http://bigbooksearch.com/query.php?SearchIndex=books&Keywords=%s&ItemPage=1'%q)
req.add_header('X-Requested-With', 'XMLHttpRequest')
req.add_header('Referer', url)
raw = br.open(req).read()
root = html.fromstring(raw.decode('utf-8'))
urls = [i.get('src') for i in root.xpath('//img[@src]')]
return urls
class BigBookSearch(Source):
name = 'Big Book Search'
description = _('Downloads multiple book covers from Amazon. Useful to find alternate covers.')
capabilities = frozenset(['cover'])
config_help_message = _('Configure the Big Book Search plugin')
can_get_multiple_covers = True
options = (Option('max_covers', 'number', 5, _('Maximum number of covers to get'),
_('The maximum number of covers to process from the search result')),
)
supports_gzip_transfer_encoding = True
def download_cover(self, log, result_queue, abort,
title=None, authors=None, identifiers={}, timeout=30, get_best_cover=False):
if not title:
return
br = self.browser
tokens = tuple(self.get_title_tokens(title)) + tuple(self.get_author_tokens(authors))
urls = get_urls(br, tokens)
self.download_multiple_covers(title, authors, urls, get_best_cover, timeout, result_queue, abort, log)
def test():
from calibre import browser
import pprint
br = browser()
urls = get_urls(br, ['consider', 'phlebas', 'banks'])
pprint.pprint(urls)
if __name__ == '__main__':
test()
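get_urls() mimics the site's own in-page search: it first loads the HTML search page, then issues the same XMLHttpRequest the page's JavaScript would (note the X-Requested-With and Referer headers), and scrapes the <img> results. A rough standalone equivalent with plain urllib2, assuming the query.php endpoint behaves as above (this sketch skips the initial page load the mechanize version performs):

    import urllib2
    from urllib import quote_plus
    from lxml import html

    def big_book_search_covers(tokens):
        q = '+'.join(quote_plus(t.encode('utf-8')) for t in tokens if t.strip())
        referer = 'http://bigbooksearch.com/books/' + q
        req = urllib2.Request(
            'http://bigbooksearch.com/query.php?SearchIndex=books&Keywords=%s&ItemPage=1' % q,
            headers={'X-Requested-With': 'XMLHttpRequest', 'Referer': referer})
        root = html.fromstring(urllib2.urlopen(req).read().decode('utf-8'))
        return [img.get('src') for img in root.xpath('//img[@src]')]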

View File

@@ -18,12 +18,13 @@ from calibre.utils.magick.draw import Image, save_cover_data_to

 class Worker(Thread):

-    def __init__(self, plugin, abort, title, authors, identifiers, timeout, rq):
+    def __init__(self, plugin, abort, title, authors, identifiers, timeout, rq, get_best_cover=False):
         Thread.__init__(self)
         self.daemon = True

         self.plugin = plugin
         self.abort = abort
+        self.get_best_cover = get_best_cover
         self.buf = BytesIO()
         self.log = create_log(self.buf)
         self.title, self.authors, self.identifiers = (title, authors,
@@ -37,7 +38,7 @@ class Worker(Thread):
         try:
             if self.plugin.can_get_multiple_covers:
                 self.plugin.download_cover(self.log, self.rq, self.abort,
-                        title=self.title, authors=self.authors, get_best_cover=True,
+                        title=self.title, authors=self.authors, get_best_cover=self.get_best_cover,
                         identifiers=self.identifiers, timeout=self.timeout)
             else:
                 self.plugin.download_cover(self.log, self.rq, self.abort,
@@ -72,7 +73,7 @@ def process_result(log, result):
     return (plugin, width, height, fmt, data)

 def run_download(log, results, abort,
-        title=None, authors=None, identifiers={}, timeout=30):
+        title=None, authors=None, identifiers={}, timeout=30, get_best_cover=False):
     '''
     Run the cover download, putting results into the queue :param:`results`.
@@ -89,7 +90,7 @@ def run_download(log, results, abort,
     plugins = [p for p in metadata_plugins(['cover']) if p.is_configured()]

     rq = Queue()
-    workers = [Worker(p, abort, title, authors, identifiers, timeout, rq) for p
+    workers = [Worker(p, abort, title, authors, identifiers, timeout, rq, get_best_cover=get_best_cover) for p
             in plugins]
     for w in workers:
         w.start()
@@ -163,7 +164,7 @@ def download_cover(log,
     abort = Event()
     run_download(log, rq, abort, title=title, authors=authors,
-            identifiers=identifiers, timeout=timeout)
+            identifiers=identifiers, timeout=timeout, get_best_cover=True)
     results = []

View File

@@ -106,6 +106,8 @@ class Worker(Thread):  # {{{
             parts = pub.partition(':')[0::2]
             pub = parts[1] or parts[0]
             try:
+                if ', Ship Date:' in pub:
+                    pub = pub.partition(', Ship Date:')[0]
                 q = parse_only_date(pub, assume_utc=True)
                 if q.year != UNDEFINED_DATE:
                     mi.pubdate = q

View File

@@ -39,39 +39,11 @@ class GoogleImages(Source):
             title=None, authors=None, identifiers={}, timeout=30, get_best_cover=False):
         if not title:
             return
-        from threading import Thread
-        import time
         timeout = max(60, timeout)  # Needs at least a minute
         title = ' '.join(self.get_title_tokens(title))
         author = ' '.join(self.get_author_tokens(authors))
         urls = self.get_image_urls(title, author, log, abort, timeout)
-        if not urls:
-            log('No images found in Google for, title: %r and authors: %r'%(title, author))
-            return
-        urls = urls[:self.prefs['max_covers']]
-        if get_best_cover:
-            urls = urls[:1]
-        workers = [Thread(target=self.download_image, args=(url, timeout, log, result_queue)) for url in urls]
-        for w in workers:
-            w.daemon = True
-            w.start()
-        alive = True
-        start_time = time.time()
-        while alive and not abort.is_set() and time.time() - start_time < timeout:
-            alive = False
-            for w in workers:
-                if w.is_alive():
-                    alive = True
-                    break
-            abort.wait(0.1)
-
-    def download_image(self, url, timeout, log, result_queue):
-        try:
-            ans = self.browser.open_novisit(url, timeout=timeout).read()
-            result_queue.put((self, ans))
-            log('Downloaded cover from: %s'%url)
-        except Exception:
-            self.log.exception('Failed to download cover from: %r'%url)
+        self.download_multiple_covers(title, authors, urls, get_best_cover, timeout, result_queue, abort, log)

     def get_image_urls(self, title, author, log, abort, timeout):
         from calibre.utils.ipc.simple_worker import fork_job, WorkerError

View File

@@ -51,9 +51,11 @@ def reverse_tag_iter(block):
     end = len(block)
     while True:
         pgt = block.rfind(b'>', 0, end)
-        if pgt == -1: break
+        if pgt == -1:
+            break
         plt = block.rfind(b'<', 0, pgt)
-        if plt == -1: break
+        if plt == -1:
+            break
         yield block[plt:pgt+1]
         end = plt
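reverse_tag_iter() scans a byte buffer from its end, yielding each <...> tag right-to-left; the reader uses it to locate the nearest enclosing tag before a given offset. A quick illustration:

    block = b'<p>one</p><p>two</p>'
    for tag in reverse_tag_iter(block):
        print(tag)
    # Yields, in order: </p>  <p>  </p>  <p>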
@@ -231,12 +233,12 @@ class Mobi8Reader(object):
             flowpart = self.flows[j]
             nstr = '%04d' % j
             m = svg_tag_pattern.search(flowpart)
-            if m != None:
+            if m is not None:
                 # svg
                 typ = 'svg'
                 start = m.start()
                 m2 = image_tag_pattern.search(flowpart)
-                if m2 != None:
+                if m2 is not None:
                     format = 'inline'
                     dir = None
                     fname = None
@ -320,7 +322,7 @@ class Mobi8Reader(object):
if len(pos_fid) != 2: if len(pos_fid) != 2:
continue continue
except TypeError: except TypeError:
continue # thumbnailstandard record, ignore it continue # thumbnailstandard record, ignore it
linktgt, idtext = self.get_id_tag_by_pos_fid(*pos_fid) linktgt, idtext = self.get_id_tag_by_pos_fid(*pos_fid)
if idtext: if idtext:
linktgt += b'#' + idtext linktgt += b'#' + idtext
@ -389,7 +391,7 @@ class Mobi8Reader(object):
href = None href = None
if typ in {b'FLIS', b'FCIS', b'SRCS', b'\xe9\x8e\r\n', if typ in {b'FLIS', b'FCIS', b'SRCS', b'\xe9\x8e\r\n',
b'RESC', b'BOUN', b'FDST', b'DATP', b'AUDI', b'VIDE'}: b'RESC', b'BOUN', b'FDST', b'DATP', b'AUDI', b'VIDE'}:
pass # Ignore these records pass # Ignore these records
elif typ == b'FONT': elif typ == b'FONT':
font = read_font_record(data) font = read_font_record(data)
href = "fonts/%05d.%s" % (fname_idx, font['ext']) href = "fonts/%05d.%s" % (fname_idx, font['ext'])
@@ -406,7 +408,11 @@ class Mobi8Reader(object):
             else:
                 imgtype = what(None, data)
                 if imgtype is None:
-                    imgtype = 'unknown'
+                    from calibre.utils.magick.draw import identify_data
+                    try:
+                        imgtype = identify_data(data)[2]
+                    except Exception:
+                        imgtype = 'unknown'
                 href = 'images/%05d.%s'%(fname_idx, imgtype)
                 with open(href.replace('/', os.sep), 'wb') as f:
                     f.write(data)
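The change adds a second, decode-based detection stage when imghdr's cheap header sniffing (the what() call) fails. The same two-stage idea as a standalone sketch, with PIL standing in for calibre's magick wrapper:

    import imghdr
    import io

    def detect_image_type(data):
        kind = imghdr.what(None, data)  # cheap header sniffing first
        if kind is None:
            try:
                from PIL import Image  # heavier fallback: actually decode the header
                kind = Image.open(io.BytesIO(data)).format.lower()
            except Exception:
                kind = 'unknown'
        return kind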

View File

@ -19,7 +19,7 @@ from calibre.ebooks.mobi.reader.mobi8 import Mobi8Reader
from calibre.ebooks.conversion.plumber import Plumber, create_oebbook from calibre.ebooks.conversion.plumber import Plumber, create_oebbook
from calibre.customize.ui import (plugin_for_input_format, from calibre.customize.ui import (plugin_for_input_format,
plugin_for_output_format) plugin_for_output_format)
from calibre.utils.ipc.simple_worker import fork_job from calibre.utils.ipc.simple_worker import fork_job
class BadFormat(ValueError): class BadFormat(ValueError):
pass pass
@@ -72,7 +72,8 @@ def explode(path, dest, question=lambda x:True):
         dest), no_output=True)['result']

 def set_cover(oeb):
-    if 'cover' not in oeb.guide or oeb.metadata['cover']: return
+    if 'cover' not in oeb.guide or oeb.metadata['cover']:
+        return
     cover = oeb.guide['cover']
     if cover.href in oeb.manifest.hrefs:
         item = oeb.manifest.hrefs[cover.href]
@@ -95,8 +96,9 @@ def rebuild(src_dir, dest_path):
     if not opf:
         raise ValueError('No OPF file found in %s'%src_dir)
     opf = opf[0]
-    # For debugging, uncomment the following line
-    # def fork_job(a, b, args=None, no_output=True): do_rebuild(*args)
+    # For debugging, uncomment the following two lines
+    # def fork_job(a, b, args=None, no_output=True):
+    #     do_rebuild(*args)
     fork_job('calibre.ebooks.mobi.tweak', 'do_rebuild', args=(opf, dest_path),
             no_output=True)

View File

@@ -69,7 +69,8 @@ class Resources(object):
                 cover_href = item.href

         for item in self.oeb.manifest.values():
-            if item.media_type not in OEB_RASTER_IMAGES: continue
+            if item.media_type not in OEB_RASTER_IMAGES:
+                continue
             try:
                 data = self.process_image(item.data)
             except:
@@ -116,8 +117,8 @@ class Resources(object):
         Add any images that were created after the call to add_resources()
         '''
         for item in self.oeb.manifest.values():
-            if (item.media_type not in OEB_RASTER_IMAGES or item.href in
-                    self.item_map): continue
+            if (item.media_type not in OEB_RASTER_IMAGES or item.href in self.item_map):
+                continue
             try:
                 data = self.process_image(item.data)
             except:

View File

@@ -270,7 +270,7 @@ BINARY_MIME = 'application/octet-stream'

 XHTML_CSS_NAMESPACE = u'@namespace "%s";\n' % XHTML_NS

-OEB_STYLES = set([CSS_MIME, OEB_CSS_MIME, 'text/x-oeb-css'])
+OEB_STYLES = set([CSS_MIME, OEB_CSS_MIME, 'text/x-oeb-css', 'xhtml/css'])
 OEB_DOCS = set([XHTML_MIME, 'text/html', OEB_DOC_MIME,
                 'text/x-oeb-document'])
 OEB_RASTER_IMAGES = set([GIF_MIME, JPEG_MIME, PNG_MIME])

View File

@@ -43,8 +43,8 @@ sizes, adjust margins, etc. Every action performs only the minimum set of
 changes needed for the desired effect.</p>

 <p>You should use this tool as the last step in your ebook creation process.</p>
+{0}

-<p>Note that polishing only works on files in the %s formats.</p>
+<p>Note that polishing only works on files in the %s formats.</p>\
 ''')%_(' or ').join('<b>%s</b>'%x for x in SUPPORTED),

     'subset': _('''\
@@ -69,7 +69,7 @@ text might not be covered by the subset font.</p>

 'jacket': _('''\
 <p>Insert a "book jacket" page at the start of the book that contains
 all the book metadata such as title, tags, authors, series, comments,
-etc.</p>'''),
+etc. Any previous book jacket will be replaced.</p>'''),

 'remove_jacket': _('''\
 <p>Remove a previously inserted book jacket page.</p>
@@ -85,7 +85,7 @@ when single quotes at the start of contractions are involved.</p>

 def hfix(name, raw):
     if name == 'about':
-        return raw
+        return raw.format('')
     raw = raw.replace('\n\n', '__XX__')
     raw = raw.replace('\n', ' ')
     raw = raw.replace('__XX__', '\n')

View File

@@ -180,5 +180,6 @@ class BorderParse:
         elif 'single' in border_style_list:
             new_border_dict[att] = 'single'
         else:
-            new_border_dict[att] = border_style_list[0]
+            if border_style_list:
+                new_border_dict[att] = border_style_list[0]
         return new_border_dict

View File

@@ -10,8 +10,7 @@ from functools import partial
 from threading import Thread
 from contextlib import closing

-from PyQt4.Qt import (QToolButton, QDialog, QGridLayout, QIcon, QLabel,
-        QCheckBox, QDialogButtonBox)
+from PyQt4.Qt import (QToolButton, QDialog, QGridLayout, QIcon, QLabel, QDialogButtonBox)

 from calibre.gui2.actions import InterfaceAction
 from calibre.gui2 import (error_dialog, Dispatcher, warning_dialog, gprefs,
@ -21,7 +20,7 @@ from calibre.gui2.widgets import HistoryLineEdit
from calibre.utils.config import prefs, tweaks from calibre.utils.config import prefs, tweaks
from calibre.utils.date import now from calibre.utils.date import now
class Worker(Thread): # {{{ class Worker(Thread): # {{{
def __init__(self, ids, db, loc, progress, done, delete_after): def __init__(self, ids, db, loc, progress, done, delete_after):
Thread.__init__(self) Thread.__init__(self)
@@ -71,8 +70,10 @@ class Worker(Thread):  # {{{
             mi.timestamp = now()
             self.progress(i, mi.title)
             fmts = self.db.formats(x, index_is_id=True)
-            if not fmts: fmts = []
-            else: fmts = fmts.split(',')
+            if not fmts:
+                fmts = []
+            else:
+                fmts = fmts.split(',')
             paths = []
             for fmt in fmts:
                 p = self.db.format(x, fmt, index_is_id=True,
@ -82,7 +83,7 @@ class Worker(Thread): # {{{
automerged = False automerged = False
if prefs['add_formats_to_existing']: if prefs['add_formats_to_existing']:
identical_book_list = newdb.find_identical_books(mi) identical_book_list = newdb.find_identical_books(mi)
if identical_book_list: # books with same author and nearly same title exist in newdb if identical_book_list: # books with same author and nearly same title exist in newdb
self.auto_merged_ids[x] = _('%(title)s by %(author)s')%\ self.auto_merged_ids[x] = _('%(title)s by %(author)s')%\
dict(title=mi.title, author=mi.format_field('authors')[1]) dict(title=mi.title, author=mi.format_field('authors')[1])
automerged = True automerged = True
@ -127,7 +128,7 @@ class Worker(Thread): # {{{
# }}} # }}}
class ChooseLibrary(QDialog): # {{{ class ChooseLibrary(QDialog): # {{{
def __init__(self, parent): def __init__(self, parent):
super(ChooseLibrary, self).__init__(parent) super(ChooseLibrary, self).__init__(parent)
@@ -146,12 +147,19 @@ class ChooseLibrary(QDialog):  # {{{
         b.setToolTip(_('Browse for library'))
         b.clicked.connect(self.browse)
         l.addWidget(b, 0, 2)
-        self.c = c = QCheckBox(_('&Delete after copy'))
-        l.addWidget(c, 1, 0, 1, 3)
-        self.bb = bb = QDialogButtonBox(QDialogButtonBox.Ok|QDialogButtonBox.Cancel)
+        self.bb = bb = QDialogButtonBox(QDialogButtonBox.Cancel)
         bb.accepted.connect(self.accept)
         bb.rejected.connect(self.reject)
-        l.addWidget(bb, 2, 0, 1, 3)
+        self.delete_after_copy = False
+        b = bb.addButton(_('&Copy'), bb.AcceptRole)
+        b.setIcon(QIcon(I('edit-copy.png')))
+        b.setToolTip(_('Copy to the specified library'))
+        b2 = bb.addButton(_('&Move'), bb.AcceptRole)
+        b2.clicked.connect(lambda: setattr(self, 'delete_after_copy', True))
+        b2.setIcon(QIcon(I('edit-cut.png')))
+        b2.setToolTip(_('Copy to the specified library and delete from the current library'))
+        b.setDefault(True)
+        l.addWidget(bb, 1, 0, 1, 3)
         le.setMinimumWidth(350)
         self.resize(self.sizeHint())
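The checkbox is gone: both new buttons carry AcceptRole, so either closes the dialog through accepted(), and Move simply flips a flag before doing so. The pattern in isolation (a minimal PyQt4 sketch, not the calibre class):

    from PyQt4.Qt import QDialog, QDialogButtonBox, QVBoxLayout

    class CopyMoveDialog(QDialog):
        def __init__(self, parent=None):
            QDialog.__init__(self, parent)
            self.delete_after_copy = False
            bb = QDialogButtonBox(QDialogButtonBox.Cancel)
            bb.accepted.connect(self.accept)
            bb.rejected.connect(self.reject)
            copy = bb.addButton('&Copy', QDialogButtonBox.AcceptRole)
            move = bb.addButton('&Move', QDialogButtonBox.AcceptRole)
            # Both buttons accept the dialog; Move records the intent first
            move.clicked.connect(lambda: setattr(self, 'delete_after_copy', True))
            copy.setDefault(True)
            l = QVBoxLayout(self)
            l.addWidget(bb)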
@@ -163,7 +171,7 @@ class ChooseLibrary(QDialog):  # {{{

     @property
     def args(self):
-        return (unicode(self.le.text()), self.c.isChecked())
+        return (unicode(self.le.text()), self.delete_after_copy)
     # }}}

 class CopyToLibraryAction(InterfaceAction):
@ -204,7 +212,7 @@ class CopyToLibraryAction(InterfaceAction):
self.menu.addAction(name, partial(self.copy_to_library, self.menu.addAction(name, partial(self.copy_to_library,
loc)) loc))
self.menu.addAction(name + ' ' + _('(delete after copy)'), self.menu.addAction(name + ' ' + _('(delete after copy)'),
partial(self.copy_to_library, loc, delete_after=True)) partial(self.copy_to_library, loc, delete_after=True))
self.menu.addSeparator() self.menu.addSeparator()
self.menu.addAction(_('Choose library by path...'), self.choose_library) self.menu.addAction(_('Choose library by path...'), self.choose_library)
@@ -214,6 +222,8 @@ class CopyToLibraryAction(InterfaceAction):
         d = ChooseLibrary(self.gui)
         if d.exec_() == d.Accepted:
             path, delete_after = d.args
+            if not path:
+                return
             db = self.gui.library_view.model().db
             current = os.path.normcase(os.path.abspath(db.library_path))
             if current == os.path.normcase(os.path.abspath(path)):

View File

@@ -180,6 +180,13 @@ class DeleteAction(InterfaceAction):
                 self.gui.library_view.currentIndex())
         self.gui.tags_view.recount()

+    def restore_format(self, book_id, original_fmt):
+        self.gui.current_db.restore_original_format(book_id, original_fmt)
+        self.gui.library_view.model().refresh_ids([book_id])
+        self.gui.library_view.model().current_changed(self.gui.library_view.currentIndex(),
+                self.gui.library_view.currentIndex())
+        self.gui.tags_view.recount()
+
     def delete_selected_formats(self, *args):
         ids = self._get_selected_ids()
         if not ids:

View File

@@ -279,7 +279,7 @@ class EditMetadataAction(InterfaceAction):
         '''
         Edit metadata of selected books in library in bulk.
         '''
-        rows = [r.row() for r in \
+        rows = [r.row() for r in
                 self.gui.library_view.selectionModel().selectedRows()]
         m = self.gui.library_view.model()
         ids = [m.id(r) for r in rows]
@@ -469,45 +469,39 @@ class EditMetadataAction(InterfaceAction):
         if not had_orig_cover and dest_cover:
             db.set_cover(dest_id, dest_cover)

-        for key in db.field_metadata: #loop thru all defined fields
-            if db.field_metadata[key]['is_custom']:
-                colnum = db.field_metadata[key]['colnum']
-                # Get orig_dest_comments before it gets changed
-                if db.field_metadata[key]['datatype'] == 'comments':
-                    orig_dest_value = db.get_custom(dest_id, num=colnum, index_is_id=True)
-                for src_id in src_ids:
-                    dest_value = db.get_custom(dest_id, num=colnum, index_is_id=True)
-                    src_value = db.get_custom(src_id, num=colnum, index_is_id=True)
-                    if db.field_metadata[key]['datatype'] == 'comments':
-                        if src_value and src_value != orig_dest_value:
-                            if not dest_value:
-                                db.set_custom(dest_id, src_value, num=colnum)
-                            else:
-                                dest_value = unicode(dest_value) + u'\n\n' + unicode(src_value)
-                                db.set_custom(dest_id, dest_value, num=colnum)
-                    if db.field_metadata[key]['datatype'] in \
-                            ('bool', 'int', 'float', 'rating', 'datetime') \
-                            and dest_value is None:
-                        db.set_custom(dest_id, src_value, num=colnum)
-                    if db.field_metadata[key]['datatype'] == 'series' \
-                            and not dest_value:
-                        if src_value:
-                            src_index = db.get_custom_extra(src_id, num=colnum, index_is_id=True)
-                            db.set_custom(dest_id, src_value, num=colnum, extra=src_index)
-                    if (db.field_metadata[key]['datatype'] == 'enumeration' or
-                            (db.field_metadata[key]['datatype'] == 'text' and
-                             not db.field_metadata[key]['is_multiple'])
-                            and not dest_value):
-                        db.set_custom(dest_id, src_value, num=colnum)
-                    if db.field_metadata[key]['datatype'] == 'text' \
-                            and db.field_metadata[key]['is_multiple']:
-                        if src_value:
-                            if not dest_value:
-                                dest_value = src_value
-                            else:
-                                dest_value.extend(src_value)
-                            db.set_custom(dest_id, dest_value, num=colnum)
+        for key in db.field_metadata:  # loop thru all defined fields
+            fm = db.field_metadata[key]
+            if not fm['is_custom']:
+                continue
+            dt = fm['datatype']
+            colnum = fm['colnum']
+            # Get orig_dest_comments before it gets changed
+            if dt == 'comments':
+                orig_dest_value = db.get_custom(dest_id, num=colnum, index_is_id=True)
+            for src_id in src_ids:
+                dest_value = db.get_custom(dest_id, num=colnum, index_is_id=True)
+                src_value = db.get_custom(src_id, num=colnum, index_is_id=True)
+                if (dt == 'comments' and src_value and src_value != orig_dest_value):
+                    if not dest_value:
+                        db.set_custom(dest_id, src_value, num=colnum)
+                    else:
+                        dest_value = unicode(dest_value) + u'\n\n' + unicode(src_value)
+                        db.set_custom(dest_id, dest_value, num=colnum)
+                if (dt in {'bool', 'int', 'float', 'rating', 'datetime'} and dest_value is None):
+                    db.set_custom(dest_id, src_value, num=colnum)
+                if (dt == 'series' and not dest_value and src_value):
+                    src_index = db.get_custom_extra(src_id, num=colnum, index_is_id=True)
+                    db.set_custom(dest_id, src_value, num=colnum, extra=src_index)
+                if (dt == 'enumeration' or (dt == 'text' and not fm['is_multiple']) and not dest_value):
+                    db.set_custom(dest_id, src_value, num=colnum)
+                if (dt == 'text' and fm['is_multiple'] and src_value):
+                    if not dest_value:
+                        dest_value = src_value
+                    else:
+                        dest_value.extend(src_value)
+                    db.set_custom(dest_id, dest_value, num=colnum)
         # }}}
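The rewrite hoists the repeated db.field_metadata[key][...] lookups into fm/dt and replaces the nested ifs with one guard per datatype; behaviour is otherwise unchanged. Note that in both versions the enumeration test parses as dt == 'enumeration' or ((dt == 'text' and not is_multiple) and not dest_value), since "and" binds tighter than "or", so an enumeration column is overwritten even when the destination already has a value; that appears to be a pre-existing precedence quirk rather than something this commit introduces. The merge rules, distilled into a hypothetical pure function (applying the presumably intended enumeration rule; not calibre API):

    def merge_custom_value(dt, is_multiple, dest, src):
        # Illustrative distillation of the per-datatype merge rules above
        if dt == 'comments' and src:
            return src if not dest else dest + u'\n\n' + src
        if dt in {'bool', 'int', 'float', 'rating', 'datetime'}:
            return src if dest is None else dest
        if dt == 'series':
            return dest or src  # the series index travels separately via get_custom_extra
        if dt == 'enumeration' or (dt == 'text' and not is_multiple):
            return dest or src
        if dt == 'text' and is_multiple and src:
            return (dest or []) + list(src)
        return dest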
def edit_device_collections(self, view, oncard=None): def edit_device_collections(self, view, oncard=None):
model = view.model() model = view.model()
@ -515,8 +509,8 @@ class EditMetadataAction(InterfaceAction):
d = DeviceCategoryEditor(self.gui, tag_to_match=None, data=result, key=sort_key) d = DeviceCategoryEditor(self.gui, tag_to_match=None, data=result, key=sort_key)
d.exec_() d.exec_()
if d.result() == d.Accepted: if d.result() == d.Accepted:
to_rename = d.to_rename # dict of new text to old ids to_rename = d.to_rename # dict of new text to old ids
to_delete = d.to_delete # list of ids to_delete = d.to_delete # list of ids
for old_id, new_name in to_rename.iteritems(): for old_id, new_name in to_rename.iteritems():
model.rename_collection(old_id, new_name=unicode(new_name)) model.rename_collection(old_id, new_name=unicode(new_name))
for item in to_delete: for item in to_delete:
@ -585,7 +579,6 @@ class EditMetadataAction(InterfaceAction):
self.apply_pd.value += 1 self.apply_pd.value += 1
QTimer.singleShot(50, self.do_one_apply) QTimer.singleShot(50, self.do_one_apply)
def apply_mi(self, book_id, mi): def apply_mi(self, book_id, mi):
db = self.gui.current_db db = self.gui.current_db

View File

@@ -37,7 +37,13 @@ class Polish(QDialog):  # {{{
         self.setWindowTitle(title)

         self.help_text = {
-            'polish': _('<h3>About Polishing books</h3>%s')%HELP['about'],
+            'polish': _('<h3>About Polishing books</h3>%s')%HELP['about'].format(
+                _('''<p>If you have both EPUB and ORIGINAL_EPUB in your book,
+                then polishing will run on ORIGINAL_EPUB (the same for other
+                ORIGINAL_* formats). So if you
+                want Polishing to not run on the ORIGINAL_* format, delete the
+                ORIGINAL_* format before running it.</p>''')
+            ),

             'subset':_('<h3>Subsetting fonts</h3>%s')%HELP['subset'],

View File

@@ -88,9 +88,7 @@ class StoreAction(InterfaceAction):
         if row == None:
             error_dialog(self.gui, _('Cannot search'), _('No book selected'), show=True)
             return
-
-        query = 'author:"%s"' % self._get_author(row)
-        self.search(query)
+        self.search({ 'author': self._get_author(row) })

     def _get_title(self, row):
         title = ''
@@ -107,18 +105,14 @@ class StoreAction(InterfaceAction):
         if row == None:
             error_dialog(self.gui, _('Cannot search'), _('No book selected'), show=True)
             return
-
-        query = 'title:"%s"' % self._get_title(row)
-        self.search(query)
+        self.search({ 'title': self._get_title(row) })

     def search_author_title(self):
         row = self._get_selected_row()
         if row == None:
             error_dialog(self.gui, _('Cannot search'), _('No book selected'), show=True)
             return
-
-        query = 'author:"%s" title:"%s"' % (self._get_author(row), self._get_title(row))
-        self.search(query)
+        self.search({ 'author': self._get_author(row), 'title': self._get_title(row) })

     def choose(self):
         from calibre.gui2.store.config.chooser.chooser_dialog import StoreChooserDialog

View File

@@ -405,6 +405,7 @@ class BookInfo(QWebView):
     link_clicked = pyqtSignal(object)
     remove_format = pyqtSignal(int, object)
     save_format = pyqtSignal(int, object)
+    restore_format = pyqtSignal(int, object)

     def __init__(self, vertical, parent=None):
         QWebView.__init__(self, parent)
@@ -418,7 +419,7 @@ class BookInfo(QWebView):
         palette.setBrush(QPalette.Base, Qt.transparent)
         self.page().setPalette(palette)
         self.css = P('templates/book_details.css', data=True).decode('utf-8')
-        for x, icon in [('remove', 'trash.png'), ('save', 'save.png')]:
+        for x, icon in [('remove', 'trash.png'), ('save', 'save.png'), ('restore', 'edit-undo.png')]:
             ac = QAction(QIcon(I(icon)), '', self)
             ac.current_fmt = None
             ac.triggered.connect(getattr(self, '%s_format_triggerred'%x))
@@ -436,6 +437,9 @@ class BookInfo(QWebView):
     def save_format_triggerred(self):
         self.context_action_triggered('save')

+    def restore_format_triggerred(self):
+        self.context_action_triggered('restore')
+
     def link_activated(self, link):
         self._link_clicked = True
         if unicode(link.scheme()) in ('http', 'https'):
@@ -479,7 +483,11 @@ class BookInfo(QWebView):
                 traceback.print_exc()
             else:
                 for a, t in [('remove', _('Delete the %s format')),
-                        ('save', _('Save the %s format to disk'))]:
+                        ('save', _('Save the %s format to disk')),
+                        ('restore', _('Restore the %s format')),
+                        ]:
+                    if a == 'restore' and not fmt.upper().startswith('ORIGINAL_'):
+                        continue
                     ac = getattr(self, '%s_format_action'%a)
                     ac.current_fmt = (book_id, fmt)
                     ac.setText(t%parts[2])
@@ -585,6 +593,7 @@ class BookDetails(QWidget):  # {{{
     view_specific_format = pyqtSignal(int, object)
     remove_specific_format = pyqtSignal(int, object)
     save_specific_format = pyqtSignal(int, object)
+    restore_specific_format = pyqtSignal(int, object)
     remote_file_dropped = pyqtSignal(object, object)
     files_dropped = pyqtSignal(object, object)
     cover_changed = pyqtSignal(object, object)
@@ -654,6 +663,7 @@ class BookDetails(QWidget):  # {{{
         self.book_info.link_clicked.connect(self.handle_click)
         self.book_info.remove_format.connect(self.remove_specific_format)
         self.book_info.save_format.connect(self.save_specific_format)
+        self.book_info.restore_format.connect(self.restore_specific_format)
         self.setCursor(Qt.PointingHandCursor)

     def handle_click(self, link):
def handle_click(self, link): def handle_click(self, link):

View File

@@ -272,6 +272,8 @@ class LayoutMixin(object):  # {{{
             self.iactions['Remove Books'].remove_format_by_id)
         self.book_details.save_specific_format.connect(
             self.iactions['Save To Disk'].save_library_format_by_ids)
+        self.book_details.restore_specific_format.connect(
+            self.iactions['Remove Books'].restore_format)
         self.book_details.view_device_book.connect(
             self.iactions['View'].view_device_book)

View File

@@ -18,7 +18,8 @@ from calibre.gui2.dialogs.message_box import ViewLog

 Question = namedtuple('Question', 'payload callback cancel_callback '
         'title msg html_log log_viewer_title log_is_file det_msg '
-        'show_copy_button checkbox_msg checkbox_checked')
+        'show_copy_button checkbox_msg checkbox_checked action_callback '
+        'action_label action_icon')

 class ProceedQuestion(QDialog):
@@ -51,6 +52,8 @@ class ProceedQuestion(QDialog):
         self.copy_button = self.bb.addButton(_('&Copy to clipboard'),
                 self.bb.ActionRole)
         self.copy_button.clicked.connect(self.copy_to_clipboard)
+        self.action_button = self.bb.addButton('', self.bb.ActionRole)
+        self.action_button.clicked.connect(self.action_clicked)
         self.show_det_msg = _('Show &details')
         self.hide_det_msg = _('Hide &details')
         self.det_msg_toggle = self.bb.addButton(self.show_det_msg, self.bb.ActionRole)
@@ -81,6 +84,12 @@ class ProceedQuestion(QDialog):
                 unicode(self.det_msg.toPlainText())))
         self.copy_button.setText(_('Copied'))

+    def action_clicked(self):
+        if self.questions:
+            q = self.questions[0]
+            self.questions[0] = q._replace(callback=q.action_callback)
+        self.accept()
+
     def accept(self):
         if self.questions:
             payload, callback, cancel_callback = self.questions[0][:3]
@@ -123,13 +132,19 @@ class ProceedQuestion(QDialog):
         self.resize(sz)

     def show_question(self):
-        if self.isVisible(): return
+        if self.isVisible():
+            return
         if self.questions:
             question = self.questions[0]
             self.msg_label.setText(question.msg)
             self.setWindowTitle(question.title)
             self.log_button.setVisible(bool(question.html_log))
             self.copy_button.setVisible(bool(question.show_copy_button))
+            self.action_button.setVisible(question.action_callback is not None)
+            if question.action_callback is not None:
+                self.action_button.setText(question.action_label or '')
+                self.action_button.setIcon(
+                    QIcon() if question.action_icon is None else question.action_icon)
             self.det_msg.setPlainText(question.det_msg or '')
             self.det_msg.setVisible(False)
             self.det_msg_toggle.setVisible(bool(question.det_msg))
@ -145,7 +160,8 @@ class ProceedQuestion(QDialog):
def __call__(self, callback, payload, html_log, log_viewer_title, title, def __call__(self, callback, payload, html_log, log_viewer_title, title,
msg, det_msg='', show_copy_button=False, cancel_callback=None, msg, det_msg='', show_copy_button=False, cancel_callback=None,
log_is_file=False, checkbox_msg=None, checkbox_checked=False): log_is_file=False, checkbox_msg=None, checkbox_checked=False,
action_callback=None, action_label=None, action_icon=None):
''' '''
A non modal popup that notifies the user that a background task has A non modal popup that notifies the user that a background task has
been completed. This class guarantees that only a single popup is been completed. This class guarantees that only a single popup is
@ -170,11 +186,19 @@ class ProceedQuestion(QDialog):
called with both the payload and the state of the called with both the payload and the state of the
checkbox as arguments. checkbox as arguments.
:param checkbox_checked: If True the checkbox is checked by default. :param checkbox_checked: If True the checkbox is checked by default.
:param action_callback: If not None, an extra button is added, which
when clicked will cause action_callback to be called
instead of callback. action_callback is called in
exactly the same way as callback.
:param action_label: The text on the action button
:param action_icon: The icon for the action button, must be a QIcon object or None
''' '''
question = Question(payload, callback, cancel_callback, title, msg, question = Question(
html_log, log_viewer_title, log_is_file, det_msg, payload, callback, cancel_callback, title, msg, html_log,
show_copy_button, checkbox_msg, checkbox_checked) log_viewer_title, log_is_file, det_msg, show_copy_button,
checkbox_msg, checkbox_checked, action_callback, action_label,
action_icon)
self.questions.append(question) self.questions.append(question)
self.show_question() self.show_question()
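
For orientation, a minimal usage sketch of the new action button (not part of the diff; `proceed` stands for a ProceedQuestion instance owned by the GUI, and the callbacks, payload and labels are hypothetical):

    from PyQt4.Qt import QIcon

    def on_done(payload):
        pass  # normal handling when the user clicks the default button

    def on_alternate(payload):
        pass  # runs instead of on_done when the action button is clicked

    proceed(on_done, {'job': 1}, '<pre>log</pre>', 'Job log',
            'Job finished', 'The background job completed.',
            action_callback=on_alternate, action_label='&Other action',
            action_icon=QIcon())

Clicking the action button swaps callback for action_callback on the queued question (via _replace) and then accepts the dialog, so both paths share the same accept logic.
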
@@ -62,16 +62,20 @@ class SearchDialog(QDialog, Ui_Dialog):
         self.setup_store_checks()

         # Set the search query
+        if isinstance(query, (str, unicode)):
+            self.search_edit.setText(query)
+        elif isinstance(query, dict):
+            if 'author' in query:
+                self.search_author.setText(query['author'])
+            if 'title' in query:
+                self.search_title.setText(query['title'])
+
         # Title
-        self.search_title.setText(query)
         self.search_title.setSizeAdjustPolicy(QComboBox.AdjustToMinimumContentsLengthWithIcon)
         self.search_title.setMinimumContentsLength(25)
         # Author
-        self.search_author.setText(query)
         self.search_author.setSizeAdjustPolicy(QComboBox.AdjustToMinimumContentsLengthWithIcon)
         self.search_author.setMinimumContentsLength(25)
         # Keyword
-        self.search_edit.setText(query)
         self.search_edit.setSizeAdjustPolicy(QComboBox.AdjustToMinimumContentsLengthWithIcon)
         self.search_edit.setMinimumContentsLength(25)
@@ -408,7 +412,7 @@ class SearchDialog(QDialog, Ui_Dialog):
         self.save_state()

     def exec_(self):
-        if unicode(self.search_edit.text()).strip():
+        if unicode(self.search_edit.text()).strip() or unicode(self.search_title.text()).strip() or unicode(self.search_author.text()).strip():
             self.do_search()
         return QDialog.exec_(self)
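
The dialog now accepts either a plain keyword string or a dict with 'title' and/or 'author' keys. Both shapes, sketched (how the dialog is constructed here is illustrative; only the two query forms come from this hunk):

    d = SearchDialog(gui, query='dune frank herbert')        # keyword search, as before
    d = SearchDialog(gui, query={'title': 'Dune',            # field-targeted search,
                                 'author': 'Frank Herbert'})  # new with this change

With a dict, the title and author boxes are pre-filled individually, and the widened exec_() check still triggers an immediate search even though the keyword box stays empty.
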
@@ -1,91 +1,104 @@
 # -*- coding: utf-8 -*-

 from __future__ import (unicode_literals, division, absolute_import, print_function)
-store_version = 2  # Needed for dynamic plugin loading
+store_version = 3  # Needed for dynamic plugin loading

 __license__ = 'GPL 3'
-__copyright__ = '2011, John Schember <john@nachtimwald.com>'
+__copyright__ = '2011, 2013, John Schember <john@nachtimwald.com>'
 __docformat__ = 'restructuredtext en'

+import base64
 import mimetypes
+import re
 import urllib
 from contextlib import closing

-from lxml import html
+from lxml import etree

-from PyQt4.Qt import QUrl
-
-from calibre import browser, random_user_agent, url_slash_cleaner
-from calibre.gui2 import open_url
-from calibre.gui2.store import StorePlugin
+from calibre import browser, url_slash_cleaner
+from calibre.constants import __version__
 from calibre.gui2.store.basic_config import BasicStoreConfig
+from calibre.gui2.store.opensearch_store import OpenSearchOPDSStore
 from calibre.gui2.store.search_result import SearchResult
-from calibre.gui2.store.web_store_dialog import WebStoreDialog

-class GutenbergStore(BasicStoreConfig, StorePlugin):
+class GutenbergStore(BasicStoreConfig, OpenSearchOPDSStore):

-    def open(self, parent=None, detail_item=None, external=False):
-        url = 'http://gutenberg.org/'
-
-        if detail_item:
-            detail_item = url_slash_cleaner(url + detail_item)
-
-        if external or self.config.get('open_external', False):
-            open_url(QUrl(detail_item if detail_item else url))
-        else:
-            d = WebStoreDialog(self.gui, url, parent, detail_item)
-            d.setWindowTitle(self.name)
-            d.set_tags(self.config.get('tags', ''))
-            d.exec_()
+    open_search_url = 'http://www.gutenberg.org/catalog/osd-books.xml'
+    web_url = 'http://m.gutenberg.org/'

     def search(self, query, max_results=10, timeout=60):
-        url = 'http://m.gutenberg.org/ebooks/search.mobile/?default_prefix=all&sort_order=title&query=' + urllib.quote_plus(query)
-
-        br = browser(user_agent=random_user_agent())
-
+        '''
+        Gutenberg's OPDS feed is poorly implemented and has a number of issues
+        which require very special handling to fix the results.
+
+        Issues:
+        * "Sort Alphabetically" and "Sort by Release Date" are returned
+          as book entries.
+        * The author is put into a "content" tag and not the author tag.
+        * The link to the book itself goes to an OPDS page which we need
+          to turn into a link to a web page.
+        * Acquisition links are not part of the search result, so we have
+          to go to the OPDS item itself. Detail item pages have a nasty
+          note saying:
+            DON'T USE THIS PAGE FOR SCRAPING.
+            Seriously. You'll only get your IP blocked.
+          We're using the OPDS feed because people were getting blocked with
+          the previous implementation, so using OPDS probably won't solve
+          this issue either.
+        * Images are not links but base64 encoded strings. They are also not
+          real cover images but a little blue book thumbnail.
+        '''
+        url = 'http://m.gutenberg.org/ebooks/search.opds/?query=' + urllib.quote_plus(query)
+
         counter = max_results
+        br = browser(user_agent='calibre/'+__version__)
         with closing(br.open(url, timeout=timeout)) as f:
-            doc = html.fromstring(f.read())
-            for data in doc.xpath('//ol[@class="results"]/li[@class="booklink"]'):
+            doc = etree.fromstring(f.read())
+            for data in doc.xpath('//*[local-name() = "entry"]'):
                 if counter <= 0:
                     break
-
-                id = ''.join(data.xpath('./a/@href'))
-                id = id.split('.mobile')[0]
-
-                title = ''.join(data.xpath('.//span[@class="title"]/text()'))
-                author = ''.join(data.xpath('.//span[@class="subtitle"]/text()'))
-
                 counter -= 1

                 s = SearchResult()
-                s.cover_url = ''
-                s.detail_item = id.strip()
-                s.title = title.strip()
-                s.author = author.strip()
-                s.price = '$0.00'
-                s.drm = SearchResult.DRM_UNLOCKED
+
+                # We could use the <link rel="alternate" type="text/html" ...> tag from the
+                # detail OPDS page but this is easier.
+                id = ''.join(data.xpath('./*[local-name() = "id"]/text()')).strip()
+                s.detail_item = url_slash_cleaner('%s/ebooks/%s' % (self.web_url, re.sub('[^\d]', '', id)))
+                if not s.detail_item:
+                    continue
+
+                s.title = ' '.join(data.xpath('./*[local-name() = "title"]//text()')).strip()
+                s.author = ', '.join(data.xpath('./*[local-name() = "content"]//text()')).strip()
+                if not s.title or not s.author:
+                    continue
+
+                # Get the formats and direct download links.
+                with closing(br.open(id, timeout=timeout/4)) as nf:
+                    ndoc = etree.fromstring(nf.read())
+                    for link in ndoc.xpath('//*[local-name() = "link" and @rel = "http://opds-spec.org/acquisition"]'):
+                        type = link.get('type')
+                        href = link.get('href')
+                        if type:
+                            ext = mimetypes.guess_extension(type)
+                            if ext:
+                                ext = ext[1:].upper().strip()
+                                s.downloads[ext] = href
+                s.formats = ', '.join(s.downloads.keys())
+                if not s.formats:
+                    continue
+
+                for link in data.xpath('./*[local-name() = "link"]'):
+                    rel = link.get('rel')
+                    href = link.get('href')
+                    type = link.get('type')
+                    if rel and href and type:
+                        if rel in ('http://opds-spec.org/thumbnail',
+                                   'http://opds-spec.org/image/thumbnail'):
+                            if href.startswith('data:image/png;base64,'):
+                                s.cover_data = base64.b64decode(href.replace('data:image/png;base64,', ''))

                 yield s
-
-    def get_details(self, search_result, timeout):
-        url = url_slash_cleaner('http://m.gutenberg.org/' + search_result.detail_item)
-
-        br = browser(user_agent=random_user_agent())
-        with closing(br.open(url, timeout=timeout)) as nf:
-            doc = html.fromstring(nf.read())
-
-            for save_item in doc.xpath('//li[contains(@class, "icon_save")]/a'):
-                type = save_item.get('type')
-                href = save_item.get('href')
-
-                if type:
-                    ext = mimetypes.guess_extension(type)
-                    if ext:
-                        ext = ext[1:].upper().strip()
-                        search_result.downloads[ext] = href
-
-            search_result.formats = ', '.join(search_result.downloads.keys())
-
-            return True
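
The namespace-agnostic local-name() XPath used above is what lets the plugin match Atom elements without registering a namespace map. A self-contained illustration (the feed snippet is made up):

    from lxml import etree

    feed = b'''<feed xmlns="http://www.w3.org/2005/Atom">
      <entry>
        <id>http://www.gutenberg.org/ebooks/1342</id>
        <title>Pride and Prejudice</title>
      </entry>
    </feed>'''

    doc = etree.fromstring(feed)
    for entry in doc.xpath('//*[local-name() = "entry"]'):
        # matches <entry> regardless of which namespace it lives in
        print(''.join(entry.xpath('./*[local-name() = "id"]/text()')))

A plain doc.xpath('//entry') would find nothing here, because the elements are in the Atom namespace; local-name() sidesteps that entirely.
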
@@ -0,0 +1,82 @@
+# -*- coding: utf-8 -*-
+
+from __future__ import (division, absolute_import, print_function)
+store_version = 1  # Needed for dynamic plugin loading
+
+__license__ = 'GPL 3'
+__copyright__ = '2013, Tomasz Długosz <tomek3d@gmail.com>'
+__docformat__ = 'restructuredtext en'
+
+import urllib
+from base64 import b64encode
+from contextlib import closing
+
+from lxml import html
+
+from PyQt4.Qt import QUrl
+
+from calibre import browser, url_slash_cleaner
+from calibre.gui2 import open_url
+from calibre.gui2.store import StorePlugin
+from calibre.gui2.store.basic_config import BasicStoreConfig
+from calibre.gui2.store.search_result import SearchResult
+from calibre.gui2.store.web_store_dialog import WebStoreDialog
+
+class KoobeStore(BasicStoreConfig, StorePlugin):
+
+    def open(self, parent=None, detail_item=None, external=False):
+        aff_root = 'https://www.a4b-tracking.com/pl/stat-click-text-link/15/58/'
+        url = 'http://www.koobe.pl/'
+
+        aff_url = aff_root + str(b64encode(url))
+
+        detail_url = None
+        if detail_item:
+            detail_url = aff_root + str(b64encode(detail_item))
+
+        if external or self.config.get('open_external', False):
+            open_url(QUrl(url_slash_cleaner(detail_url if detail_url else aff_url)))
+        else:
+            d = WebStoreDialog(self.gui, url, parent, detail_url if detail_url else aff_url)
+            d.setWindowTitle(self.name)
+            d.set_tags(self.config.get('tags', ''))
+            d.exec_()
+
+    def search(self, query, max_results=10, timeout=60):
+        br = browser()
+        page = 1
+
+        counter = max_results
+        while counter:
+            with closing(br.open('http://www.koobe.pl/s,p,' + str(page) + ',szukaj/fraza:' + urllib.quote(query), timeout=timeout)) as f:
+                doc = html.fromstring(f.read().decode('utf-8'))
+                for data in doc.xpath('//div[@class="seach_result"]/div[@class="result"]'):
+                    if counter <= 0:
+                        break
+
+                    id = ''.join(data.xpath('.//div[@class="cover"]/a/@href'))
+                    if not id:
+                        continue
+
+                    cover_url = ''.join(data.xpath('.//div[@class="cover"]/a/img/@src'))
+                    price = ''.join(data.xpath('.//span[@class="current_price"]/text()'))
+                    title = ''.join(data.xpath('.//h2[@class="title"]/a/text()'))
+                    author = ''.join(data.xpath('.//h3[@class="book_author"]/a/text()'))
+                    formats = ', '.join(data.xpath('.//div[@class="formats"]/div/div/@title'))
+
+                    counter -= 1
+
+                    s = SearchResult()
+                    s.cover_url = 'http://koobe.pl/' + cover_url
+                    s.title = title.strip()
+                    s.author = author.strip()
+                    s.price = price
+                    s.detail_item = 'http://koobe.pl' + id[1:]
+                    s.formats = formats.upper()
+                    s.drm = SearchResult.DRM_UNLOCKED
+
+                    yield s
+
+            if not doc.xpath('//div[@class="site_bottom"]//a[@class="right"]'):
+                break
+            page += 1
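
The store links above are wrapped in an affiliate redirector: the destination URL is base64-encoded and appended to a fixed tracking root. A standalone illustration (Python 2, values copied from the plugin above):

    from base64 import b64encode

    aff_root = 'https://www.a4b-tracking.com/pl/stat-click-text-link/15/58/'
    url = 'http://www.koobe.pl/'
    # the tracking service decodes the trailing base64 to find the real target
    print(aff_root + str(b64encode(url)))
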
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-

 from __future__ import (unicode_literals, division, absolute_import, print_function)
-store_version = 2  # Needed for dynamic plugin loading
+store_version = 3  # Needed for dynamic plugin loading

 __license__ = 'GPL 3'
 __copyright__ = '2011-2013, Tomasz Długosz <tomek3d@gmail.com>'
@@ -67,7 +67,7 @@ class NextoStore(BasicStoreConfig, StorePlugin):

                 cover_url = ''.join(data.xpath('.//img[@class="cover"]/@src'))
                 cover_url = re.sub(r'%2F', '/', cover_url)
-                cover_url = re.sub(r'\widthMax=120&heightMax=200', 'widthMax=64&heightMax=64', cover_url)
+                cover_url = re.sub(r'widthMax=120&heightMax=200', 'widthMax=64&heightMax=64', cover_url)
                 title = ''.join(data.xpath('.//a[@class="title"]/text()'))
                 title = re.sub(r' - ebook$', '', title)
                 formats = ', '.join(data.xpath('.//ul[@class="formats_available"]/li//b/text()'))
@@ -82,7 +82,7 @@ class NextoStore(BasicStoreConfig, StorePlugin):
                 counter -= 1

                 s = SearchResult()
-                s.cover_url = 'http://www.nexto.pl' + cover_url
+                s.cover_url = cover_url if cover_url[:4] == 'http' else 'http://www.nexto.pl' + cover_url
                 s.title = title.strip()
                 s.author = author.strip()
                 s.price = price
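
Note the dropped backslash in the first substitution: in a raw string, \w is the word-character regex class rather than a literal 'w', so the old pattern really meant "any word character followed by idthMax=...". It happened to match the intended URLs, but it also over-matched. A quick check:

    import re

    # old pattern: '\w' swallows any word character before 'idthMax'
    print(bool(re.search(r'\widthMax=120', 'XidthMax=120')))  # True, accidental match
    # new pattern: matches only the literal parameter name
    print(bool(re.search(r'widthMax=120', 'XidthMax=120')))   # False
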
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-

 from __future__ import (unicode_literals, division, absolute_import, print_function)
-store_version = 2  # Needed for dynamic plugin loading
+store_version = 3  # Needed for dynamic plugin loading

 __license__ = 'GPL 3'
 __copyright__ = '2011-2013, Tomasz Długosz <tomek3d@gmail.com>'
@@ -41,7 +41,7 @@ class VirtualoStore(BasicStoreConfig, StorePlugin):
         url = 'http://virtualo.pl/?q=' + urllib.quote(query) + '&f=format_id:4,6,3'

         br = browser()
-        no_drm_pattern = re.compile("Znak wodny")
+        no_drm_pattern = re.compile(r'Znak wodny|Brak')

         counter = max_results
         with closing(br.open(url, timeout=timeout)) as f:
@@ -58,8 +58,8 @@ class VirtualoStore(BasicStoreConfig, StorePlugin):
                 cover_url = ''.join(data.xpath('.//div[@class="list_middle_left"]//a//img/@src'))
                 title = ''.join(data.xpath('.//div[@class="list_title list_text_left"]/a/text()'))
                 author = ', '.join(data.xpath('.//div[@class="list_authors list_text_left"]/a/text()'))
-                formats = [form.split('_')[-1].replace('.png', '') for form in data.xpath('.//div[@style="width:55%;float:left;text-align:left;height:18px;"]//a/img/@src')]
-                nodrm = no_drm_pattern.search(''.join(data.xpath('.//div[@style="width:45%;float:right;text-align:right;height:18px;"]/div/div/text()')))
+                formats = [form.split('_')[-1].replace('.png', '') for form in data.xpath('.//div[@style="width:55%;float:left;text-align:left;height:18px;"]//a/span/img/@src')]
+                nodrm = no_drm_pattern.search(''.join(data.xpath('.//div[@style="width:45%;float:right;text-align:right;height:18px;"]//span[@class="prompt_preview"]/text()')))

                 counter -= 1
@@ -70,6 +70,6 @@ class VirtualoStore(BasicStoreConfig, StorePlugin):
                 s.price = price + ' zł'
                 s.detail_item = 'http://virtualo.pl' + id.strip().split('http://')[0]
                 s.formats = ', '.join(formats).upper()
-                s.drm = SearchResult.DRM_UNLOCKED if nodrm else SearchResult.DRM_UNKNOWN
+                s.drm = SearchResult.DRM_UNLOCKED if nodrm else SearchResult.DRM_LOCKED
                 yield s
@@ -1,14 +1,15 @@
 # -*- coding: utf-8 -*-

 from __future__ import (unicode_literals, division, absolute_import, print_function)
-store_version = 1  # Needed for dynamic plugin loading
+store_version = 2  # Needed for dynamic plugin loading

 __license__ = 'GPL 3'
-__copyright__ = '2011-2012, Tomasz Długosz <tomek3d@gmail.com>'
+__copyright__ = '2011-2013, Tomasz Długosz <tomek3d@gmail.com>'
 __docformat__ = 'restructuredtext en'

 import re
 import urllib
+from base64 import b64encode
 from contextlib import closing

 from lxml import html
@@ -25,17 +26,19 @@ from calibre.gui2.store.web_store_dialog import WebStoreDialog
 class WoblinkStore(BasicStoreConfig, StorePlugin):

     def open(self, parent=None, detail_item=None, external=False):
+        aff_root = 'https://www.a4b-tracking.com/pl/stat-click-text-link/16/58/'
         url = 'http://woblink.com/publication'
+        aff_url = aff_root + str(b64encode(url))

         detail_url = None
         if detail_item:
-            detail_url = 'http://woblink.com' + detail_item
+            detail_url = aff_root + str(b64encode('http://woblink.com' + detail_item))

         if external or self.config.get('open_external', False):
-            open_url(QUrl(url_slash_cleaner(detail_url if detail_url else url)))
+            open_url(QUrl(url_slash_cleaner(detail_url if detail_url else aff_url)))
         else:
-            d = WebStoreDialog(self.gui, url, parent, detail_url)
+            d = WebStoreDialog(self.gui, url, parent, detail_url if detail_url else aff_url)
             d.setWindowTitle(self.name)
             d.set_tags(self.config.get('tags', ''))
             d.exec_()
@@ -559,11 +559,11 @@ class TOCView(QWidget):  # {{{
         b.setToolTip(_('Remove all selected entries'))
         b.clicked.connect(self.del_items)

-        self.left_button = b = QToolButton(self)
+        self.right_button = b = QToolButton(self)
         b.setIcon(QIcon(I('forward.png')))
         b.setIconSize(QSize(ICON_SIZE, ICON_SIZE))
         l.addWidget(b, 4, 3)
-        b.setToolTip(_('Unindent the current entry [Ctrl+Left]'))
+        b.setToolTip(_('Indent the current entry [Ctrl+Right]'))
         b.clicked.connect(self.tocw.move_right)

         self.down_button = b = QToolButton(self)
@@ -54,7 +54,7 @@ def get_parser(usage):
 def get_db(dbpath, options):
     global do_notify
     if options.library_path is not None:
-        dbpath = options.library_path
+        dbpath = os.path.expanduser(options.library_path)
     if dbpath is None:
         raise ValueError('No saved library path, either run the GUI or use the'
                 ' --with-library option')
@@ -88,7 +88,7 @@ def do_list(db, fields, afields, sort_by, ascending, search_text, line_width, se
     for f in data:
         fmts = [x for x in f['formats'] if x is not None]
         f['formats'] = u'[%s]'%u','.join(fmts)
-    widths = list(map(lambda x : 0, fields))
+    widths = list(map(lambda x: 0, fields))
     for record in data:
         for f in record.keys():
             if hasattr(record[f], 'isoformat'):
@@ -164,7 +164,8 @@ List the books available in the calibre database.
     parser.add_option('--ascending', default=False, action='store_true',
                       help=_('Sort results in ascending order'))
     parser.add_option('-s', '--search', default=None,
-                      help=_('Filter the results by the search query. For the format of the search query, please see the search related documentation in the User Manual. Default is to do no filtering.'))
+                      help=_('Filter the results by the search query. For the format of the search query,'
+                          ' please see the search related documentation in the User Manual. Default is to do no filtering.'))
     parser.add_option('-w', '--line-width', default=-1, type=int,
                       help=_('The maximum width of a single line in the output. Defaults to detecting screen size.'))
     parser.add_option('--separator', default=' ', help=_('The string used to separate fields. Default is a space.'))
@@ -244,7 +245,8 @@ def do_add(db, paths, one_book_per_directory, recurse, add_duplicates, otitle,
         mi.authors = [_('Unknown')]
     for x in ('title', 'authors', 'isbn', 'tags', 'series'):
         val = locals()['o'+x]
-        if val: setattr(mi, x, val)
+        if val:
+            setattr(mi, x, val)
     if oseries:
         mi.series_index = oseries_index
     if ocover:
@@ -425,18 +427,26 @@ def command_remove(args, dbpath):
     return 0

-def do_add_format(db, id, fmt, path):
-    db.add_format_with_hooks(id, fmt.upper(), path, index_is_id=True)
-    send_message()
+def do_add_format(db, id, fmt, path, opts):
+    done = db.add_format_with_hooks(id, fmt.upper(), path, index_is_id=True,
+                                    replace=opts.replace)
+    if not done and not opts.replace:
+        prints(_('A %s file already exists for book: %d, not replacing')%(fmt.upper(), id))
+    else:
+        send_message()

 def add_format_option_parser():
-    return get_parser(_(
+    parser = get_parser(_(
 '''\
 %prog add_format [options] id ebook_file

 Add the ebook in ebook_file to the available formats for the logical book identified \
-by id. You can get id by using the list command. If the format already exists, it is replaced.
+by id. You can get id by using the list command. If the format already exists, \
+it is replaced, unless the do not replace option is specified.\
 '''))
+    parser.add_option('--dont-replace', dest='replace', default=True, action='store_false',
+                      help=_('Do not replace the format if it already exists'))
+    return parser

 def command_add_format(args, dbpath):
@@ -451,7 +461,7 @@ def command_add_format(args, dbpath):
     id, path, fmt = int(args[1]), args[2], os.path.splitext(args[2])[-1]
     if not fmt:
         print _('ebook file must have an extension')
-    do_add_format(get_db(dbpath, opts), id, fmt[1:], path)
+    do_add_format(get_db(dbpath, opts), id, fmt[1:], path, opts)
     return 0

 def do_remove_format(db, id, fmt):
@@ -791,7 +801,7 @@ def catalog_option_parser(args):
     if not file_extension in available_catalog_formats():
         print_help(parser, log)
         log.error("No catalog plugin available for extension '%s'.\n" % file_extension +
-                  "Catalog plugins available for %s\n" % ', '.join(available_catalog_formats()) )
+                  "Catalog plugins available for %s\n" % ', '.join(available_catalog_formats()))
         raise SystemExit(1)

     return output, file_extension
@@ -1214,7 +1224,8 @@ def command_restore_database(args, dbpath):
         dbpath = dbpath.decode(preferred_encoding)

     class Progress(object):
-        def __init__(self): self.total = 1
+        def __init__(self):
+            self.total = 1

         def __call__(self, msg, step):
             if msg is None:
@@ -1308,7 +1319,7 @@ def command_list_categories(args, dbpath):
     from calibre.utils.terminal import geometry, ColoredStream

     separator = ' '
-    widths = list(map(lambda x : 0, fields))
+    widths = list(map(lambda x: 0, fields))
     for i in data:
         for j, field in enumerate(fields):
             widths[j] = max(widths[j], max(len(field), len(unicode(i[field]))))
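
With the new flag, adding a format no longer has to clobber an existing one. Typical invocations (the book id and path are examples):

    # replaces any existing EPUB for book 123 (old behaviour, still the default)
    calibredb add_format 123 /tmp/book.epub

    # keeps the existing EPUB and prints a notice instead
    calibredb add_format --dont-replace 123 /tmp/book.epub

--dont-replace stores False into opts.replace, which do_add_format passes straight through to add_format_with_hooks.
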
@@ -205,7 +205,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         return row[loc]

     def initialize_dynamic(self):
-        self.field_metadata = FieldMetadata() #Ensure we start with a clean copy
+        self.field_metadata = FieldMetadata()  # Ensure we start with a clean copy
         self.prefs = DBPrefs(self)
         defs = self.prefs.defaults
         defs['gui_restriction'] = defs['cs_restriction'] = ''
@@ -352,7 +352,6 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         '''.format(_('News')))
         self.conn.commit()

         CustomColumns.__init__(self)
-
         template = '''\
                 (SELECT {query} FROM books_{table}_link AS link INNER JOIN
@@ -444,7 +443,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         # Assumption is that someone else will fix them if they change.
         self.field_metadata.remove_dynamic_categories()
         for user_cat in sorted(self.prefs.get('user_categories', {}).keys(), key=sort_key):
-            cat_name = '@' + user_cat # add the '@' to avoid name collision
+            cat_name = '@' + user_cat  # add the '@' to avoid name collision
             self.field_metadata.add_user_category(label=cat_name, name=user_cat)

         # add grouped search term user categories
@@ -596,7 +595,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         current title and author. If there was a previous directory, its contents
         are copied and it is deleted.
         '''
         id = index if index_is_id else self.id(index)
         path = self.construct_path_name(id)
         current_path = self.path(id, index_is_id=True).replace(os.sep, '/')
         formats = self.formats(id, index_is_id=True)
@@ -620,7 +619,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         if not os.path.exists(tpath):
             os.makedirs(tpath)

-        if source_ok: # Migrate existing files
+        if source_ok:  # Migrate existing files
             self.copy_cover_to(id, os.path.join(tpath, 'cover.jpg'),
                     index_is_id=True, windows_atomic_move=wam,
                     use_hardlink=True)
@@ -668,7 +667,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                     os.rename(os.path.join(curpath, oldseg),
                             os.path.join(curpath, newseg))
                 except:
-                    break # Fail silently since nothing catastrophic has happened
+                    break  # Fail silently since nothing catastrophic has happened
                 curpath = os.path.join(curpath, newseg)

     def add_listener(self, listener):
@@ -727,7 +726,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         return ret

     def cover_last_modified(self, index, index_is_id=False):
         id = index if index_is_id else self.id(index)
         path = os.path.join(self.library_path, self.path(id, index_is_id=True), 'cover.jpg')
         try:
             return utcfromtimestamp(os.stat(path).st_mtime)
@@ -1074,8 +1073,8 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         identical_book_ids = set([])
         if mi.authors:
             try:
-                quathors = mi.authors[:10] # Too many authors causes parsing of
-                                           # the search expression to fail
+                quathors = mi.authors[:10]  # Too many authors causes parsing of
+                                            # the search expression to fail
                 query = u' and '.join([u'author:"=%s"'%(a.replace('"', '')) for a in
                     quathors])
                 qauthors = mi.authors[10:]
@@ -1307,7 +1306,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
             return fmt_path
         try:
             candidates = glob.glob(os.path.join(path, '*'+format))
-        except: # If path contains strange characters this throws an exc
+        except:  # If path contains strange characters this throws an exc
             candidates = []
         if format and candidates and os.path.exists(candidates[0]):
             try:
@@ -1350,7 +1349,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                     if path != dest:
                         os.rename(path, dest)
                 except:
-                    pass # Nothing too catastrophic happened, the cases mismatch, that's all
+                    pass  # Nothing too catastrophic happened, the cases mismatch, that's all
             else:
                 windows_atomic_move.copy_path_to(path, dest)
         else:
@@ -1366,7 +1365,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                 try:
                     os.rename(path, dest)
                 except:
-                    pass # Nothing too catastrophic happened, the cases mismatch, that's all
+                    pass  # Nothing too catastrophic happened, the cases mismatch, that's all
             else:
                 if use_hardlink:
                     try:
@@ -1476,12 +1475,12 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         return ret

     def add_format_with_hooks(self, index, format, fpath, index_is_id=False,
-                              path=None, notify=True):
+                              path=None, notify=True, replace=True):
         npath = self.run_import_plugins(fpath, format)
         format = os.path.splitext(npath)[-1].lower().replace('.', '').upper()
         stream = lopen(npath, 'rb')
         format = check_ebook_format(stream, format)
-        retval = self.add_format(index, format, stream,
+        retval = self.add_format(index, format, stream, replace=replace,
                                  index_is_id=index_is_id, path=path, notify=notify)
         run_plugins_on_postimport(self, id, format)
         return retval
@@ -1489,7 +1488,8 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
     def add_format(self, index, format, stream, index_is_id=False, path=None,
                    notify=True, replace=True, copy_function=None):
         id = index if index_is_id else self.id(index)
-        if not format: format = ''
+        if not format:
+            format = ''
         self.format_metadata_cache[id].pop(format.upper(), None)
         name = self.format_filename_cache[id].get(format.upper(), None)
         if path is None:
@@ -1541,6 +1541,14 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
             opath = self.format_abspath(book_id, nfmt, index_is_id=True)
             return fmt if opath is None else nfmt

+    def restore_original_format(self, book_id, original_fmt, notify=True):
+        opath = self.format_abspath(book_id, original_fmt, index_is_id=True)
+        if opath is not None:
+            fmt = original_fmt.partition('_')[2]
+            with lopen(opath, 'rb') as f:
+                self.add_format(book_id, fmt, f, index_is_id=True, notify=False)
+            self.remove_format(book_id, original_fmt, index_is_id=True, notify=notify)
+
     def delete_book(self, id, notify=True, commit=True, permanent=False,
                     do_clean=True):
         '''
@@ -1568,7 +1576,8 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
     def remove_format(self, index, format, index_is_id=False, notify=True,
                       commit=True, db_only=False):
         id = index if index_is_id else self.id(index)
-        if not format: format = ''
+        if not format:
+            format = ''
         self.format_metadata_cache[id].pop(format.upper(), None)
         name = self.format_filename_cache[id].get(format.upper(), None)
         if name:
@@ -1737,12 +1746,12 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                 # Get the ids for the item values
                 if not cat['is_custom']:
                     funcs = {
-                        'authors'  : self.get_authors_with_ids,
-                        'series'   : self.get_series_with_ids,
+                        'authors': self.get_authors_with_ids,
+                        'series': self.get_series_with_ids,
                         'publisher': self.get_publishers_with_ids,
-                        'tags'     : self.get_tags_with_ids,
+                        'tags': self.get_tags_with_ids,
                         'languages': self.get_languages_with_ids,
-                        'rating'   : self.get_ratings_with_ids,
+                        'rating': self.get_ratings_with_ids,
                     }
                     func = funcs.get(category, None)
                     if func:
@@ -1825,7 +1834,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                     item.rc += 1
                     continue
                 try:
-                    (item_id, sort_val) = tid_cat[val] # let exceptions fly
+                    (item_id, sort_val) = tid_cat[val]  # let exceptions fly
                     item = tcats_cat.get(val, None)
                     if not item:
                         item = tag_class(val, sort_val)
@@ -1847,7 +1856,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                         tid_cat[val] = (val, val)
                 for val in vals:
                     try:
-                        (item_id, sort_val) = tid_cat[val] # let exceptions fly
+                        (item_id, sort_val) = tid_cat[val]  # let exceptions fly
                         item = tcats_cat.get(val, None)
                         if not item:
                             item = tag_class(val, sort_val)
@@ -1915,7 +1924,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
             # in the main Tag loop. Saves a few %
             if datatype == 'rating':
                 formatter = (lambda x:u'\u2605'*int(x/2))
-                avgr = lambda x : x.n
+                avgr = lambda x: x.n
                 # eliminate the zero ratings line as well as count == 0
                 items = [v for v in tcategories[category].values() if v.c > 0 and v.n != 0]
             elif category == 'authors':
@@ -1932,7 +1941,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
             # sort the list
             if sort == 'name':
-                kf = lambda x :sort_key(x.s)
+                kf = lambda x:sort_key(x.s)
                 reverse=False
             elif sort == 'popularity':
                 kf = lambda x: x.c
@@ -1997,9 +2006,9 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         if sort == 'popularity':
             categories['formats'].sort(key=lambda x: x.count, reverse=True)
-        else: # no ratings exist to sort on
+        else:  # no ratings exist to sort on
             # No need for ICU here.
-            categories['formats'].sort(key = lambda x:x.name)
+            categories['formats'].sort(key=lambda x:x.name)

         # Now do identifiers. This works like formats
         categories['identifiers'] = []
@@ -2026,9 +2035,9 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         if sort == 'popularity':
             categories['identifiers'].sort(key=lambda x: x.count, reverse=True)
-        else: # no ratings exist to sort on
+        else:  # no ratings exist to sort on
             # No need for ICU here.
-            categories['identifiers'].sort(key = lambda x:x.name)
+            categories['identifiers'].sort(key=lambda x:x.name)

         #### Now do the user-defined categories. ####
         user_categories = dict.copy(self.clean_user_categories())
@@ -2075,7 +2084,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                     else:
                         items.append(taglist[label][n])
                 # else: do nothing, to not include nodes w zero counts
-            cat_name = '@' + user_cat # add the '@' to avoid name collision
+            cat_name = '@' + user_cat  # add the '@' to avoid name collision
             # Not a problem if we accumulate entries in the icon map
             if icon_map is not None:
                 icon_map[cat_name] = icon_map['user:']
@@ -2323,11 +2332,10 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         elif mi_idents:
             identifiers = self.get_identifiers(id, index_is_id=True)
             for key, val in mi_idents.iteritems():
-                if val and val.strip(): # Don't delete an existing identifier
+                if val and val.strip():  # Don't delete an existing identifier
                     identifiers[icu_lower(key)] = val
             self.set_identifiers(id, identifiers, notify=False, commit=False)
-
         user_mi = mi.get_all_user_metadata(make_copy=False)
         for key in user_mi.iterkeys():
             if key in self.field_metadata and \
@@ -2447,7 +2455,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                 try:
                     self.conn.execute('''INSERT INTO books_authors_link(book, author)
                                          VALUES (?,?)''', (id, aid))
-                except IntegrityError: # Sometimes books specify the same author twice in their metadata
+                except IntegrityError:  # Sometimes books specify the same author twice in their metadata
                     pass
         if case_change:
             bks = self.conn.get('''SELECT book FROM books_authors_link
@@ -2606,7 +2614,6 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         if notify:
             self.notify('metadata', [id])
-
     def set_publisher(self, id, publisher, notify=True, commit=True,
                       allow_case_change=False):
         self.conn.execute('DELETE FROM books_publishers_link WHERE book=?',(id,))
@@ -2812,7 +2819,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
             if new_id is None or old_id == new_id:
                 new_id = old_id
                 # New name doesn't exist. Simply change the old name
-                self.conn.execute('UPDATE publishers SET name=? WHERE id=?', \
+                self.conn.execute('UPDATE publishers SET name=? WHERE id=?',
                                   (new_name, old_id))
             else:
                 # Change the link table to point at the new one
@@ -2852,7 +2859,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         self.conn.commit()

     def set_sort_field_for_author(self, old_id, new_sort, commit=True, notify=False):
-        self.conn.execute('UPDATE authors SET sort=? WHERE id=?', \
+        self.conn.execute('UPDATE authors SET sort=? WHERE id=?',
                           (new_sort.strip(), old_id))
         if commit:
             self.conn.commit()
@@ -2951,7 +2958,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
     @classmethod
     def cleanup_tags(cls, tags):
         tags = [x.strip().replace(',', ';') for x in tags if x.strip()]
-        tags = [x.decode(preferred_encoding, 'replace') \
+        tags = [x.decode(preferred_encoding, 'replace')
                 if isbytestring(x) else x for x in tags]
         tags = [u' '.join(x.split()) for x in tags]
         ans, seen = [], set([])
@@ -3352,10 +3359,9 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
             self.add_format(db_id, format, stream, index_is_id=True)
         self.conn.commit()
-        self.data.refresh_ids(self, [db_id]) # Needed to update format list and size
-
+        self.data.refresh_ids(self, [db_id])  # Needed to update format list and size
         return db_id

     def add_news(self, path, arg):
         from calibre.ebooks.metadata.meta import get_metadata
@@ -3391,7 +3397,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         if not hasattr(path, 'read'):
             stream.close()
         self.conn.commit()
-        self.data.refresh_ids(self, [id]) # Needed to update format list and size
+        self.data.refresh_ids(self, [id])  # Needed to update format list and size
         return id

     def run_import_plugins(self, path_or_stream, format):
@@ -3455,7 +3461,6 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                 traceback.print_exc()
         return id
-
     def add_books(self, paths, formats, metadata, add_duplicates=True,
                   return_ids=False):
         '''
@@ -3499,7 +3504,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                 stream.close()
             postimport.append((id, format))
         self.conn.commit()
-        self.data.refresh_ids(self, ids) # Needed to update format list and size
+        self.data.refresh_ids(self, ids)  # Needed to update format list and size
         for book_id, fmt in postimport:
             run_plugins_on_postimport(self, book_id, fmt)
         if duplicates:
@@ -3549,7 +3554,7 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         # set_metadata, but probably isn't good enough
        self.dirtied([id], commit=False)
         self.conn.commit()
-        self.data.refresh_ids(self, [id]) # Needed to update format list and size
+        self.data.refresh_ids(self, [id])  # Needed to update format list and size
         if notify:
             self.notify('add', [id])
         return id
@@ -3643,7 +3648,8 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
                 FIELDS.add('%d_index'%x)
         data = []
         for record in self.data:
-            if record is None: continue
+            if record is None:
+                continue
            db_id = record[self.FIELD_MAP['id']]
             if ids is not None and db_id not in ids:
                 continue
@@ -3686,8 +3692,8 @@ class LibraryDatabase2(LibraryDatabase, SchemaUpgrade, CustomColumns):
         progress.setValue(0)
         progress.setLabelText(header)
         QCoreApplication.processEvents()
-        db.conn.row_factory = lambda cursor, row : tuple(row)
-        db.conn.text_factory = lambda x : unicode(x, 'utf-8', 'replace')
+        db.conn.row_factory = lambda cursor, row: tuple(row)
+        db.conn.text_factory = lambda x: unicode(x, 'utf-8', 'replace')
         books = db.conn.get('SELECT id, title, sort, timestamp, series_index, author_sort, isbn FROM books ORDER BY id ASC')
         progress.setAutoReset(False)
         progress.setRange(0, len(books))
@@ -3763,7 +3769,7 @@ books_series_link feeds
                 continue
             key = os.path.splitext(path)[0]
-            if not books.has_key(key):
+            if key not in books:
                 books[key] = []
             books[key].append(path)
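
A sketch of how the new restore_original_format method would be called; db is assumed to be an open LibraryDatabase2 instance and the book id is an example:

    # 'ORIGINAL_EPUB'.partition('_')[2] == 'EPUB', so the saved copy is
    # added back as the book's EPUB (replacing the current one, since
    # add_format defaults to replace=True) and the ORIGINAL_EPUB entry
    # is then removed.
    db.restore_original_format(book_id, 'ORIGINAL_EPUB')
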
@@ -24,7 +24,7 @@ def stop_threaded_server(server):
     server.exit()
     server.thread = None

-def create_wsgi_app(path_to_library=None, prefix=''):
+def create_wsgi_app(path_to_library=None, prefix='', virtual_library=None):
     'WSGI entry point'
     from calibre.library import db
     cherrypy.config.update({'environment': 'embedded'})
@@ -32,6 +32,7 @@ def create_wsgi_app(path_to_library=None, prefix=''):
     parser = option_parser()
     opts, args = parser.parse_args(['calibre-server'])
     opts.url_prefix = prefix
+    opts.restriction = virtual_library
     server = LibraryServer(db, opts, wsgi=True, show_tracebacks=True)
     return cherrypy.Application(server, script_name=None, config=server.config)
@@ -97,7 +98,6 @@ def daemonize(stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'):
     os.dup2(se.fileno(), sys.stderr.fileno())

 def main(args=sys.argv):
-    from calibre.library.database2 import LibraryDatabase2
     parser = option_parser()
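
With the new virtual_library parameter, a WSGI host can serve a restricted view of the library. A minimal embedding sketch; the import path and the restriction value are assumptions for illustration, not taken from this diff:

    # e.g. in a wsgi.py consumed by mod_wsgi or a similar host
    from calibre.library.server.main import create_wsgi_app  # import path assumed

    application = create_wsgi_app(
        path_to_library='/srv/calibre-library',  # hypothetical library location
        prefix='/books',
        virtual_library='my-saved-search',       # name of a restriction (assumed)
    )

Internally the value simply lands in opts.restriction before the LibraryServer is constructed.
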
@@ -33,7 +33,7 @@ entry_points = {
                 'fetch-ebook-metadata = calibre.ebooks.metadata.sources.cli:main',
                 'calibre-smtp = calibre.utils.smtp:main',
         ],
-      'gui_scripts'   : [
+        'gui_scripts' : [
             __appname__+' = calibre.gui2.main:main',
             'lrfviewer = calibre.gui2.lrf_renderer.main:main',
             'ebook-viewer = calibre.gui2.viewer.main:main',