GRiker 2012-02-10 10:08:06 -07:00
commit 8fa9dae053
116 changed files with 131000 additions and 100793 deletions


@ -19,6 +19,60 @@
# new recipes:
# - title:
- version: 0.8.39
date: 2012-02-10
new features:
- title: "Auto-adding: Add an option to check for duplicates when auto adding."
tickets: [926962]
- title: "Content server: Export a second record via mDNS that points to the full OPDS feed in addition to the one pointing to the Stanza feed. The new record is of type _calibre._tcp."
tickets: [929304]
- title: "Allow specifying a set of categories that are not partitioned even if they contain a large number of items in the Tag Browser. Preference is available under Look & Feel->Tag Browser"
- title: "Allow setting a URL prefix for the content server that runs embedded in the calibre GUI as well."
tickets: [928905]
- title: "Allow output of identifiers data in CSV/XML/BibTeX catalogs"
tickets: [927737]
- title: "Driver for Motorola Droid XT910, Nokia E71 and HTC EVO 3D."
tickets: [928202, 927818, 929400]
- title: "Cut down the time taken to launch worker processes by 40%"
- title: "You can now configure the calibre settings for the currently connected device by right clicking on the device icon in the toolbar, instead of having to go through Preferences->Plugins"
bug fixes:
- title: "Auto-adding: Do not add incomplete files when files are downloaded directly into the auto add folder."
tickets: [926578]
- title: "When running multiple delete from device jobs, fix the device view sometimes marking the wrong books as being deleted, after the first delete job completes."
tickets: [927972]
- title: "MOBI Input: Handle files that have spurious closing </body> and/or </html> tags in their markup."
tickets: [925833]
- title: "RTF Input: Strip out false color specifications, as they cause artifacts when converted to MOBI"
improved recipes:
- Updated Postmedia publications
- Foreign Affairs
- Read It Later
- Microwave Journal
- tagesschau.de
new recipes:
- title: Vancouver Province and Windsor Star
author: Nick Redding
- title: Onda Rock
author: faber1971
- title: Il Manifesto
author: Giacomo Lacava
- version: 0.8.38
date: 2012-02-03


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,45 +7,76 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Calgary Herald
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following four lines for the Calgary Herald
title = u'Calgary Herald'
url_prefix = 'http://www.calgaryherald.com'
description = u'News from Calgary, AB'
fp_tag = 'CAN_CH'
# un-comment the following three lines for the Regina Leader-Post
#title = u'Regina Leader-Post'
#url_prefix = 'http://www.leaderpost.com'
#description = u'News from Regina, SK'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following three lines for the Saskatoon Star-Phoenix
#title = u'Saskatoon Star-Phoenix'
#url_prefix = 'http://www.thestarphoenix.com'
#description = u'News from Saskatoon, SK'
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following three lines for the Windsor Star
#title = u'Windsor Star'
#url_prefix = 'http://www.windsorstar.com'
#description = u'News from Windsor, ON'
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following three lines for the Ottawa Citizen
#title = u'Ottawa Citizen'
#url_prefix = 'http://www.ottawacitizen.com'
#description = u'News from Ottawa, ON'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following three lines for the Montreal Gazette
#title = u'Montreal Gazette'
#url_prefix = 'http://www.montrealgazette.com'
#description = u'News from Montreal, QC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
__author__ = 'Nick Redding'
encoding = 'latin1'
no_stylesheets = True
timefmt = ' [%b %d]'
extra_css = '''
@ -64,14 +96,80 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
# delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","’",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&#38;'
massaged = re.sub("&","&#38;", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')
@ -98,9 +196,7 @@ class CanWestPaper(BasicNewsRecipe):
atag = h1tag.find('a',href=True)
if not atag:
continue
url = atag['href']
if not url.startswith('http:'):
url = self.url_prefix+'/news/todays-paper/'+atag['href']
url = self.url_prefix+'/news/todays-paper/'+atag['href']
#self.log("Section %s" % key)
#self.log("url %s" % url)
title = self.tag_to_string(atag,False)
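
Note on the get_cover_url hunk above: it tries today's front-page image and then steps back one day at a time, giving up after a week. The URL construction can be separated from the network calls, which makes the day-of-month arithmetic easy to check. The following is a minimal sketch in Python 3, not the recipe's code; the helper name candidate_cover_urls and its parameters are hypothetical, and only the base URL and the fp_tag convention come from the diff.

```python
from datetime import date, timedelta

# Base URL copied from the recipe; everything else here is illustrative.
NEWSEUM_BASE = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'


def candidate_cover_urls(fp_tag, today, max_days_back=7):
    """Return front-page image URLs for `today` and the preceding days, newest first."""
    if not fp_tag:
        # Papers with an empty front-page tag have no cover to look up.
        return []
    urls = []
    for days_back in range(max_days_back):
        # The server path is keyed on the day of the month only.
        day = (today - timedelta(days=days_back)).day
        urls.append('%s%d/lg/%s.jpg' % (NEWSEUM_BASE, day, fp_tag))
    return urls


# Example: the three newest candidates for the Calgary Herald on the commit date.
urls = candidate_cover_urls('CAN_CH', date(2012, 2, 10), max_days_back=3)
```

A caller would try each URL in order and keep the first one that opens. Because only the day of the month appears in the path, stepping back across a month boundary needs no special handling.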


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,45 +7,72 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Edmonton Journal
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
title = u'Edmonton Journal'
url_prefix = 'http://www.edmontonjournal.com'
description = u'News from Edmonton, AB'
fp_tag = 'CAN_EJ'
# un-comment the following three lines for the Calgary Herald
#title = u'Calgary Herald'
#url_prefix = 'http://www.calgaryherald.com'
#description = u'News from Calgary, AB'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following three lines for the Regina Leader-Post
#title = u'Regina Leader-Post'
#url_prefix = 'http://www.leaderpost.com'
#description = u'News from Regina, SK'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following three lines for the Saskatoon Star-Phoenix
#title = u'Saskatoon Star-Phoenix'
#url_prefix = 'http://www.thestarphoenix.com'
#description = u'News from Saskatoon, SK'
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following three lines for the Windsor Star
#title = u'Windsor Star'
#url_prefix = 'http://www.windsorstar.com'
#description = u'News from Windsor, ON'
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following three lines for the Ottawa Citizen
#title = u'Ottawa Citizen'
#url_prefix = 'http://www.ottawacitizen.com'
#description = u'News from Ottawa, ON'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following three lines for the Montreal Gazette
#title = u'Montreal Gazette'
#url_prefix = 'http://www.montrealgazette.com'
#description = u'News from Montreal, QC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
@ -68,14 +96,80 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
# delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","’",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&#38;'
massaged = re.sub("&","&#38;", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')
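
Note on the fixChars hunk above: its comments name six Windows-1252 punctuation code points (\x91 to \x97) that should not reach the Kindle TOC as raw control characters. A single translation table is one way to express the same idea. This is a Python 3 sketch under the assumption that each code point should become the typographic character named in the corresponding comment; the table and the function name fix_chars are illustrative and not part of the recipe.

```python
# Windows-1252 code points in the C1 range and the punctuation they stand for.
CP1252_PUNCTUATION = {
    0x91: '\u2018',  # left single quotation mark (lsquo)
    0x92: '\u2019',  # right single quotation mark (rsquo)
    0x93: '\u201c',  # left double quotation mark (ldquo)
    0x94: '\u201d',  # right double quotation mark (rdquo)
    0x96: '\u2013',  # en dash (ndash)
    0x97: '\u2014',  # em dash (mdash)
}


def fix_chars(text):
    """Map stray cp1252 punctuation and the literal '&#x2019;' entity to Unicode."""
    return text.translate(CP1252_PUNCTUATION).replace('&#x2019;', '\u2019')


sample = fix_chars('\x93Hi\x94 \x96 it\x92s')
```

Compared with a chain of re.sub calls, str.translate makes one pass over the text and keeps the mapping in one place.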


@ -80,7 +80,6 @@ class ForeignAffairsRecipe(BasicNewsRecipe):
tags = []
for div in content.findAll('div', attrs = {'class': re.compile(r'view-row\s+views-row-[0-9]+\s+views-row-[odd|even].*')}):
tags.append(div)
ul = content.find('ul')
for li in content.findAll('li'):
tags.append(li)
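
Note on the class pattern in this hunk: inside square brackets, `odd|even` is a character class, so `[odd|even]` matches exactly one of the characters o, d, |, e, v or n rather than the words "odd" or "even". The trailing `.*` hides the difference for real class names, which is probably why it goes unnoticed. The sketch below (Python 3) only demonstrates the regular-expression behaviour; the sample class strings are made up, and it makes no claim about what the recipe author intended.

```python
import re

# Pattern as it appears in the recipe: the brackets form a character class.
as_written = re.compile(r'view-row\s+views-row-[0-9]+\s+views-row-[odd|even].*')

# Variant with a non-capturing group, which matches the whole words.
with_group = re.compile(r'view-row\s+views-row-[0-9]+\s+views-row-(?:odd|even).*')

valid = 'view-row views-row-2 views-row-even'
bogus = 'view-row views-row-1 views-row-new'   # hypothetical class string

# Both patterns accept the valid class string.
both_accept_valid = bool(as_written.search(valid)) and bool(with_group.search(valid))

# Only the character-class version accepts the bogus one ('n' is in the class).
class_accepts_bogus = bool(as_written.search(bogus))
group_accepts_bogus = bool(with_group.search(bogus))
```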

Binary file not shown (size before: 1.1 KiB; size after: 289 B).


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,15 +7,72 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Montreal Gazette
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following four lines for the Montreal Gazette
title = u'Montreal Gazette'
url_prefix = 'http://www.montrealgazette.com'
description = u'News from Montreal, QC'
fp_tag = 'CAN_MG'
language = 'en_CA'
@ -38,14 +96,81 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
# delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","’",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&#38;'
massaged = re.sub("&","&#38;", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')

recipes/onda_rock.recipe (new file, 21 lines)

@ -0,0 +1,21 @@
__license__ = 'GPL v3'
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1328535130(BasicNewsRecipe):
title = u'Onda Rock'
__author__ = 'faber1971'
description = 'Italian rock webzine'
language = 'it'
oldest_article = 7
max_articles_per_feed = 100
auto_cleanup = False
remove_tags = [
dict(name='div', attrs={'id':['boxHeader','boxlinks_med','footer','boxinterviste','box_special_med','boxdiscografia_head','path']}),
dict(name='div', attrs={'align':'left'}),
dict(name='div', attrs={'style':'text-align: center'}),
]
no_stylesheets = True
feeds = [(u'Onda Rock', u'http://www.ondarock.it/feed.php')]
masthead_url = 'http://profile.ak.fbcdn.net/hprofile-ak-snc4/71135_45820579767_4993043_n.jpg'


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,20 +7,72 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Ottawa Citizen
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following four lines for the Ottawa Citizen
title = u'Ottawa Citizen'
url_prefix = 'http://www.ottawacitizen.com'
description = u'News from Ottawa, ON'
fp_tag = 'CAN_OC'
# un-comment the following three lines for the Montreal Gazette
#title = u'Montreal Gazette'
#url_prefix = 'http://www.montrealgazette.com'
#description = u'News from Montreal, QC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
@ -43,14 +96,80 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
# delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","’",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&#38;'
massaged = re.sub("&","&#38;", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')


@ -1,18 +1,18 @@
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
__copyright__ = '2008-2010, Darko Miletic <darko.miletic at gmail.com>'
__copyright__ = '2008-2012, Darko Miletic <darko.miletic at gmail.com>'
'''
pescanik.net
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag
class Pescanik(BasicNewsRecipe):
title = 'Pescanik'
title = 'Peščanik'
__author__ = 'Darko Miletic'
description = 'Pescanik'
publisher = 'Pescanik'
description = 'Peščanik je udruženje građana osnovano 2006. godine. Glavni proizvod Peščanika je radio emisija koja je emitovana na Radiju B92 od 02.02.2000. do 16.06.2011, a od septembra 2011. se emituje na osam radio stanica u Srbiji, Crnoj Gori i BiH'
publisher = 'Peščanik'
category = 'news, politics, Serbia'
oldest_article = 10
max_articles_per_feed = 100
@ -20,8 +20,13 @@ class Pescanik(BasicNewsRecipe):
use_embedded_content = False
encoding = 'utf-8'
language = 'sr'
publication_type = 'newsportal'
extra_css = ' @font-face {font-family: "serif1";src:url(res:///opt/sony/ebook/FONT/tt0011m_.ttf)} @font-face {font-family: "sans1";src:url(res:///opt/sony/ebook/FONT/tt0003m_.ttf)} .article_description,body{font-family: Arial,"Lucida Grande",Tahoma,Verdana,sans1,sans-serif} .contentheading{font-size: x-large; font-weight: bold} .small{font-size: small} .createdate{font-size: x-small; font-weight: bold} '
publication_type = 'newsportal'
masthead_url = 'http://pescanik.net/wp-content/uploads/2011/10/logo1.png'
extra_css = """
@font-face {font-family: "sans1";src:url(res:///opt/sony/ebook/FONT/tt0003m_.ttf)}
body{font-family: Verdana,Arial,Tahoma,sans1,sans-serif}
#BlogTitle{font-size: xx-large; font-weight: bold}
"""
conversion_options = {
'comment' : description
@ -32,29 +37,12 @@ class Pescanik(BasicNewsRecipe):
preprocess_regexps = [(re.compile(u'\u0110'), lambda match: u'\u00D0')]
remove_attributes = ['valign','colspan','width','height','align','alt']
remove_tags = [dict(name=['object','link','meta','script'])]
keep_only_tags = [
dict(attrs={'class':['contentheading','small','createdate']})
,dict(name='td', attrs={'valign':'top','colspan':'2'})
]
feeds = [(u'Pescanik Online', u'http://www.pescanik.net/index.php?option=com_rd_rss&id=12')]
remove_tags = [dict(name=['object','link','meta','script','iframe','embed'])]
keep_only_tags = [dict(attrs={'id':['BlogTitle','BlogDate','BlogContent']})]
feeds = [
(u'Autori' , u'http://pescanik.net/category/autori/feed/'),
(u'Prevodi', u'http://pescanik.net/category/prevodi/feed/')
]
def print_version(self, url):
nurl = url.replace('/index.php','/index2.php')
return nurl + '&pop=1&page=0'
def preprocess_html(self, soup):
st = soup.findAll('td')
for it in st:
it.name='p'
for pt in soup.findAll('img'):
brtag = Tag(soup,'br')
brtag2 = Tag(soup,'br')
pt.append(brtag)
pt.append(brtag2)
return soup
return url + 'print/'
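
Note on the new print_version above: it derives the printable page by appending 'print/' to the article URL, which relies on feed links ending in a slash. A slightly more defensive variant is sketched below in Python 3. The trailing-slash check is an addition for illustration rather than something the recipe does, and the example URL is hypothetical.

```python
def print_version(url):
    """Return the printable-page URL for an article URL."""
    # Assumption: article links are permalinks that normally end with '/'.
    # Adding the slash when it is missing avoids producing '...articleprint/'.
    if not url.endswith('/'):
        url += '/'
    return url + 'print/'


# Hypothetical article URL, shown with and without the trailing slash.
with_slash = print_version('http://pescanik.net/example-article/')
without_slash = print_version('http://pescanik.net/example-article')
```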


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,35 +7,72 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Regina Leader-Post
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following four lines for the Regina Leader-Post
title = u'Regina Leader-Post'
url_prefix = 'http://www.leaderpost.com'
description = u'News from Regina, SK'
fp_tag = ''
# un-comment the following three lines for the Saskatoon Star-Phoenix
#title = u'Saskatoon Star-Phoenix'
#url_prefix = 'http://www.thestarphoenix.com'
#description = u'News from Saskatoon, SK'
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following three lines for the Windsor Star
#title = u'Windsor Star'
#url_prefix = 'http://www.windsorstar.com'
#description = u'News from Windsor, ON'
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following three lines for the Ottawa Citizen
#title = u'Ottawa Citizen'
#url_prefix = 'http://www.ottawacitizen.com'
#description = u'News from Ottawa, ON'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following three lines for the Montreal Gazette
#title = u'Montreal Gazette'
#url_prefix = 'http://www.montrealgazette.com'
#description = u'News from Montreal, QC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
@ -58,14 +96,80 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
#delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&'
massaged = re.sub("&","&", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')
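
The cover lookup in `get_cover_url` above tries today's Newseum front page and then steps back one day at a time, giving up after a week. A minimal standalone sketch of that fallback follows. It assumes Python 3, and the names `cover_candidates`, `first_available` and `can_open` are hypothetical helpers introduced for illustration, not part of calibre; only the URL pattern is taken from the recipe.

```python
from datetime import date, timedelta

# URL pattern copied from get_cover_url; the helper names are illustrative only.
NEWSEUM_PATTERN = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg%d/lg/%s.jpg'


def cover_candidates(fp_tag, today=None, max_days=7):
    """Return candidate cover URLs, newest first (today, then up to six days back)."""
    if not fp_tag:
        # An empty fp_tag means the paper has no Newseum front page.
        return []
    today = today or date.today()
    return [NEWSEUM_PATTERN % ((today - timedelta(days=n)).day, fp_tag)
            for n in range(max_days)]


def first_available(urls, can_open):
    """Return the first URL for which can_open(url) does not raise, else None."""
    for url in urls:
        try:
            can_open(url)
        except Exception:
            continue
        return url
    return None
```

In a recipe, `can_open` would presumably be the `open` method of the browser object; here it is left as a parameter so the logic can be exercised without network access.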


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,30 +7,72 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Saskatoon Star-Phoenix
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following four lines for the Saskatoon Star-Phoenix
title = u'Saskatoon Star-Phoenix'
url_prefix = 'http://www.thestarphoenix.com'
description = u'News from Saskatoon, SK'
fp_tag = ''
# un-comment the following three lines for the Windsor Star
#title = u'Windsor Star'
#url_prefix = 'http://www.windsorstar.com'
#description = u'News from Windsor, ON'
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following three lines for the Ottawa Citizen
#title = u'Ottawa Citizen'
#url_prefix = 'http://www.ottawacitizen.com'
#description = u'News from Ottawa, ON'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following three lines for the Montreal Gazette
#title = u'Montreal Gazette'
#url_prefix = 'http://www.montrealgazette.com'
#description = u'News from Montreal, QC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
@ -53,14 +96,80 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
#delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&'
massaged = re.sub("&","&", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')
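
According to its comments, `fixChars` above maps Windows-1252 punctuation code points (`\x91` to `\x97`) to their Unicode equivalents, one `re.sub` call at a time. The same mapping can be sketched as a single translation table. This is an illustrative alternative assuming Python 3 `str` input, not the recipe's code; `CP1252_PUNCTUATION` and `fix_chars` are hypothetical names.

```python
# Hypothetical table: C1 code points left over from misread Windows-1252 text,
# mapped to the punctuation they stand for in that encoding.
CP1252_PUNCTUATION = {
    0x91: '\u2018',  # left single quotation mark (lsquo)
    0x92: '\u2019',  # right single quotation mark (rsquo)
    0x93: '\u201c',  # left double quotation mark (ldquo)
    0x94: '\u201d',  # right double quotation mark (rdquo)
    0x96: '\u2013',  # en dash (ndash)
    0x97: '\u2014',  # em dash (mdash)
}


def fix_chars(text):
    """Replace the C1 punctuation code points and the literal '&#x2019;' reference."""
    return text.translate(CP1252_PUNCTUATION).replace('&#x2019;', '\u2019')
```

A table keeps the mapping in one place and makes a single pass over the string, which may be easier to extend than a chain of substitutions.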


@ -1,24 +1,41 @@
from calibre.web.feeds.news import BasicNewsRecipe
## History:
## 1: Base Version
## 2: Added rules for wdr.de, ndr.de, br-online.de
## 3: Added rules for rbb-online.de, boerse.ard.de, sportschau.de
class Tagesschau(BasicNewsRecipe):
title = 'Tagesschau'
description = 'Nachrichten der ARD'
publisher = 'ARD'
language = 'de'
version = 3
__author__ = 'Florian Andreas Pfaff'
oldest_article = 7
__author__ = 'Florian Andreas Pfaff, a.peter'
oldest_article = 7
max_articles_per_feed = 100
no_stylesheets = True
no_stylesheets = True
remove_javascript = True
feeds = [('Tagesschau', 'http://www.tagesschau.de/xml/rss2')]
remove_tags = [
dict(name='div', attrs={'class':['linksZumThema schmal','teaserBox','boxMoreLinks','directLinks','teaserBox boxtext','fPlayer','zitatBox breit flashaudio']}),
dict(name='div',
attrs={'id':['socialBookmarks','seitenanfang']}),
dict(name='ul',
attrs={'class':['directLinks','directLinks weltatlas']}),
dict(name='strong', attrs={'class':['boxTitle inv','inv']})
dict(name='div', attrs={'class':['linksZumThema schmal','teaserBox','boxMoreLinks','directLinks','teaserBox boxtext','fPlayer','zitatBox breit flashaudio','infobox ','footer clearfix','inner recommendations','teaser teaser-08 nachrichten smallstandard','infobox-rechts','infobox-links','csl2','teaserBox metaBlock','articleA archiveDisclaimer']}),
dict(name='div', attrs={'id':['pageFunctions']}), ## wdr.de
dict(name='div', attrs={'class':['chart','footerService','toplink','assetsLeft','assetsFullsize']}), ## boerse.ard.de
dict(name='div', attrs={'class':['ardMehrZumThemaLinks','socialBookmarks','ardContentEnd','ardDisclaimer']}), ## sportschau.de
dict(name='div', attrs={'id':['socialBookmarks','seitenanfang','comment']}),
dict(name='ul', attrs={'class':['directLinks','directLinks weltatlas','iconList','right']}),
dict(name='strong', attrs={'class':['boxTitle inv','inv']}),
dict(name='div', attrs={'class':['moreInfo right','moreInfo']}),
dict(name='span', attrs={'class':['videoLink']}),
dict(name='img', attrs={'class':['zoom float_right']}),
dict(name='a', attrs={'id':['zoom']})
]
keep_only_tags = [dict(name='div', attrs={'id':'centerCol'})]
keep_only_tags = [dict(name='div', attrs={'id':'centerCol'}),
dict(name='div', attrs={'id':['mainColumn','ardContent']}),
dict(name='div', attrs={'class':['narrow clearfix','beitrag','detail_inlay','containerArticle noBorder','span-8']})]
def get_masthead_url(self):
return 'http://intern.tagesschau.de/html/img/image.jpg'


@ -0,0 +1,220 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
'''
www.canada.com
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
title = u'Vancouver Province'
url_prefix = 'http://www.theprovince.com'
description = u'News from Vancouver, BC'
fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
__author__ = 'Nick Redding'
no_stylesheets = True
timefmt = ' [%b %d]'
extra_css = '''
.timestamp { font-size:xx-small; display: block; }
#storyheader { font-size: medium; }
#storyheader h1 { font-size: x-large; }
#storyheader h2 { font-size: large; font-style: italic; }
.byline { font-size:xx-small; }
#photocaption { font-size: small; font-style: italic }
#photocredit { font-size: xx-small; }'''
keep_only_tags = [dict(name='div', attrs={'id':'storyheader'}),dict(name='div', attrs={'id':'storycontent'})]
remove_tags = [{'class':'comments'},
dict(name='div', attrs={'class':'navbar'}),dict(name='div', attrs={'class':'morelinks'}),
dict(name='div', attrs={'class':'viewmore'}),dict(name='li', attrs={'class':'email'}),
dict(name='div', attrs={'class':'story_tool_hr'}),dict(name='div', attrs={'class':'clear'}),
dict(name='div', attrs={'class':'story_tool'}),dict(name='div', attrs={'class':'copyright'}),
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&'
massaged = re.sub("&","&", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')
articles = {}
key = 'News'
ans = ['News']
# Find each instance of class="sectiontitle", class="featurecontent"
for divtag in soup.findAll('div',attrs={'class' : ["section_title02","featurecontent"]}):
#self.log(" div class = %s" % divtag['class'])
if divtag['class'].startswith('section_title'):
# div contains section title
if not divtag.h3:
continue
key = self.tag_to_string(divtag.h3,False)
ans.append(key)
self.log("Section name %s" % key)
continue
# div contains article data
h1tag = divtag.find('h1')
if not h1tag:
continue
atag = h1tag.find('a',href=True)
if not atag:
continue
url = self.url_prefix+'/news/todays-paper/'+atag['href']
#self.log("Section %s" % key)
#self.log("url %s" % url)
title = self.tag_to_string(atag,False)
#self.log("title %s" % title)
pubdate = ''
description = ''
ptag = divtag.find('p');
if ptag:
description = self.tag_to_string(ptag,False)
#self.log("description %s" % description)
author = ''
autag = divtag.find('h4')
if autag:
author = self.tag_to_string(autag,False)
#self.log("author %s" % author)
if not articles.has_key(key):
articles[key] = []
articles[key].append(dict(title=title,url=url,date=pubdate,description=description,author=author,content=''))
ans = [(key, articles[key]) for key in ans if articles.has_key(key)]
return ans
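
`parse_index` above walks the page in order, switches the current section whenever it meets a section-title div, files each article under the current section, and finally returns only the sections that received articles. A minimal sketch of that grouping step, detached from the HTML parsing, is shown below. It assumes Python 3, and `group_articles` and its `(kind, value)` input format are hypothetical, introduced only for illustration.

```python
def group_articles(items, default_section='News'):
    """Group articles by section, keeping first-seen section order.

    items is an iterable of ('section', name) or ('article', dict) pairs
    in page order. Sections with no articles are dropped from the result.
    Unlike the recipe, a repeated section name is listed only once.
    """
    articles = {}
    order = [default_section]
    current = default_section
    for kind, value in items:
        if kind == 'section':
            current = value
            if current not in order:
                order.append(current)
        else:
            articles.setdefault(current, []).append(value)
    return [(name, articles[name]) for name in order if name in articles]
```

The result has the shape `parse_index` returns: a list of `(section title, list of article dicts)` tuples.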


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,50 +7,72 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Vancouver Sun
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
title = u'Vancouver Sun'
url_prefix = 'http://www.vancouversun.com'
description = u'News from Vancouver, BC'
fp_tag = 'CAN_VS'
# un-comment the following three lines for the Edmonton Journal
#title = u'Edmonton Journal'
#url_prefix = 'http://www.edmontonjournal.com'
#description = u'News from Edmonton, AB'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following three lines for the Calgary Herald
#title = u'Calgary Herald'
#url_prefix = 'http://www.calgaryherald.com'
#description = u'News from Calgary, AB'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following three lines for the Regina Leader-Post
#title = u'Regina Leader-Post'
#url_prefix = 'http://www.leaderpost.com'
#description = u'News from Regina, SK'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following three lines for the Saskatoon Star-Phoenix
#title = u'Saskatoon Star-Phoenix'
#url_prefix = 'http://www.thestarphoenix.com'
#description = u'News from Saskatoon, SK'
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following three lines for the Windsor Star
#title = u'Windsor Star'
#url_prefix = 'http://www.windsorstar.com'
#description = u'News from Windsor, ON'
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following three lines for the Ottawa Citizen
#title = u'Ottawa Citizen'
#url_prefix = 'http://www.ottawacitizen.com'
#description = u'News from Ottawa, ON'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following three lines for the Montreal Gazette
#title = u'Montreal Gazette'
#url_prefix = 'http://www.montrealgazette.com'
#description = u'News from Montreal, QC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
@ -73,14 +96,80 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
#delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&'
massaged = re.sub("&","&", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')


@ -1,4 +1,5 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
@ -6,60 +7,72 @@ __license__ = 'GPL v3'
www.canada.com
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following three lines for the Victoria Times Colonist
# un-comment the following four lines for the Victoria Times Colonist
title = u'Victoria Times Colonist'
url_prefix = 'http://www.timescolonist.com'
description = u'News from Victoria, BC'
fp_tag = 'CAN_TC'
# un-comment the following three lines for the Vancouver Province
#title = u'Vancouver Province'
#url_prefix = 'http://www.theprovince.com'
#description = u'News from Vancouver, BC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following three lines for the Vancouver Sun
#title = u'Vancouver Sun'
#url_prefix = 'http://www.vancouversun.com'
#description = u'News from Vancouver, BC'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following three lines for the Edmonton Journal
#title = u'Edmonton Journal'
#url_prefix = 'http://www.edmontonjournal.com'
#description = u'News from Edmonton, AB'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following three lines for the Calgary Herald
#title = u'Calgary Herald'
#url_prefix = 'http://www.calgaryherald.com'
#description = u'News from Calgary, AB'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following three lines for the Regina Leader-Post
#title = u'Regina Leader-Post'
#url_prefix = 'http://www.leaderpost.com'
#description = u'News from Regina, SK'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following three lines for the Saskatoon Star-Phoenix
#title = u'Saskatoon Star-Phoenix'
#url_prefix = 'http://www.thestarphoenix.com'
#description = u'News from Saskatoon, SK'
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following three lines for the Windsor Star
#title = u'Windsor Star'
#url_prefix = 'http://www.windsorstar.com'
#description = u'News from Windsor, ON'
# un-comment the following four lines for the Windsor Star
## title = u'Windsor Star'
## url_prefix = 'http://www.windsorstar.com'
## description = u'News from Windsor, ON'
## fp_tag = 'CAN_'
# un-comment the following three lines for the Ottawa Citizen
#title = u'Ottawa Citizen'
#url_prefix = 'http://www.ottawacitizen.com'
#description = u'News from Ottawa, ON'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following three lines for the Montreal Gazette
#title = u'Montreal Gazette'
#url_prefix = 'http://www.montrealgazette.com'
#description = u'News from Montreal, QC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
@ -83,14 +96,80 @@ class CanWestPaper(BasicNewsRecipe):
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def preprocess_html(self,soup):
#delete empty id attributes--they screw up the TOC for unknown reasons
divtags = soup.findAll('div',attrs={'id':''})
if divtags:
for div in divtags:
del(div['id'])
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&'
massaged = re.sub("&","&", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')

221
recipes/windsor_star.recipe Normal file

@ -0,0 +1,221 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
'''
www.canada.com
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
class CanWestPaper(BasicNewsRecipe):
# un-comment the following four lines for the Victoria Times Colonist
## title = u'Victoria Times Colonist'
## url_prefix = 'http://www.timescolonist.com'
## description = u'News from Victoria, BC'
## fp_tag = 'CAN_TC'
# un-comment the following four lines for the Vancouver Province
## title = u'Vancouver Province'
## url_prefix = 'http://www.theprovince.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VP'
# un-comment the following four lines for the Vancouver Sun
## title = u'Vancouver Sun'
## url_prefix = 'http://www.vancouversun.com'
## description = u'News from Vancouver, BC'
## fp_tag = 'CAN_VS'
# un-comment the following four lines for the Edmonton Journal
## title = u'Edmonton Journal'
## url_prefix = 'http://www.edmontonjournal.com'
## description = u'News from Edmonton, AB'
## fp_tag = 'CAN_EJ'
# un-comment the following four lines for the Calgary Herald
## title = u'Calgary Herald'
## url_prefix = 'http://www.calgaryherald.com'
## description = u'News from Calgary, AB'
## fp_tag = 'CAN_CH'
# un-comment the following four lines for the Regina Leader-Post
## title = u'Regina Leader-Post'
## url_prefix = 'http://www.leaderpost.com'
## description = u'News from Regina, SK'
## fp_tag = ''
# un-comment the following four lines for the Saskatoon Star-Phoenix
## title = u'Saskatoon Star-Phoenix'
## url_prefix = 'http://www.thestarphoenix.com'
## description = u'News from Saskatoon, SK'
## fp_tag = ''
# un-comment the following four lines for the Windsor Star
title = u'Windsor Star'
url_prefix = 'http://www.windsorstar.com'
description = u'News from Windsor, ON'
fp_tag = 'CAN_'
# un-comment the following four lines for the Ottawa Citizen
## title = u'Ottawa Citizen'
## url_prefix = 'http://www.ottawacitizen.com'
## description = u'News from Ottawa, ON'
## fp_tag = 'CAN_OC'
# un-comment the following four lines for the Montreal Gazette
## title = u'Montreal Gazette'
## url_prefix = 'http://www.montrealgazette.com'
## description = u'News from Montreal, QC'
## fp_tag = 'CAN_MG'
language = 'en_CA'
__author__ = 'Nick Redding'
no_stylesheets = True
timefmt = ' [%b %d]'
extra_css = '''
.timestamp { font-size:xx-small; display: block; }
#storyheader { font-size: medium; }
#storyheader h1 { font-size: x-large; }
#storyheader h2 { font-size: large; font-style: italic; }
.byline { font-size:xx-small; }
#photocaption { font-size: small; font-style: italic }
#photocredit { font-size: xx-small; }'''
keep_only_tags = [dict(name='div', attrs={'id':'storyheader'}),dict(name='div', attrs={'id':'storycontent'})]
remove_tags = [{'class':'comments'},
dict(name='div', attrs={'class':'navbar'}),dict(name='div', attrs={'class':'morelinks'}),
dict(name='div', attrs={'class':'viewmore'}),dict(name='li', attrs={'class':'email'}),
dict(name='div', attrs={'class':'story_tool_hr'}),dict(name='div', attrs={'class':'clear'}),
dict(name='div', attrs={'class':'story_tool'}),dict(name='div', attrs={'class':'copyright'}),
dict(name='div', attrs={'class':'rule_grey_solid'}),
dict(name='li', attrs={'class':'print'}),dict(name='li', attrs={'class':'share'}),dict(name='ul', attrs={'class':'bullet'})]
def get_cover_url(self):
from datetime import timedelta, date
if self.fp_tag=='':
return None
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str(date.today().day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
daysback=1
try:
br.open(cover)
except:
while daysback<7:
cover = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'+str((date.today() - timedelta(days=daysback)).day)+'/lg/'+self.fp_tag+'.jpg'
br = BasicNewsRecipe.get_browser()
try:
br.open(cover)
except:
daysback = daysback+1
continue
break
if daysback==7:
self.log("\nCover unavailable")
cover = None
return cover
def fixChars(self,string):
# Replace lsquo (\x91)
fixed = re.sub("\x91","‘",string)
# Replace rsquo (\x92)
fixed = re.sub("\x92","’",fixed)
# Replace ldquo (\x93)
fixed = re.sub("\x93","“",fixed)
# Replace rdquo (\x94)
fixed = re.sub("\x94","”",fixed)
# Replace ndash (\x96)
fixed = re.sub("\x96","–",fixed)
# Replace mdash (\x97)
fixed = re.sub("\x97","—",fixed)
fixed = re.sub("&#x2019;","",fixed)
return fixed
def massageNCXText(self, description):
# Kindle TOC descriptions won't render certain characters
if description:
massaged = unicode(BeautifulStoneSoup(description, convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
# Replace '&' with '&#38;'
massaged = re.sub("&","&#38;", massaged)
return self.fixChars(massaged)
else:
return description
def populate_article_metadata(self, article, soup, first):
if first:
picdiv = soup.find('body').find('img')
if picdiv is not None:
self.add_toc_thumbnail(article,re.sub(r'links\\link\d+\\','',picdiv['src']))
xtitle = article.text_summary.strip()
if len(xtitle) == 0:
desc = soup.find('meta',attrs={'property':'og:description'})
if desc is not None:
article.summary = article.text_summary = desc['content']
def strip_anchors(self,soup):
paras = soup.findAll(True)
for para in paras:
aTags = para.findAll('a')
for a in aTags:
if a.img is None:
a.replaceWith(a.renderContents().decode('cp1252','replace'))
return soup
def preprocess_html(self, soup):
return self.strip_anchors(soup)
def parse_index(self):
soup = self.index_to_soup(self.url_prefix+'/news/todays-paper/index.html')
articles = {}
key = 'News'
ans = ['News']
# Find each div with class="section_title02" or class="featurecontent"
for divtag in soup.findAll('div',attrs={'class' : ["section_title02","featurecontent"]}):
#self.log(" div class = %s" % divtag['class'])
if divtag['class'].startswith('section_title'):
# div contains section title
if not divtag.h3:
continue
key = self.tag_to_string(divtag.h3,False)
ans.append(key)
self.log("Section name %s" % key)
continue
# div contains article data
h1tag = divtag.find('h1')
if not h1tag:
continue
atag = h1tag.find('a',href=True)
if not atag:
continue
url = self.url_prefix+'/news/todays-paper/'+atag['href']
#self.log("Section %s" % key)
#self.log("url %s" % url)
title = self.tag_to_string(atag,False)
#self.log("title %s" % title)
pubdate = ''
description = ''
ptag = divtag.find('p')
if ptag:
description = self.tag_to_string(ptag,False)
#self.log("description %s" % description)
author = ''
autag = divtag.find('h4')
if autag:
author = self.tag_to_string(autag,False)
#self.log("author %s" % author)
if not articles.has_key(key):
articles[key] = []
articles[key].append(dict(title=title,url=url,date=pubdate,description=description,author=author,content=''))
ans = [(key, articles[key]) for key in ans if articles.has_key(key)]
return ans
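
The cover lookup in get_cover_url tries today's day-of-month first and then steps back one day at a time, for seven attempts in total. A minimal sketch of that fallback as a pure function follows, written for Python 3 rather than the Python 2 used by the recipe. The names candidate_cover_urls and first_available are hypothetical, and the is_available callback is an assumption that stands in for the recipe's br.open call.

```python
from datetime import date, timedelta

BASE = 'http://webmedia.newseum.org/newseum-multimedia/dfp/jpg'


def candidate_cover_urls(fp_tag, today, max_days=7):
    """Yield cover URLs for today and the preceding days, newest first."""
    for daysback in range(max_days):
        day = (today - timedelta(days=daysback)).day
        yield '%s%d/lg/%s.jpg' % (BASE, day, fp_tag)


def first_available(fp_tag, today, is_available):
    """Return the first URL accepted by is_available, or None."""
    if not fp_tag:
        # Papers without a front-page tag have no cover to fetch.
        return None
    for url in candidate_cover_urls(fp_tag, today):
        if is_available(url):
            return url
    return None
```

Separating URL generation from the network check keeps the fallback order testable without any HTTP traffic.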


@ -156,9 +156,6 @@ class Develop(Command):
self.warn('Failed to compile mount helper. Auto mounting of',
' devices will not work')
if not isbsd and os.geteuid() != 0:
return self.warn('Must be run as root to compile mount helper. Auto '
'mounting of devices will not work.')
src = os.path.join(self.SRC, 'calibre', 'devices', 'linux_mount_helper.c')
dest = os.path.join(self.staging_bindir, 'calibre-mount-helper')
self.info('Installing mount helper to '+ dest)


@ -8,14 +8,14 @@ msgstr ""
"Project-Id-Version: calibre\n"
"Report-Msgid-Bugs-To: FULL NAME <EMAIL@ADDRESS>\n"
"POT-Creation-Date: 2011-11-25 14:01+0000\n"
"PO-Revision-Date: 2012-01-28 05:12+0000\n"
"PO-Revision-Date: 2012-02-09 02:26+0000\n"
"Last-Translator: Vibhav Pant <vibhavp@gmail.com>\n"
"Language-Team: English (United Kingdom) <en_GB@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Launchpad-Export-Date: 2012-01-29 05:21+0000\n"
"X-Generator: Launchpad (build 14727)\n"
"X-Launchpad-Export-Date: 2012-02-09 05:45+0000\n"
"X-Generator: Launchpad (build 14763)\n"
#. name for aaa
msgid "Ghotuo"
@ -7083,323 +7083,323 @@ msgstr "Ekari"
#. name for eki
msgid "Eki"
msgstr ""
msgstr "Eki"
#. name for ekk
msgid "Estonian; Standard"
msgstr ""
msgstr "Estonian; Standard"
#. name for ekl
msgid "Kol"
msgstr ""
msgstr "Kol"
#. name for ekm
msgid "Elip"
msgstr ""
msgstr "Elip"
#. name for eko
msgid "Koti"
msgstr ""
msgstr "Koti"
#. name for ekp
msgid "Ekpeye"
msgstr ""
msgstr "Ekpeye"
#. name for ekr
msgid "Yace"
msgstr ""
msgstr "Yace"
#. name for eky
msgid "Kayah; Eastern"
msgstr ""
msgstr "Kayah; Eastern"
#. name for ele
msgid "Elepi"
msgstr ""
msgstr "Elepi"
#. name for elh
msgid "El Hugeirat"
msgstr ""
msgstr "El Hugeirat"
#. name for eli
msgid "Nding"
msgstr ""
msgstr "Nding"
#. name for elk
msgid "Elkei"
msgstr ""
msgstr "Elkei"
#. name for ell
msgid "Greek; Modern (1453-)"
msgstr ""
msgstr "Greek; Modern (1453-)"
#. name for elm
msgid "Eleme"
msgstr ""
msgstr "Eleme"
#. name for elo
msgid "El Molo"
msgstr ""
msgstr "El Molo"
#. name for elp
msgid "Elpaputih"
msgstr ""
msgstr "Elpaputih"
#. name for elu
msgid "Elu"
msgstr ""
msgstr "Elu"
#. name for elx
msgid "Elamite"
msgstr ""
msgstr "Elamite"
#. name for ema
msgid "Emai-Iuleha-Ora"
msgstr ""
msgstr "Emai-Iuleha-Ora"
#. name for emb
msgid "Embaloh"
msgstr ""
msgstr "Embaloh"
#. name for eme
msgid "Emerillon"
msgstr ""
msgstr "Emerillon"
#. name for emg
msgid "Meohang; Eastern"
msgstr ""
msgstr "Meohang; Eastern"
#. name for emi
msgid "Mussau-Emira"
msgstr ""
msgstr "Mussau-Emira"
#. name for emk
msgid "Maninkakan; Eastern"
msgstr ""
msgstr "Maninkakan; Eastern"
#. name for emm
msgid "Mamulique"
msgstr ""
msgstr "Mamulique"
#. name for emn
msgid "Eman"
msgstr ""
msgstr "Eman"
#. name for emo
msgid "Emok"
msgstr ""
msgstr "Emok"
#. name for emp
msgid "Emberá; Northern"
msgstr ""
msgstr "Emberá; Northern"
#. name for ems
msgid "Yupik; Pacific Gulf"
msgstr ""
msgstr "Yupik; Pacific Gulf"
#. name for emu
msgid "Muria; Eastern"
msgstr ""
msgstr "Muria; Eastern"
#. name for emw
msgid "Emplawas"
msgstr ""
msgstr "Emplawas"
#. name for emx
msgid "Erromintxela"
msgstr ""
msgstr "Erromintxela"
#. name for emy
msgid "Mayan; Epigraphic"
msgstr ""
msgstr "Mayan; Epigraphic"
#. name for ena
msgid "Apali"
msgstr ""
msgstr "Apali"
#. name for enb
msgid "Markweeta"
msgstr ""
msgstr "Markweeta"
#. name for enc
msgid "En"
msgstr ""
msgstr "En"
#. name for end
msgid "Ende"
msgstr ""
msgstr "Ende"
#. name for enf
msgid "Enets; Forest"
msgstr ""
msgstr "Enets; Forest"
#. name for eng
msgid "English"
msgstr ""
msgstr "English"
#. name for enh
msgid "Enets; Tundra"
msgstr ""
msgstr "Enets; Tundra"
#. name for enm
msgid "English; Middle (1100-1500)"
msgstr ""
msgstr "English; Middle (1100-1500)"
#. name for enn
msgid "Engenni"
msgstr ""
msgstr "Engenni"
#. name for eno
msgid "Enggano"
msgstr ""
msgstr "Enggano"
#. name for enq
msgid "Enga"
msgstr ""
msgstr "Enga"
#. name for enr
msgid "Emumu"
msgstr ""
msgstr "Emumu"
#. name for enu
msgid "Enu"
msgstr ""
msgstr "Enu"
#. name for env
msgid "Enwan (Edu State)"
msgstr ""
msgstr "Enwan (Edu State)"
#. name for enw
msgid "Enwan (Akwa Ibom State)"
msgstr ""
msgstr "Enwan (Akwa Ibom State)"
#. name for eot
msgid "Beti (Côte d'Ivoire)"
msgstr ""
msgstr "Beti (Côte d'Ivoire)"
#. name for epi
msgid "Epie"
msgstr ""
msgstr "Epie"
#. name for epo
msgid "Esperanto"
msgstr ""
msgstr "Esperanto"
#. name for era
msgid "Eravallan"
msgstr ""
msgstr "Eravallan"
#. name for erg
msgid "Sie"
msgstr ""
msgstr "Sie"
#. name for erh
msgid "Eruwa"
msgstr ""
msgstr "Eruwa"
#. name for eri
msgid "Ogea"
msgstr ""
msgstr "Ogea"
#. name for erk
msgid "Efate; South"
msgstr ""
msgstr "Efate; South"
#. name for ero
msgid "Horpa"
msgstr ""
msgstr "Horpa"
#. name for err
msgid "Erre"
msgstr ""
msgstr "Erre"
#. name for ers
msgid "Ersu"
msgstr ""
msgstr "Ersu"
#. name for ert
msgid "Eritai"
msgstr ""
msgstr "Eritai"
#. name for erw
msgid "Erokwanas"
msgstr ""
msgstr "Erokwanas"
#. name for ese
msgid "Ese Ejja"
msgstr ""
msgstr "Ese Ejja"
#. name for esh
msgid "Eshtehardi"
msgstr ""
msgstr "Eshtehardi"
#. name for esi
msgid "Inupiatun; North Alaskan"
msgstr ""
msgstr "Inupiatun; North Alaskan"
#. name for esk
msgid "Inupiatun; Northwest Alaska"
msgstr ""
msgstr "Inupiatun; Northwest Alaska"
#. name for esl
msgid "Egypt Sign Language"
msgstr ""
msgstr "Egypt Sign Language"
#. name for esm
msgid "Esuma"
msgstr ""
msgstr "Esuma"
#. name for esn
msgid "Salvadoran Sign Language"
msgstr ""
msgstr "Salvadoran Sign Language"
#. name for eso
msgid "Estonian Sign Language"
msgstr ""
msgstr "Estonian Sign Language"
#. name for esq
msgid "Esselen"
msgstr ""
msgstr "Esselen"
#. name for ess
msgid "Yupik; Central Siberian"
msgstr ""
msgstr "Yupik; Central Siberian"
#. name for est
msgid "Estonian"
msgstr ""
msgstr "Estonian"
#. name for esu
msgid "Yupik; Central"
msgstr ""
msgstr "Yupik; Central"
#. name for etb
msgid "Etebi"
msgstr ""
msgstr "Etebi"
#. name for etc
msgid "Etchemin"
msgstr ""
msgstr "Etchemin"
#. name for eth
msgid "Ethiopian Sign Language"
msgstr ""
msgstr "Ethiopian Sign Language"
#. name for etn
msgid "Eton (Vanuatu)"
msgstr ""
msgstr "Eton (Vanuatu)"
#. name for eto
msgid "Eton (Cameroon)"
msgstr ""
msgstr "Eton (Cameroon)"
#. name for etr
msgid "Edolo"
msgstr ""
msgstr "Edolo"
#. name for ets
msgid "Yekhee"
msgstr ""
msgstr "Yekhee"
#. name for ett
msgid "Etruscan"


@ -4,7 +4,7 @@ __license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en'
__appname__ = u'calibre'
numeric_version = (0, 8, 38)
numeric_version = (0, 8, 39)
__version__ = u'.'.join(map(unicode, numeric_version))
__author__ = u"Kovid Goyal <kovid@kovidgoyal.net>"


@ -449,7 +449,7 @@ class CatalogPlugin(Plugin): # {{{
['author_sort','authors','comments','cover','formats',
'id','isbn','ondevice','pubdate','publisher','rating',
'series_index','series','size','tags','timestamp',
'title_sort','title','uuid','languages'])
'title_sort','title','uuid','languages','identifiers'])
all_custom_fields = set(db.custom_field_keys())
for field in list(all_custom_fields):
fm = db.field_metadata[field]


@ -38,6 +38,7 @@ class ANDROID(USBMS):
0xca4 : [0x100, 0x0227, 0x0226, 0x222],
0xca9 : [0x100, 0x0227, 0x0226, 0x222],
0xcac : [0x100, 0x0227, 0x0226, 0x222],
0xccf : [0x100, 0x0227, 0x0226, 0x222],
0x2910 : [0x222],
},
@ -52,6 +53,7 @@ class ANDROID(USBMS):
0x70c6 : [0x226],
0x4316 : [0x216],
0x42d6 : [0x216],
0x42d7 : [0x216],
},
# Freescale
0x15a2 : {
@ -99,6 +101,7 @@ class ANDROID(USBMS):
0xc001 : [0x0226],
0xc004 : [0x0226],
0x8801 : [0x0226, 0x0227],
0xe115 : [0x0216], # PocketBook A10
},
# Acer
@ -163,7 +166,8 @@ class ANDROID(USBMS):
'GT-I5700', 'SAMSUNG', 'DELL', 'LINUX', 'GOOGLE', 'ARCHOS',
'TELECHIP', 'HUAWEI', 'T-MOBILE', 'SEMC', 'LGE', 'NVIDIA',
'GENERIC-', 'ZTE', 'MID', 'QUALCOMM', 'PANDIGIT', 'HYSTON',
'VIZIO', 'GOOGLE', 'FREESCAL', 'KOBO_INC', 'LENOVO', 'ROCKCHIP']
'VIZIO', 'GOOGLE', 'FREESCAL', 'KOBO_INC', 'LENOVO', 'ROCKCHIP',
'POCKET']
WINDOWS_MAIN_MEM = ['ANDROID_PHONE', 'A855', 'A853', 'INC.NEXUS_ONE',
'__UMS_COMPOSITE', '_MB200', 'MASS_STORAGE', '_-_CARD', 'SGH-I897',
'GT-I9000', 'FILE-STOR_GADGET', 'SGH-T959', 'SAMSUNG_ANDROID',
@ -176,13 +180,14 @@ class ANDROID(USBMS):
'GT-S5830_CARD', 'GT-S5570_CARD', 'MB870', 'MID7015A',
'ALPANDIGITAL', 'ANDROID_MID', 'VTAB1008', 'EMX51_BBG_ANDROI',
'UMS', '.K080', 'P990', 'LTE', 'MB853', 'GT-S5660_CARD', 'A107',
'GT-I9003_CARD', 'XT912', 'FILE-CD_GADGET', 'RK29_SDK', 'MB855']
'GT-I9003_CARD', 'XT912', 'FILE-CD_GADGET', 'RK29_SDK', 'MB855',
'XT910', 'BOOK_A10']
WINDOWS_CARD_A_MEM = ['ANDROID_PHONE', 'GT-I9000_CARD', 'SGH-I897',
'FILE-STOR_GADGET', 'SGH-T959', 'SAMSUNG_ANDROID', 'GT-P1000_CARD',
'A70S', 'A101IT', '7', 'INCREDIBLE', 'A7EB', 'SGH-T849_CARD',
'__UMS_COMPOSITE', 'SGH-I997_CARD', 'MB870', 'ALPANDIGITAL',
'ANDROID_MID', 'P990_SD_CARD', '.K080', 'LTE_CARD', 'MB853',
'A1-07___C0541A4F', 'XT912', 'MB855']
'A1-07___C0541A4F', 'XT912', 'MB855', 'XT910', 'BOOK_A10_CARD']
OSX_MAIN_MEM = 'Android Device Main Memory'


@ -76,7 +76,7 @@ class E52(USBMS):
supported_platforms = ['windows', 'linux', 'osx']
VENDOR_ID = [0x421]
PRODUCT_ID = [0x1CD, 0x273]
PRODUCT_ID = [0x1CD, 0x273, 0x00aa]
BCD = [0x100]
@ -86,5 +86,5 @@ class E52(USBMS):
SUPPORTS_SUB_DIRS = True
VENDOR_NAME = 'NOKIA'
WINDOWS_MAIN_MEM = 'S60'
WINDOWS_MAIN_MEM = ['S60', 'E71']


@ -177,7 +177,7 @@ class RTFInput(InputFormatPlugin):
font_size_classes = ['span.fs%d { font-size: %spt }'%(i, x) for i, x in
enumerate(ic.font_sizes)]
color_classes = ['span.col%d { color: %s }'%(i, x) for i, x in
enumerate(ic.colors)]
enumerate(ic.colors) if x != 'false']
css = textwrap.dedent('''
span.none {
text-decoration: none; font-weight: normal;
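
The changed comprehension above skips the placeholder colour 'false' while keeping each remaining class numbered by its original position. A small sketch of that behaviour, for Python 3; color_classes is a hypothetical helper name and the sample colours are made up for illustration.

```python
def color_classes(colors):
    """Build one CSS rule per colour, skipping the placeholder 'false'.

    enumerate() runs over the full list, so the class index of each
    surviving colour still matches its position in the colour table.
    """
    return ['span.col%d { color: %s }' % (i, x)
            for i, x in enumerate(colors) if x != 'false']
```

Because the filter sits after enumerate, spans that refer to col2 still resolve to the third colour even when the second one is dropped.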


@ -516,6 +516,17 @@ class MobiReader(object):
self.processed_html = re.sub(r'(?i)(?P<para></p[^>]*>)\s*(?P<styletags>(</(h\d+|i|b|u|em|small|big|strong|tt)>\s*){1,})', '\g<styletags>'+'\g<para>', self.processed_html)
self.processed_html = re.sub(r'(?i)(?P<blockquote>(</(blockquote|div)[^>]*>\s*){1,})(?P<para></p[^>]*>)', '\g<para>'+'\g<blockquote>', self.processed_html)
self.processed_html = re.sub(r'(?i)(?P<para><p[^>]*>)\s*(?P<blockquote>(<(blockquote|div)[^>]*>\s*){1,})', '\g<blockquote>'+'\g<para>', self.processed_html)
bods = htmls = 0
for x in re.finditer(ur'</body>|</html>', self.processed_html):
if x.group() == '</body>': bods +=1
else: htmls += 1
if bods > 1 and htmls > 1:
break
if bods > 1:
self.processed_html = self.processed_html.replace('</body>', '')
if htmls > 1:
self.processed_html = self.processed_html.replace('</html>', '')
def remove_random_bytes(self, html):
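
A standalone sketch of the closing-tag cleanup added in this hunk, for Python 3. It compares the matched text (the result of group()) with the tag string, since a match object itself never equals a string. The function name strip_spurious_closers is hypothetical, and the sketch takes the markup as an argument instead of reading self.processed_html.

```python
import re


def strip_spurious_closers(html):
    """Remove </body> or </html> tags when either occurs more than once."""
    bods = htmls = 0
    for m in re.finditer(r'</body>|</html>', html):
        if m.group() == '</body>':
            bods += 1
        else:
            htmls += 1
        if bods > 1 and htmls > 1:
            # Both tags are already known to be duplicated; stop scanning.
            break
    if bods > 1:
        html = html.replace('</body>', '')
    if htmls > 1:
        html = html.replace('</html>', '')
    return html
```

Markup with a single closing tag of each kind passes through unchanged.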


@ -87,6 +87,7 @@ gprefs.defaults['toolbar_text'] = 'always'
gprefs.defaults['font'] = None
gprefs.defaults['tags_browser_partition_method'] = 'first letter'
gprefs.defaults['tags_browser_collapse_at'] = 100
gprefs.defaults['tag_browser_dont_collapse'] = []
gprefs.defaults['edit_metadata_single_layout'] = 'default'
gprefs.defaults['book_display_fields'] = [
('title', False), ('authors', True), ('formats', True),


@ -382,8 +382,8 @@ class Adder(QObject): # {{{
if not duplicates:
return self.duplicates_processed()
self.pd.hide()
files = [_('%s by %s')%(x[0].title, x[0].format_field('authors')[1])
for x in duplicates]
files = [_('%(title)s by %(author)s')%dict(title=x[0].title,
author=x[0].format_field('authors')[1]) for x in duplicates]
if question_dialog(self._parent, _('Duplicates found!'),
_('Books with the same title as the following already '
'exist in the database. Add them anyway?'),


@ -209,8 +209,8 @@ class AutoAdder(QObject):
paths.extend(p)
formats.extend(f)
metadata.extend(mis)
files = [_('%s by %s')%(mi.title, mi.format_field('authors')[1])
for mi in metadata]
files = [_('%(title)s by %(author)s')%dict(title=mi.title,
author=mi.format_field('authors')[1]) for mi in metadata]
if question_dialog(self.parent(), _('Duplicates found!'),
_('Books with the same title as the following already '
'exist in the database. Add them anyway?'),
@ -228,8 +228,8 @@ class AutoAdder(QObject):
if count > 0:
m.books_added(count)
gui.status_bar.show_message(_(
'Added %d book(s) automatically from %s') %
(count, self.worker.path), 2000)
'Added %(num)d book(s) automatically from %(src)s') %
dict(num=count, src=self.worker.path), 2000)
if hasattr(gui, 'db_images'):
gui.db_images.reset()
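
These hunks switch the translatable strings from positional placeholders to named ones. The likely motivation is that a translator can then reorder the fields; the sketch below illustrates that with plain %-formatting in Python 3. The describe function and the reordered template are hypothetical examples, not calibre strings.

```python
def describe(title, author, template='%(title)s by %(author)s'):
    """Format a book description from a template with named placeholders."""
    return template % dict(title=title, author=author)
```

With the positional form '%s by %s' the argument order is fixed in code, so a translation that needs the author first could not be expressed in the template alone.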


@ -180,7 +180,7 @@ class ProceedNotification(MessageBox): # {{{
self.payload = payload
self.html_log = html_log
self.log_viewer_title = log_viewer_title
self.finished.connect(self.do_proceed, type=Qt.QueuedConnection)
self.finished.connect(self.do_proceed)
self.vlb = self.bb.addButton(_('View log'), self.bb.ActionRole)
self.vlb.setIcon(QIcon(I('debug.png')))
@ -195,18 +195,17 @@ class ProceedNotification(MessageBox): # {{{
parent=self)
def do_proceed(self, result):
try:
if result == self.Accepted:
self.callback(self.payload)
elif self.cancel_callback is not None:
self.cancel_callback(self.payload)
finally:
# Ensure this notification is garbage collected
self.callback = self.cancel_callback = None
self.setParent(None)
self.finished.disconnect()
self.vlb.clicked.disconnect()
_proceed_memory.remove(self)
from calibre.gui2.ui import get_gui
func = (self.callback if result == self.Accepted else
self.cancel_callback)
gui = get_gui()
gui.proceed_requested.emit(func, self.payload)
# Ensure this notification is garbage collected
self.callback = self.cancel_callback = self.payload = None
self.setParent(None)
self.finished.disconnect()
self.vlb.clicked.disconnect()
_proceed_memory.remove(self)
# }}}
class ErrorNotification(MessageBox): # {{{


@ -116,7 +116,7 @@
<item row="0" column="1">
<widget class="QLineEdit" name="title">
<property name="toolTip">
<string>Regular expression (?P&amp;lt;title&amp;gt;)</string>
<string>Regular expression (?P&lt;title&gt;)</string>
</property>
<property name="text">
<string>No match</string>


@ -1073,19 +1073,40 @@ class DeviceBooksModel(BooksModel): # {{{
self.book_in_library = None
def mark_for_deletion(self, job, rows, rows_are_ids=False):
db_indices = rows if rows_are_ids else self.indices(rows)
db_items = [self.db[i] for i in db_indices if -1 < i < len(self.db)]
self.marked_for_deletion[job] = db_items
if rows_are_ids:
self.marked_for_deletion[job] = rows
self.reset()
else:
self.marked_for_deletion[job] = self.indices(rows)
for row in rows:
indices = self.row_indices(row)
self.dataChanged.emit(indices[0], indices[-1])
def find_item_in_db(self, item):
idx = None
try:
idx = self.db.index(item)
except:
path = getattr(item, 'path', None)
if path:
for i, x in enumerate(self.db):
if getattr(x, 'path', None) == path:
idx = i
break
return idx
def deletion_done(self, job, succeeded=True):
if not self.marked_for_deletion.has_key(job):
return
rows = self.marked_for_deletion.pop(job)
db_items = self.marked_for_deletion.pop(job, [])
rows = []
for item in db_items:
idx = self.find_item_in_db(item)
if idx is not None:
try:
rows.append(self.map.index(idx))
except ValueError:
pass
for row in rows:
if not succeeded:
indices = self.row_indices(self.index(row, 0))
@ -1096,11 +1117,18 @@ class DeviceBooksModel(BooksModel): # {{{
self.resort(False)
self.research(True)
def indices_to_be_deleted(self):
ans = []
for v in self.marked_for_deletion.values():
ans.extend(v)
return ans
def is_row_marked_for_deletion(self, row):
try:
item = self.db[self.map[row]]
except IndexError:
return False
path = getattr(item, 'path', None)
for items in self.marked_for_deletion.itervalues():
for x in items:
if x is item or (path and path == getattr(x, 'path', None)):
return True
return False
def clear_ondevice(self, db_ids, to_what=None):
for data in self.db:
@ -1112,8 +1140,8 @@ class DeviceBooksModel(BooksModel): # {{{
self.reset()
def flags(self, index):
if self.map[index.row()] in self.indices_to_be_deleted():
return Qt.ItemIsUserCheckable # Can't figure out how to get the disabled flag in python
if self.is_row_marked_for_deletion(index.row()):
return Qt.NoItemFlags
flags = QAbstractTableModel.flags(self, index)
if index.isValid():
cname = self.column_map[index.column()]
@ -1347,7 +1375,7 @@ class DeviceBooksModel(BooksModel): # {{{
elif DEBUG and cname == 'inlibrary':
return QVariant(self.db[self.map[row]].in_library)
elif role == Qt.ToolTipRole and index.isValid():
if self.map[row] in self.indices_to_be_deleted():
if self.is_row_marked_for_deletion(row):
return QVariant(_('Marked for deletion'))
if cname in ['title', 'authors'] or (cname == 'collections' and \
self.db.supports_collections()):
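
The new deletion tracking stores book objects instead of row indices and later looks each one up again, falling back to a path comparison when the object itself is no longer in the list. A minimal sketch of that lookup in Python 3, using a plain list for the device database; the Book class here is a hypothetical stand-in with only a path attribute.

```python
class Book(object):
    """Hypothetical stand-in for a device book record."""

    def __init__(self, path):
        self.path = path


def find_item_in_db(db, item):
    """Return the index of item in db, matching by path as a fallback."""
    try:
        return db.index(item)
    except ValueError:
        path = getattr(item, 'path', None)
        if path:
            for i, x in enumerate(db):
                if getattr(x, 'path', None) == path:
                    return i
    return None
```

Tracking objects rather than indices means a finished delete job that shifts rows cannot cause a later job to mark the wrong books.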


@ -280,14 +280,12 @@ class CreateCustomColumn(QDialog, Ui_QCreateCustomColumn):
if not unicode(self.enum_box.text()).strip():
return self.simple_error('', _('You must enter at least one'
' value for enumeration columns'))
l = [v.strip() for v in unicode(self.enum_box.text()).split(',')]
if '' in l:
return self.simple_error('', _('You cannot provide the empty '
'value, as it is included by default'))
for i in range(0, len(l)-1):
if l[i] in l[i+1:]:
l = [v.strip() for v in unicode(self.enum_box.text()).split(',') if v.strip()]
l_lower = [v.lower() for v in l]
for i,v in enumerate(l_lower):
if v in l_lower[i+1:]:
return self.simple_error('', _('The value "{0}" is in the '
'list more than once').format(l[i]))
'list more than once, perhaps with different case').format(l[i]))
c = unicode(self.enum_colors.text())
if c:
c = [v.strip() for v in unicode(self.enum_colors.text()).split(',')]
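
The revised check drops empty entries silently and treats values that differ only in case as duplicates. A short sketch of the same logic as a standalone Python 3 function; first_duplicate is a hypothetical name, and it returns the offending value where the dialog would show an error.

```python
def first_duplicate(text):
    """Return the first value repeated later in a comma-separated list.

    The comparison ignores case; empty entries are discarded. Returns
    None when every value is unique.
    """
    values = [v.strip() for v in text.split(',') if v.strip()]
    lowered = [v.lower() for v in values]
    for i, v in enumerate(lowered):
        if v in lowered[i + 1:]:
            return values[i]
    return None
```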


@ -144,6 +144,7 @@ class ConfigWidget(ConfigWidgetBase, Ui_Form):
r('tags_browser_partition_method', gprefs, choices=choices)
r('tags_browser_collapse_at', gprefs)
r('default_author_link', gprefs)
r('tag_browser_dont_collapse', gprefs, setting=CommaSeparatedList)
choices = set([k for k in db.field_metadata.all_field_keys()
if (db.field_metadata[k]['is_category'] and


@ -259,9 +259,9 @@
<widget class="QLineEdit" name="opt_default_author_link">
<property name="toolTip">
<string>&lt;p&gt;Enter a template to be used to create a link for
an author in the books information dialog. This template will
be used when no link has been provided for the author using
Manage Authors. You can use the values {author} and
an author in the books information dialog. This template will
be used when no link has been provided for the author using
Manage Authors. You can use the values {author} and
{author_sort}, and any template function.</string>
</property>
</widget>
@ -334,14 +334,35 @@ if you never want subcategories</string>
<widget class="QSpinBox" name="opt_tags_browser_collapse_at">
<property name="toolTip">
<string>If a Tag Browser category has more than this number of items, it is divided
up into sub-categories. If the partition method is set to disable, this value is ignored.</string>
up into subcategories. If the partition method is set to disable, this value is ignored.</string>
</property>
<property name="maximum">
<number>10000</number>
</property>
</widget>
</item>
<item row="1" column="0" colspan="5">
<item row="1" column="2">
<widget class="QLabel" name="label_8111">
<property name="text">
<string>Categories not to partition:</string>
</property>
<property name="buddy">
<cstring>opt_tag_browser_dont_collapse</cstring>
</property>
</widget>
</item>
<item row="1" column="3" colspan="2">
<widget class="MultiCompleteLineEdit" name="opt_tag_browser_dont_collapse">
<property name="toolTip">
<string>A comma-separated list of categories that are not to
be partitioned even if the number of items is larger than
the value shown above. This option can be used to
avoid collapsing hierarchical categories that have only
a few top-level elements.</string>
</property>
</widget>
</item>
<item row="2" column="0" colspan="5">
<widget class="QCheckBox" name="opt_show_avg_rating">
<property name="text">
<string>Show &amp;average ratings in the tags browser</string>
@ -351,7 +372,7 @@ up into sub-categories. If the partition method is set to disable, this value is
</property>
</widget>
</item>
<item row="2" column="0">
<item row="3" column="0">
<widget class="QLabel" name="label_81">
<property name="text">
<string>Categories with &amp;hierarchical items:</string>
@ -361,7 +382,7 @@ up into sub-categories. If the partition method is set to disable, this value is
</property>
</widget>
</item>
<item row="3" column="0" colspan="5">
<item row="4" column="0" colspan="5">
<spacer name="verticalSpacer_2">
<property name="orientation">
<enum>Qt::Vertical</enum>
@ -374,10 +395,10 @@ up into sub-categories. If the partition method is set to disable, this value is
</property>
</spacer>
</item>
<item row="2" column="2" colspan="3">
<item row="3" column="2" colspan="3">
<widget class="MultiCompleteLineEdit" name="opt_categories_using_hierarchy">
<property name="toolTip">
<string>A comma-separated list of columns in which items containing
<string>A comma-separated list of categories in which items containing
periods are displayed in the tag browser trees. For example, if
this box contains 'tags' then tags of the form 'Mystery.English'
and 'Mystery.Thriller' will be displayed with English and Thriller


@ -36,6 +36,7 @@ class ConfigWidget(ConfigWidgetBase, Ui_Form):
r('max_cover', self.proxy)
r('max_opds_items', self.proxy)
r('max_opds_ungrouped_items', self.proxy)
r('url_prefix', self.proxy)
self.show_server_password.stateChanged[int].connect(
lambda s: self.opt_password.setEchoMode(
@ -100,7 +101,8 @@ class ConfigWidget(ConfigWidgetBase, Ui_Form):
self.stopping_msg.accept()
def test_server(self):
open_url(QUrl('http://127.0.0.1:'+str(self.opt_port.value())))
prefix = unicode(self.opt_url_prefix.text()).strip()
open_url(QUrl('http://127.0.0.1:'+str(self.opt_port.value())+prefix))
def view_server_logs(self):
from calibre.library.server import log_access_file, log_error_file


@ -16,36 +16,6 @@
<layout class="QVBoxLayout" name="verticalLayout">
<item>
<layout class="QGridLayout" name="gridLayout_5">
<item row="0" column="0">
<widget class="QLabel" name="label_10">
<property name="text">
<string>Server &amp;port:</string>
</property>
<property name="buddy">
<cstring>opt_port</cstring>
</property>
</widget>
</item>
<item row="0" column="1" colspan="2">
<widget class="QSpinBox" name="opt_port">
<property name="maximum">
<number>65535</number>
</property>
<property name="value">
<number>8080</number>
</property>
</widget>
</item>
<item row="1" column="0">
<widget class="QLabel" name="label_11">
<property name="text">
<string>&amp;Username:</string>
</property>
<property name="buddy">
<cstring>opt_username</cstring>
</property>
</widget>
</item>
<item row="1" column="1" colspan="2">
<widget class="QLineEdit" name="opt_username"/>
</item>
@ -91,6 +61,36 @@ Leave this blank if you intend to use the server with an
</property>
</widget>
</item>
<item row="0" column="0">
<widget class="QLabel" name="label_10">
<property name="text">
<string>Server &amp;port:</string>
</property>
<property name="buddy">
<cstring>opt_port</cstring>
</property>
</widget>
</item>
<item row="0" column="1" colspan="2">
<widget class="QSpinBox" name="opt_port">
<property name="maximum">
<number>65535</number>
</property>
<property name="value">
<number>8080</number>
</property>
</widget>
</item>
<item row="1" column="0">
<widget class="QLabel" name="label_11">
<property name="text">
<string>&amp;Username:</string>
</property>
<property name="buddy">
<cstring>opt_username</cstring>
</property>
</widget>
</item>
<item row="3" column="1" colspan="2">
<widget class="QCheckBox" name="show_server_password">
<property name="text">
@ -181,6 +181,23 @@ Leave this blank if you intend to use the server with an
</property>
</widget>
</item>
<item row="8" column="0">
<widget class="QLabel" name="label_2">
<property name="text">
<string>&amp;URL Prefix:</string>
</property>
<property name="buddy">
<cstring>opt_url_prefix</cstring>
</property>
</widget>
</item>
<item row="8" column="1" colspan="2">
<widget class="QLineEdit" name="opt_url_prefix">
<property name="toolTip">
<string>A prefix that is applied to all URLs in the content server. Useful only if you plan to put the server behind another server like Apache, with a reverse proxy.</string>
</property>
</widget>
</item>
</layout>
</item>
<item>


@ -377,13 +377,15 @@ class TagsModel(QAbstractItemModel): # {{{
collapse_model = 'partition'
collapse_template = tweaks['categories_collapsed_popularity_template']
def process_one_node(category, state_map): # {{{
def process_one_node(category, collapse_model, state_map): # {{{
collapse_letter = None
category_node = category
key = category_node.category_key
is_gst = category_node.is_gst
if key not in data:
return
if key in gprefs['tag_browser_dont_collapse']:
collapse_model = 'disable'
cat_len = len(data[key])
if cat_len <= 0:
return
@ -523,7 +525,8 @@ class TagsModel(QAbstractItemModel): # {{{
# }}}
for category in self.category_nodes:
process_one_node(category, state_map.get(category.category_key, {}))
process_one_node(category, collapse_model,
state_map.get(category.category_key, {}))
def get_category_editor_data(self, category):
for cat in self.root_item.children:
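
Passing collapse_model into process_one_node lets each category override the global partition method locally. The decision itself reduces to the following Python 3 sketch; effective_collapse_model is a hypothetical helper, and dont_collapse stands for the list stored in gprefs['tag_browser_dont_collapse'].

```python
def effective_collapse_model(key, collapse_model, dont_collapse):
    """Return 'disable' for exempted categories, else the global model."""
    if key in dont_collapse:
        return 'disable'
    return collapse_model
```

Because the override is computed per call, exempting one category leaves partitioning of all others untouched.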


@ -102,10 +102,13 @@ class Main(MainWindow, MainWindowMixin, DeviceMixin, EmailMixin, # {{{
):
'The main GUI'
proceed_requested = pyqtSignal(object, object)
def __init__(self, opts, parent=None, gui_debug=None):
global _gui
MainWindow.__init__(self, opts, parent=parent, disable_automatic_gc=True)
self.proceed_requested.connect(self.do_proceed,
type=Qt.QueuedConnection)
self.keyboard = Manager(self)
_gui = self
self.opts = opts
@ -402,6 +405,10 @@ class Main(MainWindow, MainWindowMixin, DeviceMixin, EmailMixin, # {{{
except:
pass
def do_proceed(self, func, payload):
if callable(func):
func(payload)
def no_op(self, *args):
pass


@ -701,7 +701,10 @@ class LibraryPage(QWizardPage, LibraryUI):
pass
def is_library_dir_suitable(self, x):
return LibraryDatabase2.exists_at(x) or not os.listdir(x)
try:
return LibraryDatabase2.exists_at(x) or not os.listdir(x)
except:
return False
def validatePage(self):
newloc = unicode(self.location.text())
@@ -720,6 +723,13 @@ class LibraryPage(QWizardPage, LibraryUI):
 _('Path to library too long. Must be less than'
 ' %d characters.')%LibraryDatabase2.WINDOWS_LIBRARY_PATH_LIMIT,
 show=True)
+if not os.path.exists(x):
+try:
+os.makedirs(x)
+except:
+return error_dialog(self, _('Bad location'),
+_('Failed to create a folder at %s')%x,
+det_msg=traceback.format_exc(), show=True)
 if self.is_library_dir_suitable(x):
 self.location.setText(x)
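
The two `LibraryPage` hunks above make the suitability check tolerate a missing or unreadable folder and create the folder before checking it. A standalone sketch of the same flow follows. It is an illustration under stated assumptions: `has_library` is a hypothetical stand-in for `LibraryDatabase2.exists_at`, `ensure_folder` is a hypothetical helper that returns a message where the wizard shows an error dialog, and the bare `except:` from the diff is narrowed to `OSError`.

```python
import os


def is_library_dir_suitable(path, has_library=lambda p: False):
    """Return True if path already holds a library or is an empty folder."""
    try:
        return has_library(path) or not os.listdir(path)
    except OSError:
        # Missing or unreadable folder: report it as unsuitable
        # instead of letting the exception propagate.
        return False


def ensure_folder(path):
    """Create path if it does not exist; return an error message or None."""
    if not os.path.exists(path):
        try:
            os.makedirs(path)
        except OSError as err:
            return 'Failed to create a folder at %s: %s' % (path, err)
    return None
```

Calling `ensure_folder` first and `is_library_dir_suitable` second mirrors the order in `validatePage`: a freshly created folder is empty, so it passes the check.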

@@ -11,7 +11,7 @@ __docformat__ = 'restructuredtext en'
 FIELDS = ['all', 'title', 'title_sort', 'author_sort', 'authors', 'comments',
 'cover', 'formats','id', 'isbn', 'ondevice', 'pubdate', 'publisher',
 'rating', 'series_index', 'series', 'size', 'tags', 'timestamp',
-'uuid', 'languages']
+'uuid', 'languages', 'identifiers']
 #Allowed fields for template
 TEMPLATE_ALLOWED_FIELDS = [ 'author_sort', 'authors', 'id', 'isbn', 'pubdate', 'title_sort',

@@ -162,7 +162,7 @@ class CSV_XML(CatalogPlugin):
 record.append(item)
 for field in ('id', 'uuid', 'publisher', 'rating', 'size',
-'isbn','ondevice'):
+'isbn','ondevice', 'identifiers'):
 if field in fields:
 val = r[field]
 if not val:
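
The catalog hunks above add `identifiers` to the fields a CSV/XML catalog may include. The sketch below shows how such a column could be emitted with the standard `csv` module. It is not the plugin's code: the record layout (a dict per book with `identifiers` as a scheme-to-value dict), the `scheme:value` serialization, and the sample values are all assumptions made for illustration.

```python
import csv
import io


def identifiers_to_text(identifiers):
    # Assumed serialization: sorted "scheme:value" pairs joined by commas.
    return ','.join('%s:%s' % (k, identifiers[k]) for k in sorted(identifiers))


def write_catalog(records, fields):
    """Write one CSV row per record, restricted to the requested fields."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(fields)
    for r in records:
        row = []
        for field in fields:
            val = r.get(field)
            if field == 'identifiers':
                val = identifiers_to_text(val or {})
            # Missing values become empty cells.
            row.append('' if val is None else val)
        writer.writerow(row)
    return out.getvalue()
```

Because the serialized identifiers contain commas, `csv.writer` quotes that cell, which keeps the column count stable for readers of the file.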

@@ -79,6 +79,8 @@ class BonJour(SimplePlugin): # {{{
 try:
 publish_zeroconf('Books in calibre', '_stanza._tcp',
 self.port, {'path':self.prefix+'/stanza'})
+publish_zeroconf('Books in calibre', '_calibre._tcp',
+self.port, {'path':self.prefix+'/opds'})
 except:
 import traceback
 cherrypy.log.error('Failed to start BonJour:')
@@ -122,6 +124,8 @@ class LibraryServer(ContentServer, MobileServer, XMLServer, OPDSServer, Cache,
 path = P('content_server')
 self.build_time = fromtimestamp(os.stat(path).st_mtime)
 self.default_cover = open(P('content_server/default_cover.jpg'), 'rb').read()
+if not opts.url_prefix:
+opts.url_prefix = ''
 cherrypy.engine.bonjour.port = opts.port
 cherrypy.engine.bonjour.prefix = opts.url_prefix
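
The content server hunks above publish a second mDNS record, `_calibre._tcp`, pointing at the OPDS feed, and normalize an unset URL prefix to an empty string before it is concatenated into the advertised path. The sketch below computes the two records as plain tuples. `service_records` is a hypothetical helper, not part of calibre; it only mirrors the arguments the diff passes to `publish_zeroconf`.

```python
def service_records(port, url_prefix=None, name='Books in calibre'):
    """Return (name, service type, port, TXT record) tuples to advertise."""
    # Mirror the diff's normalization: a missing prefix becomes ''.
    # Without it, None + '/stanza' would raise TypeError.
    prefix = url_prefix or ''
    return [
        (name, '_stanza._tcp', port, {'path': prefix + '/stanza'}),
        (name, '_calibre._tcp', port, {'path': prefix + '/opds'}),
    ]
```

For example, `service_records(8080, '/calibre')` advertises `/calibre/stanza` and `/calibre/opds`, while an unset prefix yields `/stanza` and `/opds`.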

File diff suppressed because it is too large

19502
src/calibre/translations/is.po Normal file

File diff suppressed because it is too large

Some files were not shown because too many files have changed in this diff