Sync to trunk.

This commit is contained in:
John Schember 2013-02-16 22:45:07 -05:00
commit 241b9bea2c
145 changed files with 55831 additions and 41107 deletions


@ -19,6 +19,49 @@
# new recipes:
# - title:
- version: 0.9.19
  date: 2013-02-15

  new features:
    - title: "New tool: \"Polish books\" that allows you to perform various automated cleanup actions on EPUB and AZW3 files without doing a full conversion."
      type: major
      description: "Polishing books is all about putting the shine of perfection on your ebook files. You can use it to subset embedded fonts, update the metadata in the book files from the metadata in the calibre library, manipulate the book jacket, etc. More features will be added in the future. To use this tool, go to Preferences->Toolbar and add the Polish books tool to the main toolbar. Then simply select the books you want to be polished and click the Polish books button. Polishing, unlike conversion, does not change the internal structure/markup of your book; it performs only the minimal set of actions needed to achieve its goals. Note that Polish books is a completely new codebase, so there may well be bugs. Polishing a book backs up the original as ORIGINAL_EPUB or ORIGINAL_AZW3, unless you have turned off this feature in Preferences->Tweaks, in which case you should back up your files manually. You can also use this tool from the command line with ebook-polish.exe."

    - title: "Driver for the Trekstor Pyrus Mini."
      tickets: [1124120]

    - title: "E-book viewer: Add an option to change the minimum font size."
      tickets: [1122333]

    - title: "PDF Output: Add support for converting documents with math typesetting, as described here: http://manual.calibre-ebook.com/typesetting_math.html"

    - title: "Column coloring/icons: Add more conditions when using date based columns with reference to 'today'."

  bug fixes:
    - title: "Transforming to titlecase - handle typographic hyphens in all caps phrases"

    - title: "Don't ignore file open events that occur before the GUI is initialized on OS X"
      tickets: [1122713]

    - title: "News download: Handle feeds that have entries with empty ids"

    - title: "Fix a regression that broke using the template editor"

    - title: "Do not block startup while scanning the computer for available network interfaces. Speeds up startup time on some Windows computers with lots of spurious network interfaces."

  improved recipes:
    - New Yorker
    - Kommersant
    - Le Monde (Subscription version)
    - NZ Herald

  new recipes:
    - title: Navegalo
      author: Douglas Delgado

    - title: El Guardian and More Intelligent Life
      author: Darko Miletic

- version: 0.9.18
  date: 2013-02-08


@ -250,42 +250,71 @@ If you don't want to uninstall it altogether, there are a couple of tricks you can use. The
simplest is to simply rename the executable file that launches the library program. More details
`in the forums <http://www.mobileread.com/forums/showthread.php?t=65809>`_.
How do I use |app| with my iPad/iPhone/iPod touch?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Over the air
^^^^^^^^^^^^^^
The easiest way to browse your |app| collection on your Apple device
(iPad/iPhone/iPod) is by using the |app| content server, which makes your
collection available over the net. First perform the following steps in |app|
* Set the Preferred Output Format in |app| to EPUB (The output format can be
set under :guilabel:`Preferences->Interface->Behavior`)
* Set the output profile to iPad (this will work for iPhone/iPods as well),
under :guilabel:`Preferences->Conversion->Common Options->Page Setup`
* Convert the books you want to read on your iDevice to EPUB format by
selecting them and clicking the Convert button.
* Turn on the Content Server by clicking the :guilabel:`Connect/Share` button
and leave |app| running. You can also tell |app| to automatically start the
content server via :guilabel:`Preferences->Sharing over the net`.
There are many apps for your iDevice that can connect to |app|. Here we
describe using two of them, iBooks and Stanza.
Using Stanza
***************
You should be able to access your books on your iPhone by opening Stanza. Go to
"Get Books" and then click the "Shared" tab. Under Shared you will see an entry
"Books in calibre". If you don't, make sure your iPad/iPhone is connected using
the WiFi network in your house, not 3G. If the |app| catalog is still not
detected in Stanza, you can add it manually in Stanza. To do this, click the
"Shared" tab, then click the "Edit" button and then click "Add book source" to
add a new book source. In the Add Book Source screen enter whatever name you
like and in the URL field, enter the following::
http://192.168.1.2:8080/
Replace ``192.168.1.2`` with the local IP address of the computer running
|app|. If you have changed the port the |app| content server is running on, you
will have to change ``8080`` as well to the new port. The local IP address is
the IP address your computer is assigned on your home network. A quick Google
search will tell you how to find out your local IP address. Now click "Save"
and you are done.
If you get timeout errors while browsing the calibre catalog in Stanza, try
increasing the connection timeout value in the Stanza settings. Go to
Info->Settings and increase the value of Download Timeout.
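If you would rather not hunt for the local IP address by hand, it can also be read programmatically. The sketch below is illustrative only (it is not part of |app|); it relies on the standard-library trick that ``connect()`` on a UDP socket sends no packets but makes the OS pick the outgoing interface:

```python
import socket

def local_ip():
    # connect() on a UDP socket transmits nothing, but the OS selects the
    # outbound interface, whose address getsockname() then reports.
    # 8.8.8.8 is just an arbitrary routable destination.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(('8.8.8.8', 80))
        return s.getsockname()[0]
    except OSError:
        # no network route available; fall back to loopback
        return '127.0.0.1'
    finally:
        s.close()

print(local_ip())  # e.g. 192.168.1.2
```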
Using iBooks
**************
Start the Safari browser and type in the IP address and port of the computer
running the calibre server, like this::
http://192.168.1.2:8080/
Replace ``192.168.1.2`` with the local IP address of the computer running
|app|. If you have changed the port the |app| content server is running on, you
will have to change ``8080`` as well to the new port. The local IP address is
the IP address your computer is assigned on your home network. A quick Google
search will tell you how to find out your local IP address.
You will see a list of books in Safari. Just click on the EPUB link for
whichever book you want to read, and Safari will prompt you to open it with
iBooks.
With the USB cable + iTunes


@ -11,7 +11,7 @@ class Adventure_zone(BasicNewsRecipe):
    max_articles_per_feed = 100
    cover_url = 'http://www.adventure-zone.info/inne/logoaz_2012.png'
    index = 'http://www.adventure-zone.info/fusion/'
    use_embedded_content = False
    preprocess_regexps = [(re.compile(r"<td class='capmain'>Komentarze</td>", re.IGNORECASE), lambda m: ''),
                          (re.compile(r'</?table.*?>'), lambda match: ''),
                          (re.compile(r'</?tbody.*?>'), lambda match: '')]
@ -21,7 +21,7 @@ class Adventure_zone(BasicNewsRecipe):
    extra_css = '.main-bg{text-align: left;} td.capmain{ font-size: 22px; }'
    feeds = [(u'Nowinki', u'http://www.adventure-zone.info/fusion/feeds/news.php')]

    '''def parse_feeds (self):
        feeds = BasicNewsRecipe.parse_feeds(self)
        soup = self.index_to_soup(u'http://www.adventure-zone.info/fusion/feeds/news.php')
        tag = soup.find(name='channel')
@ -34,7 +34,7 @@ class Adventure_zone(BasicNewsRecipe):
        for feed in feeds:
            for article in feed.articles[:]:
                article.title = titles[feed.articles.index(article)]
        return feeds'''
    '''def get_cover_url(self):
@ -42,16 +42,25 @@ class Adventure_zone(BasicNewsRecipe):
        cover = soup.find(id='box_OstatninumerAZ')
        self.cover_url = 'http://www.adventure-zone.info/fusion/' + cover.center.a.img['src']
        return getattr(self, 'cover_url', self.cover_url)'''

    def populate_article_metadata(self, article, soup, first):
        result = re.search('(.+) - Adventure Zone', soup.title.string)
        if result:
            article.title = result.group(1)
        else:
            result = soup.body.find('strong')
            if result:
                article.title = result.string

    def skip_ad_pages(self, soup):
        skip_tag = soup.body.find(name='td', attrs={'class':'main-bg'})
        skip_tag = skip_tag.findAll(name='a')
        title = soup.title.string.lower()
        if (('zapowied' in title) or ('recenzj' in title) or ('solucj' in title) or ('poradnik' in title)):
            for r in skip_tag:
                if r.strong and r.strong.string:
                    word = r.strong.string.lower()
                    if (('zapowied' in word) or ('recenzj' in word) or ('solucj' in word) or ('poradnik' in word)):
                        return self.index_to_soup('http://www.adventure-zone.info/fusion/print.php?type=A&item' + r['href'][r['href'].find('article_id')+7:], raw=True)
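The index arithmetic in that return line is easy to misread: ``find('article_id')`` locates the start of the parameter name and ``+7`` slices from its underscore, so the retained tail ``'_id=<n>'`` joins the literal ``'&item'`` to form ``item_id=<n>``. A standalone sketch (the sample href is made up, but has the shape the recipe expects):

```python
# hypothetical href of the shape found on the site
href = 'articles.php?article_id=4553'

# 'article_id' starts at index 13; +7 lands on its underscore
tail = href[href.find('article_id') + 7:]
print(tail)  # _id=4553

# '&item' + '_id=4553' reassembles the print-page parameter
print('http://www.adventure-zone.info/fusion/print.php?type=A&item' + tail)
# http://www.adventure-zone.info/fusion/print.php?type=A&item_id=4553
```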
    def preprocess_html(self, soup):
        footer = soup.find(attrs={'class':'news-footer middle-border'})


@ -0,0 +1,17 @@
from calibre.web.feeds.news import BasicNewsRecipe

class BadaniaNet(BasicNewsRecipe):
    title = u'badania.net'
    __author__ = 'fenuks'
    description = u'chcesz wiedzieć więcej?'
    category = 'science'
    language = 'pl'
    cover_url = 'http://badania.net/wp-content/badanianet_green_transparent.png'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    remove_empty_feeds = True
    use_embedded_content = False
    remove_tags = [dict(attrs={'class':['omc-flex-category', 'omc-comment-count', 'omc-single-tags']})]
    remove_tags_after = dict(attrs={'class':'omc-single-tags'})
    keep_only_tags = [dict(id='omc-full-article')]
    feeds = [(u'Psychologia', u'http://badania.net/category/psychologia/feed/'),
             (u'Technologie', u'http://badania.net/category/technologie/feed/'),
             (u'Biologia', u'http://badania.net/category/biologia/feed/'),
             (u'Chemia', u'http://badania.net/category/chemia/feed/'),
             (u'Zdrowie', u'http://badania.net/category/zdrowie/'),
             (u'Seks', u'http://badania.net/category/psychologia-ewolucyjna-tematyka-seks/feed/')]


@ -35,8 +35,8 @@ class Bash_org_pl(BasicNewsRecipe):
        soup = self.index_to_soup(u'http://bash.org.pl/random/')
        #date=soup.find('div', attrs={'class':'right'}).string
        url = soup.find('a', attrs={'class':'qid click'})
        title = ''
        url = 'http://bash.org.pl/random/'
        articles.append({'title' : title,
                         'url' : url,
                         'date' : '',
@ -44,6 +44,8 @@ class Bash_org_pl(BasicNewsRecipe):
                         })
        return articles

    def populate_article_metadata(self, article, soup, first):
        article.title = soup.find(attrs={'class':'qid click'}).string

    def parse_index(self):
        feeds = []


@ -15,7 +15,8 @@ class EkologiaPl(BasicNewsRecipe):
    no_stylesheets = True
    remove_empty_feeds = True
    use_embedded_content = False
    remove_attrs = ['style']
    remove_tags = [dict(attrs={'class':['ekoLogo', 'powrocArt', 'butonDrukuj', 'widget-social-buttons']})]
    feeds = [(u'Wiadomo\u015bci', u'http://www.ekologia.pl/rss/20,53,0'),
             (u'\u015arodowisko', u'http://www.ekologia.pl/rss/20,56,0'),
             (u'Styl \u017cycia', u'http://www.ekologia.pl/rss/20,55,0')]

recipes/eso_pl.recipe Normal file

@ -0,0 +1,23 @@
from calibre.web.feeds.news import BasicNewsRecipe

class ESO(BasicNewsRecipe):
    title = u'ESO PL'
    __author__ = 'fenuks'
    description = u'ESO, Europejskie Obserwatorium Południowe, buduje i obsługuje najbardziej zaawansowane naziemne teleskopy astronomiczne na świecie'
    category = 'astronomy'
    language = 'pl'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    remove_empty_feeds = True
    use_embedded_content = False
    cover_url = 'https://twimg0-a.akamaihd.net/profile_images/1922519424/eso-twitter-logo.png'
    keep_only_tags = [dict(attrs={'class':'subcl'})]
    remove_tags = [dict(id='lang_row'), dict(attrs={'class':['pr_typeid', 'pr_news_feature_link', 'outreach_usage', 'hidden']})]
    feeds = [(u'Wiadomo\u015bci', u'http://www.eso.org/public/poland/news/feed/'),
             (u'Og\u0142oszenia', u'http://www.eso.org/public/poland/announcements/feed/'),
             (u'Zdj\u0119cie tygodnia', u'http://www.eso.org/public/poland/images/potw/feed/')]

    def preprocess_html(self, soup):
        for a in soup.findAll('a', href=True):
            if a['href'].startswith('/'):
                a['href'] = 'http://www.eso.org' + a['href']
        return soup
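The loop in ``preprocess_html`` above rewrites site-relative links into absolute ones so they survive outside the browser. The core transformation, pulled out as a standalone sketch:

```python
def absolutize(href, base='http://www.eso.org'):
    # Prefix site-relative hrefs with the site root; absolute URLs
    # pass through untouched, mirroring the recipe's check.
    return base + href if href.startswith('/') else href

print(absolutize('/public/poland/news/'))
# http://www.eso.org/public/poland/news/
```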

Binary files not shown (new recipe icons, including recipes/icons/eso_pl.png, 3.6 KiB).

@ -1,5 +1,4 @@
from calibre.web.feeds.news import BasicNewsRecipe

class Informacje_USA(BasicNewsRecipe):
    title = u'Informacje USA'
    oldest_article = 7
@ -8,11 +7,10 @@ class Informacje_USA(BasicNewsRecipe):
    description = u'portal wiadomości amerykańskich'
    category = 'news'
    language = 'pl'
    cover_url = 'http://www.informacjeusa.com/wp-content/uploads/2013/01/V3BANNER420-90new.jpg'
    no_stylesheets = True
    use_embedded_content = False
    keep_only_tags = [dict(id='post-area')]
    remove_tags_after = dict(id='content-area')
    remove_tags = [dict(attrs={'class':['breadcrumb']}), dict(id=['social-box', 'social-box-vert'])]
    feeds = [(u'Informacje', u'http://www.informacjeusa.com/feed/')]


@ -0,0 +1,14 @@
from calibre.web.feeds.news import BasicNewsRecipe

class KDEFamilyPl(BasicNewsRecipe):
    title = u'KDEFamily.pl'
    __author__ = 'fenuks'
    description = u'KDE w Polsce'
    category = 'open source, KDE'
    language = 'pl'
    cover_url = 'http://www.mykde.home.pl/kdefamily/wp-content/uploads/2012/07/logotype-e1341585198616.jpg'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True
    use_embedded_content = True
    feeds = [(u'Wszystko', u'http://kdefamily.pl/feed/')]


@ -1,5 +1,5 @@
__license__ = 'GPL v3'
__copyright__ = '2010-2013, Darko Miletic <darko.miletic at gmail.com>'

'''
www.kommersant.ru
'''
@ -29,16 +29,19 @@ class Kommersant_ru(BasicNewsRecipe):
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
keep_only_tags = [dict(attrs={'class':['document','document_vvodka','document_text','document_authors vblock']})]
remove_tags = [dict(name=['iframe','object','link','img','base','meta'])]
feeds = [(u'Articles', u'http://feeds.kommersant.ru/RSS_Export/RU/daily.xml')]
feeds = [(u'Articles', u'http://dynamic.feedsportal.com/pf/438800/http://feeds.kommersant.ru/RSS_Export/RU/daily.xml')]
def get_article_url(self, article):
return article.get('guid', None)
def print_version(self, url):
return url.replace('/doc-rss/','/Doc/') + '/Print'
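The ``print_version`` transformation above maps an article URL from the RSS feed onto Kommersant's printable page. Pulled out as a plain function (the sample document id is illustrative):

```python
def print_version(url):
    # swap the RSS document path for the regular one, then append
    # the /Print suffix, exactly as the recipe does
    return url.replace('/doc-rss/', '/Doc/') + '/Print'

print(print_version('http://www.kommersant.ru/doc-rss/2121516'))
# http://www.kommersant.ru/Doc/2121516/Print
```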


@ -0,0 +1,56 @@
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup as bs

class KurierGalicyjski(BasicNewsRecipe):
    title = u'Kurier Galicyjski'
    __author__ = 'fenuks'
    #description = u''
    category = 'news'
    language = 'pl'
    cover_url = 'http://www.duszki.pl/Kurier_galicyjski_bis2_small.gif'
    oldest_article = 7
    max_articles_per_feed = 100
    remove_empty_feeds = True
    no_stylesheets = True
    keep_only_tags = [dict(attrs={'class':'item-page'})]
    remove_tags = [dict(attrs={'class':'pagenav'}), dict(attrs={'style':'border-top-width: thin; border-top-style: dashed; border-top-color: #CCC; border-bottom-width: thin; border-bottom-style: dashed; border-bottom-color: #CCC; padding-top:5px; padding-bottom:5px; text-align:right; margin-top:10px; height:20px;'})]
    feeds = [(u'Wydarzenia', u'http://kuriergalicyjski.com/index.php/wydarzenia?format=feed&type=atom'),
             (u'Publicystyka', u'http://kuriergalicyjski.com/index.php/niezwykle-historie?format=feed&type=atom'),
             (u'Reporta\u017ce', u'http://kuriergalicyjski.com/index.php/report?format=feed&type=atom'),
             (u'Rozmowy Kuriera', u'http://kuriergalicyjski.com/index.php/kuriera?format=feed&type=atom'),
             (u'Przegl\u0105d prasy', u'http://kuriergalicyjski.com/index.php/2012-01-05-14-08-55?format=feed&type=atom'),
             (u'Kultura', u'http://kuriergalicyjski.com/index.php/2011-12-02-14-26-39?format=feed&type=atom'),
             (u'Zabytki', u'http://kuriergalicyjski.com/index.php/2011-12-02-14-27-32?format=feed&type=atom'),
             (u'Polska-Ukraina', u'http://kuriergalicyjski.com/index.php/pol-ua?format=feed&type=atom'),
             (u'Polacy i Ukrai\u0144cy', u'http://kuriergalicyjski.com/index.php/polacy-i-ukr?format=feed&type=atom'),
             (u'Niezwyk\u0142e historie', u'http://kuriergalicyjski.com/index.php/niezwykle-historie?format=feed&type=atom'),
             (u'Polemiki', u'http://kuriergalicyjski.com/index.php/polemiki?format=feed&type=atom')]

    def append_page(self, soup, appendtag):
        pager = soup.find(id='article-index')
        if pager:
            pager = pager.findAll('a')[1:]
            if pager:
                for a in pager:
                    nexturl = 'http://www.kuriergalicyjski.com' + a['href']
                    soup2 = self.index_to_soup(nexturl)
                    pagetext = soup2.find(attrs={'class':'item-page'})
                    if pagetext.h2:
                        pagetext.h2.extract()
                    r = pagetext.find(attrs={'class':'article-info'})
                    if r:
                        r.extract()
                    pos = len(appendtag.contents)
                    appendtag.insert(pos, pagetext)
                    pos = len(appendtag.contents)
            for r in appendtag.findAll(id='article-index'):
                r.extract()
            for r in appendtag.findAll(attrs={'class':'pagenavcounter'}):
                r.extract()
            for r in appendtag.findAll(attrs={'class':'pagination'}):
                r.extract()
            for r in appendtag.findAll(attrs={'class':'pagenav'}):
                r.extract()
            for r in appendtag.findAll(attrs={'style':'border-top-width: thin; border-top-style: dashed; border-top-color: #CCC; border-bottom-width: thin; border-bottom-style: dashed; border-bottom-color: #CCC; padding-top:5px; padding-bottom:5px; text-align:right; margin-top:10px; height:20px;'}):
                r.extract()

    def preprocess_html(self, soup):
        self.append_page(soup, soup.body)
        for r in soup.findAll(style=True):
            del r['style']
        for img in soup.findAll(attrs={'class':'easy_img_caption smartresize'}):
            img.insert(len(img.contents)-1, bs('<br />'))
            img.insert(len(img.contents), bs('<br /><br />'))
        for a in soup.findAll('a', href=True):
            if a['href'].startswith('/'):
                a['href'] = 'http://kuriergalicyjski.com' + a['href']
        return soup


@ -1,166 +1,94 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai

__author__  = 'Sylvain Durand <sylvain.durand@ponts.org>'
__license__ = 'GPL v3'

'''
Lemonde.fr: Version abonnée
'''

import time
from urllib2 import HTTPError
from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile

class LeMonde(BasicNewsRecipe):

    title = u'Le Monde: Édition abonnés'
    __author__ = 'Sylvain Durand'
    description = u'Disponible du lundi au samedi à partir de 14 heures environ, avec tous ses cahiers.'
    language = 'fr'
    encoding = 'utf8'

    needs_subscription = True

    date_url = 'http://www.lemonde.fr/journalelectronique/donnees/libre/%Y%m%d/index.html'
    login_url = 'http://www.lemonde.fr/web/journal_electronique/identification/1,56-0,45-0,0.html'
    journal_url = 'http://www.lemonde.fr/journalelectronique/donnees/protege/%Y%m%d/%Y%m%d_ipad.xml'
    masthead_url = 'http://upload.wikimedia.org/wikipedia/fr/thumb/c/c5/Le_Monde_logo.svg/300px-Le_Monde_logo.svg.png'
    couverture_url = 'http://medias.lemonde.fr/abonnes/editionelectronique/%Y%m%d/html/data/img/%y%m%d01.jpg'

    extra_css = '''
        img{max-width:100%}
        h1{font-size:1.2em !important; line-height:1.2em !important; }
        h2{font-size:1em !important; line-height:1em !important; }
        h3{font-size:1em !important; text-transform:uppercase !important; color:#666;}
        #photo{text-align:center !important; margin:10px 0 -8px;}
        #lgd{font-size:1em !important; line-height:1em !important; font-style:italic; color:#333;} '''

    def __init__(self, options, log, progress_reporter):
        BasicNewsRecipe.__init__(self, options, log, progress_reporter)
        br = BasicNewsRecipe.get_browser(self)
        second = time.time() + 24*60*60
        for i in range(7):
            self.date = time.gmtime(second)
            try:
                br.open(time.strftime(self.date_url, self.date))
                break
            except HTTPError:
                second -= 24*60*60
        self.timefmt = strftime(u" %A %d %B %Y", self.date).replace(u' 0', u' ')

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        br.open(self.login_url)
        br.select_form(nr=0)
        br['login'] = self.username
        br['password'] = self.password
        br.submit()
        return br

    def get_cover_url(self):
        url = time.strftime(self.couverture_url, self.date)
        return url

    def parse_index(self):
        url = time.strftime(self.journal_url, self.date)
        soup = self.index_to_soup(url).sommaire
        sections = []
        for sec in soup.findAll("section"):
            articles = []
            if sec['cahier'] != "Le Monde":
                for col in sec.findAll("fnts"):
                    col.extract()
            if sec['cahier'] == "Le Monde Magazine":
                continue
            for art in sec.findAll("art"):
                if art.txt.string and art.ttr.string:
                    if art.find(['url']):
                        art.insert(6, '<div id="photo"><img src="'+art.find(['url']).string+'" /></div>')
                    if art.find(['lgd']) and art.find(['lgd']).string:
                        art.insert(7, '<div id="lgd">'+art.find(['lgd']).string+'</div>')
                    article = "<html><head></head><body>"+unicode(art)+"</body></html>"
                    article = article.replace('<![CDATA[','').replace(']]>','').replace(' oC ','°C ')
                    article = article.replace('srttr>','h3>').replace('ssttr>','h2>').replace('ttr>','h1>')
                    f = PersistentTemporaryFile()
                    f.write(article)
                    articles.append({'title':art.ttr.string, 'url':"file:///"+f.name})
            sections.append((sec['nom'], articles))
        return sections

    def preprocess_html(self, soup):
        for lgd in soup.findAll(id="lgd"):
            lgd.contents[-1].extract()
        return soup
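In the Le Monde ``parse_index`` above, the chain of ``replace()`` calls converts the journal's private XML tag names into HTML headings. The order matters: the longer names ``srttr`` and ``ssttr`` must be rewritten before the bare ``ttr>`` pattern, or it would corrupt them. A minimal sketch with made-up content:

```python
# 'srttr>' and 'ssttr>' are replaced first; only then can the shorter
# 'ttr>' pattern safely match what remains
art = '<ttr>Titre</ttr><ssttr>Chapeau</ssttr><srttr>Intertitre</srttr>'
art = art.replace('srttr>', 'h3>').replace('ssttr>', 'h2>').replace('ttr>', 'h1>')
print(art)
# <h1>Titre</h1><h2>Chapeau</h2><h3>Intertitre</h3>
```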


@ -1,5 +1,5 @@
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
import re
from calibre.web.feeds.news import BasicNewsRecipe

class Mlody_technik(BasicNewsRecipe):
    title = u'Młody technik'
@ -9,7 +9,19 @@ class Mlody_technik(BasicNewsRecipe):
    language = 'pl'
    cover_url = 'http://science-everywhere.pl/wp-content/uploads/2011/10/mt12.jpg'
    no_stylesheets = True
    preprocess_regexps = [(re.compile(r"<h4>Podobne</h4>", re.IGNORECASE), lambda m: '')]
    oldest_article = 7
    max_articles_per_feed = 100
    remove_empty_feeds = True
    use_embedded_content = False
    keep_only_tags = [dict(id='content')]
    remove_tags = [dict(attrs={'class':'st-related-posts'})]
    remove_tags_after = dict(attrs={'class':'entry-content clearfix'})
    feeds = [(u'Wszystko', u'http://www.mt.com.pl/feed'),
             (u'MT NEWS 24/7', u'http://www.mt.com.pl/kategoria/mt-newsy-24-7/feed'),
             (u'Info zoom', u'http://www.mt.com.pl/kategoria/info-zoom/feed'),
             (u'm.technik', u'http://www.mt.com.pl/kategoria/m-technik/feed'),
             (u'Szkoła', u'http://www.mt.com.pl/kategoria/szkola-2/feed'),
             (u'Na Warsztacie', u'http://www.mt.com.pl/kategoria/na-warsztacie/feed'),
             (u'Z pasji do...', u'http://www.mt.com.pl/kategoria/z-pasji-do/feed'),
             (u'MT testuje', u'http://www.mt.com.pl/kategoria/mt-testuje/feed')]


@ -0,0 +1,47 @@
from calibre.web.feeds.news import BasicNewsRecipe
import re
class NaukawPolsce(BasicNewsRecipe):
title = u'Nauka w Polsce'
__author__ = 'fenuks'
description = u'Serwis Nauka w Polsce ma za zadanie popularyzację polskiej nauki. Można na nim znaleźć wiadomości takie jak: osiągnięcia polskich naukowców, wydarzenia na polskich uczelniach, osiągnięcia studentów, konkursy dla badaczy, staże i stypendia naukowe, wydarzenia w polskiej nauce, kalendarium wydarzeń w nauce, materiały wideo o nauce.'
category = 'science'
language = 'pl'
cover_url = 'http://www.naukawpolsce.pap.pl/Themes/Pap/images/logo-pl.gif'
oldest_article = 7
max_articles_per_feed = 100
no_stylesheets = True
remove_empty_feeds = True
index = 'http://www.naukawpolsce.pl'
keep_only_tags = [dict(name='div', attrs={'class':'margines wiadomosc'})]
remove_tags = [dict(name='div', attrs={'class':'tagi'})]
def find_articles(self, url):
articles = []
soup=self.index_to_soup(url)
for i in soup.findAll(name='div', attrs={'class':'aktualnosci-margines lista-depesz information-content'}):
title = i.h1.a.string
url = self.index + i.h1.a['href']
date = '' #i.span.string
articles.append({'title' : title,
'url' : url,
'date' : date,
'description' : ''
})
return articles
def parse_index(self):
feeds = []
feeds.append((u"Historia i kultura", self.find_articles('http://www.naukawpolsce.pl/historia-i-kultura/')))
feeds.append((u"Kosmos", self.find_articles('http://www.naukawpolsce.pl/kosmos/')))
feeds.append((u"Przyroda", self.find_articles('http://www.naukawpolsce.pl/przyroda/')))
feeds.append((u"Społeczeństwo", self.find_articles('http://www.naukawpolsce.pl/spoleczenstwo/')))
feeds.append((u"Technologie", self.find_articles('http://www.naukawpolsce.pl/technologie/')))
feeds.append((u"Uczelnie", self.find_articles('http://www.naukawpolsce.pl/uczelnie/')))
feeds.append((u"Nauki medyczne", self.find_articles('http://www.naukawpolsce.pl/zdrowie/')))
return feeds
def preprocess_html(self, soup):
for p in soup.findAll(name='p', text=re.compile('&nbsp;')):
p.extract()
return soup
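The preprocess_html above strips paragraphs whose text contains only a non-breaking space. A narrower, regex-only sketch of the same cleanup (the recipe itself uses BeautifulSoup's findAll with a text regex; the helper name here is hypothetical):

```python
import re

def drop_nbsp_paragraphs(html):
    # Remove paragraphs that contain only &nbsp; padding or whitespace
    return re.sub(r'<p>(?:&nbsp;|\s)*</p>', '', html)
```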


@@ -1,5 +1,5 @@
__license__ = 'GPL v3'
__copyright__ = '2008-2011, Darko Miletic <darko.miletic at gmail.com>'
__copyright__ = '2008-2013, Darko Miletic <darko.miletic at gmail.com>'
'''
newyorker.com
'''
@@ -44,20 +44,18 @@ class NewYorker(BasicNewsRecipe):
, 'language' : language
}
keep_only_tags = [
dict(name='div', attrs={'class':'headers'})
,dict(name='div', attrs={'id':['articleheads','items-container','articleRail','articletext','photocredits']})
]
keep_only_tags = [dict(name='div', attrs={'id':'pagebody'})]
remove_tags = [
dict(name=['meta','iframe','base','link','embed','object'])
,dict(attrs={'class':['utils','socialUtils','articleRailLinks','icons'] })
,dict(attrs={'class':['utils','socialUtils','articleRailLinks','icons','social-utils-top','entry-keywords','entry-categories','utilsPrintEmail'] })
,dict(attrs={'id':['show-header','show-footer'] })
]
remove_tags_after = dict(attrs={'class':'entry-content'})
remove_attributes = ['lang']
feeds = [(u'The New Yorker', u'http://www.newyorker.com/services/mrss/feeds/everything.xml')]
def print_version(self, url):
return url + '?printable=true'
return url + '?printable=true&currentPage=all'
def image_url_processor(self, baseurl, url):
return url.strip()

recipes/osworld_pl.recipe (new file, 33 lines)

@@ -0,0 +1,33 @@
from calibre.web.feeds.news import BasicNewsRecipe
class OSWorld(BasicNewsRecipe):
title = u'OSWorld.pl'
__author__ = 'fenuks'
description = u'OSWorld.pl to serwis internetowy, dzięki któremu poznasz czym naprawdę jest Open Source. Serwis poświęcony jest wolnemu oprogramowaniu jak linux mint, centos czy ubunty. Znajdziecie u nasz artykuły, unity oraz informacje o certyfikatach CACert. OSWorld to mały świat wielkich systemów!'
category = 'OS, IT, open source, Linux'
language = 'pl'
cover_url = 'http://osworld.pl/wp-content/uploads/osworld-kwadrat-128x111.png'
oldest_article = 7
max_articles_per_feed = 100
no_stylesheets = True
remove_empty_feeds = True
use_embedded_content = False
keep_only_tags = [dict(id=['dzial', 'posts'])]
remove_tags = [dict(attrs={'class':'post-comments'})]
remove_tags_after = dict(attrs={'class':'entry clr'})
feeds = [(u'Artyku\u0142y', u'http://osworld.pl/category/artykuly/feed/'), (u'Nowe wersje', u'http://osworld.pl/category/nowe-wersje/feed/')]
def append_page(self, soup, appendtag):
tag = appendtag.find(attrs={'id':'paginacja'})
if tag:
for nexturl in tag.findAll('a'):
soup2 = self.index_to_soup(nexturl['href'])
pagetext = soup2.find(attrs={'class':'entry clr'})
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
for r in appendtag.findAll(attrs={'id':'paginacja'}):
r.extract()
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
return soup


@@ -1,5 +1,4 @@
#!/usr/bin/env python
from calibre.web.feeds.recipes import BasicNewsRecipe
class PCLab(BasicNewsRecipe):
@@ -8,12 +7,13 @@ class PCLab(BasicNewsRecipe):
__author__ = 'ravcio - rlelusz[at]gmail.com'
description = u"Articles from PC Lab website"
language = 'pl'
oldest_article = 30.0
oldest_article = 30
max_articles_per_feed = 100
recursions = 0
encoding = 'iso-8859-2'
no_stylesheets = True
remove_javascript = True
remove_empty_feeds = True
use_embedded_content = False
keep_only_tags = [
@@ -21,50 +21,54 @@ class PCLab(BasicNewsRecipe):
]
remove_tags = [
dict(name='div', attrs={'class':['chapters']})
,dict(name='div', attrs={'id':['script_bxad_slot_display_list_bxad_slot']})
dict(name='div', attrs={'class':['toc first', 'toc', 'tags', 'recommendedarticles', 'name', 'zumi', 'chapters']})
]
remove_tags_after = [
dict(name='div', attrs={'class':['navigation']})
]
#links to RSS feeds
feeds = [ ('PCLab', u'http://pclab.pl/xml/artykuly.xml') ]
feeds = [
(u'Aktualności', 'http://pclab.pl/xml/aktualnosci.xml'),
(u'Artykuły', u'http://pclab.pl/xml/artykuly.xml'),
(u'Poradniki', 'http://pclab.pl/xml/poradniki.xml')
]
#load second and subsequent page content
# in: soup - full page with 'next' button
# out: appendtag - tag to which new page is to be added
def append_page(self, soup, appendtag):
# find the 'Next' button
pager = soup.find('div', attrs={'class':'next'})
pager = soup.find('div', attrs={'class':'navigation'})
if pager:
a = pager.find('a')
if 'news' in a['href']:
pager = None
else:
pager = pager.find('div', attrs={'class':'next'})
while pager:
#search for 'a' element with link to next page (exit if not found)
a = pager.find('a')
if a:
nexturl = a['href']
nexturl = a['href']
soup2 = self.index_to_soup('http://pclab.pl' + nexturl)
pager = soup2.find('div', attrs={'class':'next'})
pagetext = soup2.find('div', attrs={'class':'substance'})
pagetext = pagetext.find('div', attrs={'class':'data'})
soup2 = self.index_to_soup('http://pclab.pl/' + nexturl)
pagetext_substance = soup2.find('div', attrs={'class':'substance'})
pagetext = pagetext_substance.find('div', attrs={'class':'data'})
pagetext.extract()
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
pos = len(appendtag.contents)
self.append_page(soup2, appendtag)
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
pos = len(appendtag.contents)
pager = soup.find('div', attrs={'class':'navigation'})
if pager:
pager.extract()
def preprocess_html(self, soup):
# soup.body contains no title and no navigator, they are in soup
self.append_page(soup, soup.body)
for link in soup.findAll('a'):
href = link.get('href', None)
if href and href.startswith('/'):
link['href'] = 'http://pclab.pl' + href
# finally remove some tags
tags = soup.findAll('div',attrs={'class':['tags', 'index', 'script_bxad_slot_display_list_bxad_slot', 'index first', 'zumi', 'navigation']})
[tag.extract() for tag in tags]
#for r in soup.findAll('div', attrs={'class':['tags', 'index', 'script_bxad_slot_display_list_bxad_slot', 'index first', 'zumi', 'navigation']})
return soup
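The append_page comments above describe the multi-page pattern: find the "next" link, fetch that page, extract its body, and append it to the article. A self-contained sketch of that loop, with a hypothetical in-memory page map standing in for the recipe's index_to_soup() network calls, and regexes standing in for BeautifulSoup lookups:

```python
import re

# Hypothetical pages keyed by URL; the real recipe fetches these over HTTP
PAGES = {
    '/article?page=1': '<div class="data">part one</div>'
                       '<div class="next"><a href="/article?page=2">next</a></div>',
    '/article?page=2': '<div class="data">part two</div>',
}

def collect_pages(url):
    # Follow "next" links, accumulating each page's article body in order
    parts = []
    while url:
        html = PAGES[url]
        body = re.search(r'<div class="data">(.*?)</div>', html)
        parts.append(body.group(1))
        nxt = re.search(r'<div class="next"><a href="([^"]+)">', html)
        url = nxt.group(1) if nxt else None
    return parts
```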


@@ -5,11 +5,14 @@ class SpidersWeb(BasicNewsRecipe):
oldest_article = 7
__author__ = 'fenuks'
description = u''
cover_url = 'http://www.spidersweb.pl/wp-content/themes/spiderweb/img/Logo.jpg'
cover_url = 'http://www.spidersweb.pl/wp-content/themes/new_sw/images/spidersweb.png'
category = 'IT, WEB'
language = 'pl'
no_stylesheets = True
remove_javascript = True
use_embedded_content = False
max_articles_per_feed = 100
keep_only_tags=[dict(id='Post')]
remove_tags=[dict(name='div', attrs={'class':['Comments', 'Shows', 'Post-Tags']}), dict(id='Author-Column')]
keep_only_tags=[dict(id='start')]
remove_tags_after = dict(attrs={'class':'padding20'})
remove_tags=[dict(name='div', attrs={'class':['padding border-bottom', 'padding20', 'padding border-top']})]
feeds = [(u'Wpisy', u'http://www.spidersweb.pl/feed')]


@@ -0,0 +1,22 @@
import re
from calibre.web.feeds.news import BasicNewsRecipe
class UbuntuPomoc(BasicNewsRecipe):
title = u'Ubuntu-pomoc.org'
__author__ = 'fenuks'
description = u'Strona poświęcona systemowi Ubuntu Linux. Znajdziesz tutaj przydatne i sprawdzone poradniki oraz sposoby rozwiązywania wielu popularnych problemów. Ten blog rozwiąże każdy Twój problem - jeśli nie teraz, to wkrótce! :)'
category = 'Linux, Ubuntu, open source'
language = 'pl'
cover_url = 'http://www.ubuntu-pomoc.org/grafika/ubuntupomoc.png'
preprocess_regexps = [(re.compile(r'<div class="ciekawostka">.+', re.IGNORECASE|re.DOTALL), lambda m: '')]
oldest_article = 7
max_articles_per_feed = 100
no_stylesheets = True
remove_javascript = True
remove_empty_feeds = True
use_embedded_content = False
remove_attrs = ['style']
keep_only_tags = [dict(attrs={'class':'post'})]
remove_tags_after = dict(attrs={'class':'underEntry'})
remove_tags = [dict(attrs={'class':['underPostTitle', 'yarpp-related', 'underEntry', 'social', 'tags', 'commentlist', 'youtube_sc']}), dict(id=['wp_rp_first', 'commentReply'])]
feeds = [(u'Ca\u0142o\u015b\u0107', u'http://feeds.feedburner.com/Ubuntu-Pomoc'),
(u'Gry', u'http://feeds.feedburner.com/GryUbuntu-pomoc')]


@@ -10,89 +10,89 @@ from calibre.web.feeds.news import BasicNewsRecipe
import re
class Wprost(BasicNewsRecipe):
EDITION = 0
FIND_LAST_FULL_ISSUE = True
EXCLUDE_LOCKED = True
ICO_BLOCKED = 'http://www.wprost.pl/G/layout2/ico_blocked.png'
EDITION = 0
FIND_LAST_FULL_ISSUE = True
EXCLUDE_LOCKED = True
ICO_BLOCKED = 'http://www.wprost.pl/G/layout2/ico_blocked.png'
title = u'Wprost'
__author__ = 'matek09'
description = 'Weekly magazine'
encoding = 'ISO-8859-2'
no_stylesheets = True
language = 'pl'
remove_javascript = True
recursions = 0
remove_tags_before = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
remove_tags_after = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
'''
keep_only_tags =[]
keep_only_tags.append(dict(name = 'table', attrs = {'id' : 'title-table'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'div-header'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'div-content'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'def element-autor'}))
'''
title = u'Wprost'
__author__ = 'matek09'
description = 'Weekly magazine'
encoding = 'ISO-8859-2'
no_stylesheets = True
language = 'pl'
remove_javascript = True
recursions = 0
remove_tags_before = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
remove_tags_after = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
'''keep_only_tags =[]
keep_only_tags.append(dict(name = 'table', attrs = {'id' : 'title-table'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'div-header'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'div-content'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'def element-autor'}))'''
preprocess_regexps = [(re.compile(r'style="display: none;"'), lambda match: ''),
preprocess_regexps = [(re.compile(r'style="display: none;"'), lambda match: ''),
(re.compile(r'display: block;'), lambda match: ''),
(re.compile(r'\<td\>\<tr\>\<\/table\>'), lambda match: ''),
(re.compile(r'\<table .*?\>'), lambda match: ''),
(re.compile(r'\<tr>'), lambda match: ''),
(re.compile(r'\<td .*?\>'), lambda match: ''),
(re.compile(r'\<div id="footer"\>.*?\</footer\>'), lambda match: '')]
(re.compile(r'\<div id="footer"\>.*?\</footer\>'), lambda match: '')]
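Each entry in preprocess_regexps above is a (compiled pattern, replacement function) pair that calibre runs over the raw downloaded HTML in order. A minimal, simplified sketch of how such pairs are applied:

```python
import re

# Two of the pairs from the recipe above; calibre applies each in turn
PREPROCESS_REGEXPS = [
    (re.compile(r'style="display: none;"'), lambda m: ''),
    (re.compile(r'display: block;'), lambda m: ''),
]

def preprocess(raw):
    # Apply every (pattern, function) pair to the raw HTML, in order
    for pat, fn in PREPROCESS_REGEXPS:
        raw = pat.sub(fn, raw)
    return raw
```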
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def element-date'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def silver'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'content-main-column-right'}))
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def element-date'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def silver'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'content-main-column-right'}))
extra_css = '''
.div-header {font-size: x-small; font-weight: bold}
'''
#h2 {font-size: x-large; font-weight: bold}
def is_blocked(self, a):
if a.findNextSibling('img') is None:
return False
else:
return True
extra_css = '''.div-header {font-size: x-small; font-weight: bold}'''
#h2 {font-size: x-large; font-weight: bold}
def is_blocked(self, a):
if a.findNextSibling('img') is None:
return False
else:
return True
def find_last_issue(self):
soup = self.index_to_soup('http://www.wprost.pl/archiwum/')
a = 0
if self.FIND_LAST_FULL_ISSUE:
ico_blocked = soup.findAll('img', attrs={'src' : self.ICO_BLOCKED})
a = ico_blocked[-1].findNext('a', attrs={'title' : re.compile(r'Spis *', re.IGNORECASE | re.DOTALL)})
else:
a = soup.find('a', attrs={'title' : re.compile(r'Spis *', re.IGNORECASE | re.DOTALL)})
self.EDITION = a['href'].replace('/tygodnik/?I=', '')
self.EDITION_SHORT = a['href'].replace('/tygodnik/?I=15', '')
self.cover_url = a.img['src']
def find_last_issue(self):
soup = self.index_to_soup('http://www.wprost.pl/archiwum/')
a = 0
if self.FIND_LAST_FULL_ISSUE:
ico_blocked = soup.findAll('img', attrs={'src' : self.ICO_BLOCKED})
a = ico_blocked[-1].findNext('a', attrs={'title' : re.compile(r'Spis *', re.IGNORECASE | re.DOTALL)})
else:
a = soup.find('a', attrs={'title' : re.compile(r'Spis *', re.IGNORECASE | re.DOTALL)})
self.EDITION = a['href'].replace('/tygodnik/?I=', '')
self.EDITION_SHORT = a['href'].replace('/tygodnik/?I=15', '')
self.cover_url = a.img['src']
def parse_index(self):
self.find_last_issue()
soup = self.index_to_soup('http://www.wprost.pl/tygodnik/?I=' + self.EDITION)
feeds = []
headers = soup.findAll(attrs={'class':'block-header block-header-left mtop20 mbottom20'})
articles_list = soup.findAll(attrs={'class':'standard-box'})
for i in range(len(headers)):
articles = self.find_articles(articles_list[i])
if len(articles) > 0:
section = headers[i].find('a').string
feeds.append((section, articles))
return feeds
def parse_index(self):
self.find_last_issue()
soup = self.index_to_soup('http://www.wprost.pl/tygodnik/?I=' + self.EDITION)
feeds = []
for main_block in soup.findAll(attrs={'id': 'content-main-column-element-content'}):
articles = list(self.find_articles(main_block))
if len(articles) > 0:
section = self.tag_to_string(main_block.find('h3'))
feeds.append((section, articles))
return feeds
def find_articles(self, main_block):
for a in main_block.findAll('a'):
if a.name in "td":
break
if self.EXCLUDE_LOCKED & self.is_blocked(a):
continue
yield {
'title' : self.tag_to_string(a),
'url' : 'http://www.wprost.pl' + a['href'],
'date' : '',
'description' : ''
}
def find_articles(self, main_block):
articles = []
for a in main_block.findAll('a'):
if a.name in "td":
break
if self.EXCLUDE_LOCKED and self.is_blocked(a):
continue
articles.append({
'title' : self.tag_to_string(a),
'url' : 'http://www.wprost.pl' + a['href'],
'date' : '',
'description' : ''
})
return articles

recipes/wprost_rss.recipe (new file, 71 lines)

@@ -0,0 +1,71 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2010, matek09, matek09@gmail.com'
__copyright__ = 'Modified 2011, Mariusz Wolek <mariusz_dot_wolek @ gmail dot com>'
__copyright__ = 'Modified 2012, Artur Stachecki <artur.stachecki@gmail.com>'
from calibre.web.feeds.news import BasicNewsRecipe
import re
class Wprost(BasicNewsRecipe):
title = u'Wprost (RSS)'
__author__ = 'matek09'
description = 'Weekly magazine'
encoding = 'ISO-8859-2'
no_stylesheets = True
language = 'pl'
remove_javascript = True
recursions = 0
use_embedded_content = False
remove_empty_feeds = True
remove_tags_before = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
remove_tags_after = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
'''
keep_only_tags =[]
keep_only_tags.append(dict(name = 'table', attrs = {'id' : 'title-table'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'div-header'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'div-content'}))
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'def element-autor'}))
'''
preprocess_regexps = [(re.compile(r'style="display: none;"'), lambda match: ''),
(re.compile(r'display: block;'), lambda match: ''),
(re.compile(r'\<td\>\<tr\>\<\/table\>'), lambda match: ''),
(re.compile(r'\<table .*?\>'), lambda match: ''),
(re.compile(r'\<tr>'), lambda match: ''),
(re.compile(r'\<td .*?\>'), lambda match: ''),
(re.compile(r'\<div id="footer"\>.*?\</footer\>'), lambda match: '')]
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def element-date'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def silver'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'content-main-column-right'}))
extra_css = '''.div-header {font-size: x-small; font-weight: bold}'''
#h2 {font-size: x-large; font-weight: bold}
feeds = [(u'Tylko u nas', u'http://www.wprost.pl/rss/rss_wprostextra.php'),
(u'Wydarzenia', u'http://www.wprost.pl/rss/rss.php'),
(u'Komentarze', u'http://www.wprost.pl/rss/rss_komentarze.php'),
(u'Wydarzenia: Kraj', u'http://www.wprost.pl/rss/rss_kraj.php'),
(u'Komentarze: Kraj', u'http://www.wprost.pl/rss/rss_komentarze_kraj.php'),
(u'Wydarzenia: Świat', u'http://www.wprost.pl/rss/rss_swiat.php'),
(u'Komentarze: Świat', u'http://www.wprost.pl/rss/rss_komentarze_swiat.php'),
(u'Wydarzenia: Gospodarka', u'http://www.wprost.pl/rss/rss_gospodarka.php'),
(u'Komentarze: Gospodarka', u'http://www.wprost.pl/rss/rss_komentarze_gospodarka.php'),
(u'Wydarzenia: Życie', u'http://www.wprost.pl/rss/rss_zycie.php'),
(u'Komentarze: Życie', u'http://www.wprost.pl/rss/rss_komentarze_zycie.php'),
(u'Wydarzenia: Sport', u'http://www.wprost.pl/rss/rss_sport.php'),
(u'Komentarze: Sport', u'http://www.wprost.pl/rss/rss_komentarze_sport.php'),
(u'Przegląd prasy', u'http://www.wprost.pl/rss/rss_prasa.php')
]
def get_cover_url(self):
soup = self.index_to_soup('http://www.wprost.pl/tygodnik')
cover = soup.find(attrs={'class':'wprost-cover'})
if cover:
self.cover_url = cover['src']
return getattr(self, 'cover_url', self.cover_url)


@@ -1,5 +1,5 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" version="XHTML 1.1" xml:lang="en">
<head>
<title>calibre library</title>


@@ -464,12 +464,15 @@ server_listen_on = '0.0.0.0'
# on at your own risk!
unified_title_toolbar_on_osx = False
#: Save original file when converting from same format to same format
#: Save original file when converting/polishing from same format to same format
# When calibre does a conversion from the same format to the same format, for
# example, from EPUB to EPUB, the original file is saved, so that in case the
# conversion is poor, you can tweak the settings and run it again. By setting
# this to False you can prevent calibre from saving the original file.
# Similarly, by setting save_original_format_when_polishing to False you can
# prevent calibre from saving the original file when polishing.
save_original_format = True
save_original_format_when_polishing = True
#: Number of recently viewed books to show
# Right-clicking the View button shows a list of recently viewed books. Control


@@ -4,7 +4,7 @@ __license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en'
__appname__ = u'calibre'
numeric_version = (0, 9, 18)
numeric_version = (0, 9, 19)
__version__ = u'.'.join(map(unicode, numeric_version))
__author__ = u"Kovid Goyal <kovid@kovidgoyal.net>"


@@ -209,8 +209,9 @@ class ALURATEK_COLOR(USBMS):
EBOOK_DIR_MAIN = EBOOK_DIR_CARD_A = 'books'
VENDOR_NAME = ['USB_2.0', 'EZREADER']
WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = ['USB_FLASH_DRIVER', '.']
VENDOR_NAME = ['USB_2.0', 'EZREADER', 'C4+']
WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = ['USB_FLASH_DRIVER', '.', 'TOUCH']
SCAN_FROM_ROOT = True
class TREKSTOR(USBMS):
@@ -225,6 +226,7 @@ class TREKSTOR(USBMS):
VENDOR_ID = [0x1e68]
PRODUCT_ID = [0x0041, 0x0042, 0x0052, 0x004e, 0x0056,
0x0067, # This is for the Pyrus Mini
0x003e, # This is for the EBOOK_PLAYER_5M https://bugs.launchpad.net/bugs/792091
0x5cL, # This is for the 4ink http://www.mobileread.com/forums/showthread.php?t=191318
]
@@ -234,7 +236,7 @@ class TREKSTOR(USBMS):
VENDOR_NAME = 'TREKSTOR'
WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = ['EBOOK_PLAYER_7',
'EBOOK_PLAYER_5M', 'EBOOK-READER_3.0', 'EREADER_PYRUS']
'EBOOK_PLAYER_5M', 'EBOOK-READER_3.0', 'EREADER_PYRUS', 'PYRUS_MINI']
SUPPORTS_SUB_DIRS = True
SUPPORTS_SUB_DIRS_DEFAULT = False


@@ -80,7 +80,7 @@ class EPUBInput(InputFormatPlugin):
guide_cover, guide_elem = None, None
for guide_elem in opf.iterguide():
if guide_elem.get('type', '').lower() == 'cover':
guide_cover = guide_elem.get('href', '')
guide_cover = guide_elem.get('href', '').partition('#')[0]
break
if not guide_cover:
return
@@ -103,6 +103,12 @@
if not self.for_viewer:
spine[0].getparent().remove(spine[0])
removed = guide_cover
else:
# Ensure the cover is displayed as the first item in the book, some
# epub files have it set with linear='no' which causes the cover to
# display in the end
spine[0].attrib.pop('linear', None)
opf.spine[0].is_linear = True
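The hunk above fixes EPUBs whose cover itemref is marked linear='no' in the spine, which makes some readers display the cover at the end of the book. A sketch of the attribute removal using the standard library's ElementTree (calibre itself operates on lxml trees, which share the same attrib API):

```python
import xml.etree.ElementTree as ET

# A hypothetical spine fragment with the cover wrongly marked linear='no'
spine = ET.fromstring(
    '<spine><itemref idref="cover" linear="no"/>'
    '<itemref idref="ch1"/></spine>')

# Drop the linear attribute so the cover renders first; pop() with a
# default is safe even when the attribute is absent
spine[0].attrib.pop('linear', None)
```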
guide_elem.set('href', 'calibre_raster_cover.jpg')
from calibre.ebooks.oeb.base import OPF
t = etree.SubElement(elem[0].getparent(), OPF('item'),


@@ -82,8 +82,8 @@ class OEBOutput(OutputFormatPlugin):
self.log.warn('The cover image has an id != "cover". Renaming'
' to work around bug in Nook Color')
import uuid
newid = str(uuid.uuid4())
from calibre.ebooks.oeb.base import uuid_id
newid = uuid_id()
for item in manifest_items_with_id('cover'):
item.set('id', newid)


@@ -7,13 +7,14 @@ __license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import os, logging, sys, hashlib, uuid
import os, logging, sys, hashlib, uuid, re
from io import BytesIO
from urllib import unquote as urlunquote, quote as urlquote
from urlparse import urlparse
from lxml import etree
from calibre import guess_type, CurrentDir
from calibre import guess_type as _guess_type, CurrentDir
from calibre.customize.ui import (plugin_for_input_format,
plugin_for_output_format)
from calibre.ebooks.chardet import xml_to_unicode
@@ -34,7 +35,10 @@ from calibre.utils.zipfile import ZipFile
exists, join, relpath = os.path.exists, os.path.join, os.path.relpath
OEB_FONTS = {guess_type('a.ttf')[0], guess_type('b.ttf')[0]}
def guess_type(x):
return _guess_type(x)[0] or 'application/octet-stream'
OEB_FONTS = {guess_type('a.ttf'), guess_type('b.ttf')}
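The guess_type wrapper added above prevents None mime types from leaking into the mime map by falling back to a generic binary type. An equivalent sketch over the standard library's mimetypes module (which backs calibre's guess_type):

```python
import mimetypes

def guess_type(name):
    # mimetypes returns (type, encoding); fall back to a generic binary
    # type when the extension is unknown, as the wrapper above does
    return mimetypes.guess_type(name)[0] or 'application/octet-stream'
```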
OPF_NAMESPACES = {'opf':OPF2_NS, 'dc':DC11_NS}
class Container(object):
@@ -76,12 +80,12 @@ class Container(object):
path = join(dirpath, f)
name = self.abspath_to_name(path)
self.name_path_map[name] = path
self.mime_map[name] = guess_type(path)[0]
self.mime_map[name] = guess_type(path)
# Special case if we have stumbled onto the opf
if path == opfpath:
self.opf_name = name
self.opf_dir = os.path.dirname(path)
self.mime_map[name] = guess_type('a.opf')[0]
self.mime_map[name] = guess_type('a.opf')
if not hasattr(self, 'opf_name'):
raise InvalidBook('Book has no OPF file')
@@ -205,7 +209,7 @@ class Container(object):
def parsed(self, name):
ans = self.parsed_cache.get(name, None)
if ans is None:
mime = self.mime_map.get(name, guess_type(name)[0])
mime = self.mime_map.get(name, guess_type(name))
ans = self.parse(self.name_path_map[name], mime)
self.parsed_cache[name] = ans
return ans
@@ -214,6 +218,13 @@ class Container(object):
def opf(self):
return self.parsed(self.opf_name)
@property
def mi(self):
from calibre.ebooks.metadata.opf2 import OPF as O
mi = self.serialize_item(self.opf_name)
return O(BytesIO(mi), basedir=self.opf_dir, unquote_urls=False,
populate_spine=False).to_book_metadata()
@property
def manifest_id_map(self):
return {item.get('id'):self.href_to_name(item.get('href'), self.opf_name)
@@ -333,7 +344,7 @@ class Container(object):
name. Ensures uniqueness of href and id automatically. Returns
generated item.'''
id_prefix = id_prefix or 'id'
media_type = media_type or guess_type(name)[0]
media_type = media_type or guess_type(name)
href = self.name_to_href(name, self.opf_name)
base, ext = href.rpartition('.')[0::2]
all_ids = {x.get('id') for x in self.opf_xpath('//*[@id]')}
@@ -353,7 +364,7 @@ class Container(object):
c += 1
href = '%s_%d.%s'%(base, c, ext)
manifest = self.opf_xpath('//opf:manifest')[0]
item = manifest.makeelement(OPF('item'), nsmap=OPF_NAMESPACES,
item = manifest.makeelement(OPF('item'),
id=item_id, href=href)
item.set('media-type', media_type)
self.insert_into_xml(manifest, item)
@@ -363,10 +374,36 @@ class Container(object):
self.mime_map[name] = media_type
return item
def commit_item(self, name):
self.dirtied.remove(name)
data = self.parsed_cache.pop(name)
def format_opf(self):
mdata = self.opf_xpath('//opf:metadata')[0]
mdata.text = '\n '
remove = set()
for child in mdata:
child.tail = '\n '
if (child.get('name', '').startswith('calibre:') and
child.get('content', '').strip() in {'{}', ''}):
remove.add(child)
for child in remove: mdata.remove(child)
if len(mdata) > 0:
mdata[-1].tail = '\n '
def serialize_item(self, name):
data = self.parsed(name)
if name == self.opf_name:
self.format_opf()
data = serialize(data, self.mime_map[name])
if name == self.opf_name:
# Needed as I can't get lxml to output opf:role and
# not output <opf:metadata> as well
data = re.sub(br'(<[/]{0,1})opf:', r'\1', data)
return data
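The re.sub workaround in serialize_item above strips the opf: prefix from element tags because lxml cannot be made to emit opf:role without also emitting <opf:metadata>. A bytes-in, bytes-out sketch of that substitution (the original pairs a bytes pattern with a str replacement, which only works on Python 2; the helper name here is hypothetical):

```python
import re

def strip_opf_prefix(data):
    # Rewrite both <opf:tag and </opf:tag to <tag and </tag, capturing
    # the leading "<" or "</" and dropping only the namespace prefix
    return re.sub(br'(<[/]{0,1})opf:', br'\1', data)
```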
def commit_item(self, name):
if name not in self.parsed_cache:
return
data = self.serialize_item(name)
self.dirtied.remove(name)
self.parsed_cache.pop(name)
with open(self.name_path_map[name], 'wb') as f:
f.write(data)
@@ -442,7 +479,7 @@ class EpubContainer(Container):
self.container = etree.fromstring(open(container_path, 'rb').read())
opf_files = self.container.xpath((
r'child::ocf:rootfiles/ocf:rootfile'
'[@media-type="%s" and @full-path]'%guess_type('a.opf')[0]
'[@media-type="%s" and @full-path]'%guess_type('a.opf')
), namespaces={'ocf':OCF_NS}
)
if not opf_files:
@@ -521,7 +558,7 @@ class EpubContainer(Container):
outpath = self.pathtoepub
from calibre.ebooks.tweak import zip_rebuilder
with open(join(self.root, 'mimetype'), 'wb') as f:
f.write(guess_type('a.epub')[0])
f.write(guess_type('a.epub'))
zip_rebuilder(self.root, outpath)
# }}}


@@ -11,6 +11,7 @@ import shutil, re, os
from calibre.ebooks.oeb.base import OPF, OEB_DOCS, XPath, XLINK, xml2text
from calibre.ebooks.oeb.polish.replace import replace_links
from calibre.utils.magick.draw import identify
def set_azw3_cover(container, cover_path, report):
name = None
@@ -144,6 +145,10 @@ def create_epub_cover(container, cover_path):
templ = CoverManager.NONSVG_TEMPLATE.replace('__style__', style)
else:
width, height = 600, 800
try:
width, height = identify(cover_path)[:2]
except:
container.log.exception("Failed to get width and height of cover")
ar = 'xMidYMid meet' if keep_aspect else 'none'
templ = CoverManager.SVG_TEMPLATE.replace('__ar__', ar)
templ = templ.replace('__viewbox__', '0 0 %d %d'%(width, height))
@@ -157,6 +162,17 @@ def create_epub_cover(container, cover_path):
with container.open(titlepage, 'wb') as f:
f.write(raw)
# We have to make sure the raster cover item has id="cover" for the moron
# that wrote the Nook firmware
if raster_cover_item.get('id') != 'cover':
from calibre.ebooks.oeb.base import uuid_id
newid = uuid_id()
for item in container.opf_xpath('//*[@id="cover"]'):
item.set('id', newid)
for item in container.opf_xpath('//*[@idref="cover"]'):
item.set('idref', newid)
raster_cover_item.set('id', 'cover')
spine = container.opf_xpath('//opf:spine')[0]
ref = spine.makeelement(OPF('itemref'), idref=titlepage_item.get('id'))
container.insert_into_xml(spine, ref, index=0)
@@ -171,11 +187,20 @@ def create_epub_cover(container, cover_path):
return raster_cover, titlepage
def remove_cover_image_in_page(container, page, cover_images):
for img in container.parsed(page).xpath('//*[local-name()="img" and @src]'):
href = img.get('src')
name = container.href_to_name(href, page)
if name in cover_images:
img.getparent().remove(img)
break
def set_epub_cover(container, cover_path, report):
cover_image = find_cover_image(container)
cover_page = find_cover_page(container)
wrapped_image = extra_cover_page = None
updated = False
log = container.log
possible_removals = set(clean_opf(container))
possible_removals
@@ -190,6 +215,7 @@ def set_epub_cover(container, cover_path, report):
cover_page = candidate
if cover_page is not None:
log('Found existing cover page')
wrapped_image = find_cover_image_in_page(container, cover_page)
if len(spine_items) > 1:
@@ -198,15 +224,22 @@ def set_epub_cover(container, cover_path, report):
if c != cover_page:
candidate = find_cover_image_in_page(container, c)
if candidate and candidate in {wrapped_image, cover_image}:
log('Found an extra cover page that is a simple wrapper, removing it')
# This page has only a single image and that image is the
# cover image, remove it.
container.remove_item(c)
extra_cover_page = c
spine_items = spine_items[:1] + spine_items[2:]
elif candidate is None:
# Remove the cover image if it is the first image in this
# page
remove_cover_image_in_page(container, c, {wrapped_image,
cover_image})
if wrapped_image is not None:
# The cover page is a simple wrapper around a single cover image,
# we can remove it safely.
log('Existing cover page is a simple wrapper, removing it')
container.remove_item(cover_page)
container.remove_item(wrapped_image)
updated = True


@@ -0,0 +1,75 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
from calibre.customize.ui import output_profiles
from calibre.ebooks.conversion.config import load_defaults
from calibre.ebooks.oeb.base import XPath, OPF
from calibre.ebooks.oeb.polish.cover import find_cover_page
from calibre.ebooks.oeb.transforms.jacket import render_jacket as render
def render_jacket(mi):
ps = load_defaults('page_setup')
op = ps.get('output_profile', 'default')
opmap = {x.short_name:x for x in output_profiles()}
output_profile = opmap.get(op, opmap['default'])
return render(mi, output_profile)
def is_legacy_jacket(root):
return len(root.xpath(
'//*[starts-with(@class,"calibrerescale") and (local-name()="h1" or local-name()="h2")]')) > 0
def is_current_jacket(root):
return len(XPath(
'//h:meta[@name="calibre-content" and @content="jacket"]')(root)) > 0
def find_existing_jacket(container):
for item in container.spine_items:
name = container.abspath_to_name(item)
if container.book_type == 'azw3':
root = container.parsed(name)
if is_current_jacket(root):
return name
else:
if name.rpartition('/')[-1].startswith('jacket') and name.endswith('.xhtml'):
root = container.parsed(name)
if is_current_jacket(root) or is_legacy_jacket(root):
return name
def replace_jacket(container, name):
root = render_jacket(container.mi)
container.parsed_cache[name] = root
container.dirty(name)
def remove_jacket(container):
name = find_existing_jacket(container)
if name is not None:
container.remove_item(name)
return True
return False
def add_or_replace_jacket(container):
name = find_existing_jacket(container)
found = True
if name is None:
jacket_item = container.generate_item('jacket.xhtml', id_prefix='jacket')
name = container.href_to_name(jacket_item.get('href'), container.opf_name)
found = False
replace_jacket(container, name)
if not found:
# Insert new jacket into spine
index = 0
sp = container.abspath_to_name(container.spine_items.next())
if sp == find_cover_page(container):
index = 1
itemref = container.opf.makeelement(OPF('itemref'),
idref=jacket_item.get('id'))
container.insert_into_xml(container.opf_xpath('//opf:spine')[0], itemref,
index=index)
return found


@@ -15,12 +15,16 @@ from calibre.ebooks.oeb.polish.container import get_container
from calibre.ebooks.oeb.polish.stats import StatsCollector
from calibre.ebooks.oeb.polish.subset import subset_all_fonts
from calibre.ebooks.oeb.polish.cover import set_cover
from calibre.ebooks.oeb.polish.jacket import (
replace_jacket, add_or_replace_jacket, find_existing_jacket, remove_jacket)
from calibre.utils.logging import Log
ALL_OPTS = {
'subset': False,
'opf': None,
'cover': None,
'jacket': False,
'remove_jacket':False,
}
SUPPORTED = {'EPUB', 'AZW3'}
@@ -38,8 +42,8 @@ changes needed for the desired effect.</p>
<p>You should use this tool as the last step in your ebook creation process.</p>
<p>Note that polishing only works on files in the <b>%s</b> formats.</p>
''')%_(' or ').join(SUPPORTED),
<p>Note that polishing only works on files in the %s formats.</p>
''')%_(' or ').join('<b>%s</b>'%x for x in SUPPORTED),
'subset': _('''\
<p>Subsetting fonts means reducing an embedded font to contain
@ -59,6 +63,15 @@ characters or completely removed.</p>
date you decide to add more text to your books, the newly added
text might not be covered by the subset font.</p>
'''),
'jacket': _('''\
<p>Insert a "book jacket" page at the start of the book that contains
all the book metadata such as title, tags, authors, series, comments,
etc.</p>'''),
'remove_jacket': _('''\
<p>Remove a previously inserted book jacket page.
'''),
}
def hfix(name, raw):
@ -92,29 +105,54 @@ def polish(file_map, opts, log, report):
rt = lambda x: report('\n### ' + x)
st = time.time()
for inbook, outbook in file_map.iteritems():
report('Polishing: %s'%(inbook.rpartition('.')[-1].upper()))
report(_('## Polishing: %s')%(inbook.rpartition('.')[-1].upper()))
ebook = get_container(inbook, log)
jacket = None
if opts.subset:
stats = StatsCollector(ebook)
if opts.opf:
rt('Updating metadata')
rt(_('Updating metadata'))
update_metadata(ebook, opts.opf)
report('Metadata updated\n')
jacket = find_existing_jacket(ebook)
if jacket is not None:
replace_jacket(ebook, jacket)
report(_('Updated metadata jacket'))
report(_('Metadata updated\n'))
if opts.subset:
rt('Subsetting embedded fonts')
rt(_('Subsetting embedded fonts'))
subset_all_fonts(ebook, stats.font_stats, report)
report('')
if opts.cover:
rt('Setting cover')
rt(_('Setting cover'))
set_cover(ebook, opts.cover, report)
report('')
if opts.jacket:
rt(_('Inserting metadata jacket'))
if jacket is None:
if add_or_replace_jacket(ebook):
report(_('Existing metadata jacket replaced'))
else:
report(_('Metadata jacket inserted'))
else:
report(_('Existing metadata jacket replaced'))
report('')
if opts.remove_jacket:
rt(_('Removing metadata jacket'))
if remove_jacket(ebook):
report(_('Metadata jacket removed'))
else:
report(_('No metadata jacket found'))
report('')
ebook.commit(outbook)
report('Polishing took: %.1f seconds'%(time.time()-st))
report('-'*70)
report(_('Polishing took: %.1f seconds')%(time.time()-st))
REPORT = '{0} REPORT {0}'.format('-'*30)
@ -126,7 +164,7 @@ def gui_polish(data):
file_map = {x:x for x in files}
opts = ALL_OPTS.copy()
opts.update(data)
O = namedtuple('Options', ' '.join(data.iterkeys()))
O = namedtuple('Options', ' '.join(ALL_OPTS.iterkeys()))
opts = O(**opts)
log = Log(level=Log.DEBUG)
report = []
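The namedtuple fix in this hunk matters because the GUI may send only a subset of the options; building the field list from ALL_OPTS instead of the incoming data keeps every default field present. A standalone sketch of the failure mode and the fix (option names taken from the diff, helper name `make_opts` is illustrative only):

```python
from collections import namedtuple

ALL_OPTS = {'subset': False, 'opf': None, 'cover': None,
            'jacket': False, 'remove_jacket': False}

def make_opts(data):
    # Merge the caller's partial settings over the defaults.
    opts = ALL_OPTS.copy()
    opts.update(data)
    # Fields must come from ALL_OPTS: building them from data alone
    # would make O(**opts) fail with unexpected keyword arguments.
    O = namedtuple('Options', ' '.join(ALL_OPTS))
    return O(**opts)

opts = make_opts({'subset': True})
print(opts.subset, opts.jacket)  # True False
```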
@ -135,6 +173,7 @@ def gui_polish(data):
log(REPORT)
for msg in report:
log(msg)
return '\n\n'.join(report)
def option_parser():
from calibre.utils.config import OptionParser
@ -149,6 +188,8 @@ def option_parser():
'If no cover is present, or the cover is not properly identified, inserts a new cover.'))
a('--opf', '-o', help=_(
'Path to an OPF file. The metadata in the book is updated from the OPF file.'))
o('--jacket', '-j', help=CLI_HELP['jacket'])
o('--remove-jacket', help=CLI_HELP['remove_jacket'])
o('--verbose', help=_('Produce more verbose output, useful for debugging.'))

View File

@ -11,7 +11,7 @@ from urlparse import urlparse
from cssutils import replaceUrls
from calibre import guess_type
from calibre.ebooks.oeb.polish.container import guess_type
from calibre.ebooks.oeb.base import (OEB_DOCS, OEB_STYLES, rewrite_links)
class LinkReplacer(object):
@ -41,7 +41,7 @@ class LinkReplacer(object):
return href
def replace_links(container, link_map, frag_map=lambda name, frag:frag):
ncx_type = guess_type('toc.ncx')[0]
ncx_type = guess_type('toc.ncx')
for name, media_type in container.mime_map.iteritems():
repl = LinkReplacer(name, container, link_map, frag_map)
if media_type.lower() in OEB_DOCS:

View File

@ -54,7 +54,10 @@ def subset_all_fonts(container, font_stats, report):
olen = sum(old_sizes.itervalues())
nlen = sum(new_sizes.itervalues())
total_new += len(nraw)
report('Decreased the font %s to %.1f%% of its original size'%
if nlen == olen:
report('The font %s was already subset'%font_name)
else:
report('Decreased the font %s to %.1f%% of its original size'%
(font_name, nlen/olen * 100))
f.seek(0), f.truncate(), f.write(nraw)

View File

@ -146,7 +146,7 @@ class MergeMetadata(object):
return item.id
self.remove_old_cover(item)
elif not cdata:
id = self.oeb.manifest.generate(id='cover')
id = self.oeb.manifest.generate(id='cover')[0]
self.oeb.manifest.add(id, old_cover.href, 'image/jpeg')
return id
if cdata:

View File

@ -117,8 +117,7 @@ class Split(object):
continue
page_breaks = list(page_breaks)
page_breaks.sort(cmp=
lambda x,y : cmp(int(x.get('pb_order')), int(y.get('pb_order'))))
page_breaks.sort(key=lambda x:int(x.get('pb_order')))
page_break_ids, page_breaks_ = [], []
for i, x in enumerate(page_breaks):
x.set('id', x.get('id', 'calibre_pb_%d'%i))
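In isolation, the cmp-to-key change in the hunk above is the standard (and Python-3-compatible) idiom for sorting by a numeric attribute; a minimal stand-in for the lxml elements:

```python
# Minimal stand-in for lxml elements carrying a string 'pb_order' attribute.
class FakeElem:
    def __init__(self, order):
        self.attrib = {'pb_order': order}
    def get(self, name):
        return self.attrib[name]

page_breaks = [FakeElem('10'), FakeElem('2'), FakeElem('1')]
# Sort numerically, not lexically ('10' < '2' as strings).
page_breaks.sort(key=lambda x: int(x.get('pb_order')))
print([e.get('pb_order') for e in page_breaks])  # ['1', '2', '10']
```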
@ -235,7 +234,8 @@ class FlowSplitter(object):
for pattern, before in ordered_ids:
elem = pattern(tree)
if elem:
self.log.debug('\t\tSplitting on page-break')
self.log.debug('\t\tSplitting on page-break at %s'%
elem[0].get('id'))
before, after = self.do_split(tree, elem[0], before)
self.trees.append(before)
tree = after
@ -292,14 +292,9 @@ class FlowSplitter(object):
return npath
def do_split(self, tree, split_point, before):
'''
Split ``tree`` into a *before* and *after* tree at ``split_point``,
preserving tag structure, but not duplicating any text.
All tags that have had their text and tail
removed have the attribute ``calibre_split`` set to 1.
Split ``tree`` into a *before* and *after* tree at ``split_point``.
:param before: If True tree is split before split_point, otherwise after split_point
:return: before_tree, after_tree
@ -315,8 +310,9 @@ class FlowSplitter(object):
def nix_element(elem, top=True):
# Remove elem unless top is False in which case replace elem by its
# children
parent = elem.getparent()
index = parent.index(elem)
if top:
parent.remove(elem)
else:
@ -325,27 +321,38 @@ class FlowSplitter(object):
# Tree 1
hit_split_point = False
for elem in list(body.iterdescendants()):
keep_descendants = False
split_point_descendants = frozenset(split_point.iterdescendants())
for elem in tuple(body.iterdescendants()):
if elem is split_point:
hit_split_point = True
if before:
nix_element(elem)
else:
# We want to keep the descendants of the split point in
# Tree 1
keep_descendants = True
continue
if hit_split_point:
if keep_descendants:
if elem in split_point_descendants:
# elem is a descendant, keep it
continue
else:
# We are out of split_point, so prevent further set
# lookups of split_point_descendants
keep_descendants = False
nix_element(elem)
# Tree 2
hit_split_point = False
for elem in list(body2.iterdescendants()):
for elem in tuple(body2.iterdescendants()):
if elem is split_point2:
hit_split_point = True
if not before:
nix_element(elem, top=False)
continue
if not hit_split_point:
nix_element(elem, top=False)
nix_element(elem)
break
nix_element(elem, top=False)
body2.text = '\n'
return tree, tree2
@ -478,8 +485,7 @@ class FlowSplitter(object):
def commit(self):
'''
Commit all changes caused by the split. This removes the previously
introduced ``calibre_split`` attribute and calculates an *anchor_map* for
Commit all changes caused by the split. Calculates an *anchor_map* for
all anchors in the original tree. Internal links are re-directed. The
original file is deleted and the split files are saved.
'''

View File

@ -7,30 +7,33 @@ __license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import os, weakref, shutil
import os, weakref, shutil, textwrap
from collections import OrderedDict
from functools import partial
from PyQt4.Qt import (QDialog, QGridLayout, QIcon, QCheckBox, QLabel, QFrame,
QApplication, QDialogButtonBox, Qt, QSize, QSpacerItem,
QSizePolicy, QTimer)
QSizePolicy, QTimer, QModelIndex, QTextEdit,
QInputDialog, QMenu)
from calibre.gui2 import error_dialog, Dispatcher
from calibre.gui2 import error_dialog, Dispatcher, gprefs
from calibre.gui2.actions import InterfaceAction
from calibre.gui2.convert.metadata import create_opf_file
from calibre.gui2.dialogs.progress import ProgressDialog
from calibre.ptempfile import PersistentTemporaryDirectory
from calibre.utils.config_base import tweaks
class Polish(QDialog):
class Polish(QDialog): # {{{
def __init__(self, db, book_id_map, parent=None):
from calibre.ebooks.oeb.polish.main import HELP
QDialog.__init__(self, parent)
self.db, self.book_id_map = weakref.ref(db), book_id_map
self.setWindowIcon(QIcon(I('polish.png')))
self.setWindowTitle(ngettext(
'Polish book', _('Polish %d books')%len(book_id_map), len(book_id_map)))
title = _('Polish book')
if len(book_id_map) > 1:
title = _('Polish %d books')%len(book_id_map)
self.setWindowTitle(title)
self.help_text = {
'polish': _('<h3>About Polishing books</h3>%s')%HELP['about'],
@ -40,9 +43,13 @@ class Polish(QDialog):
'metadata':_('<h3>Updating metadata</h3>'
'<p>This will update all metadata and covers in the'
' ebook files to match the current metadata in the'
' calibre library.</p><p>Note that most ebook'
' calibre library.</p><p>If the ebook file does not have'
' an identifiable cover, a new cover is inserted.</p>'
' <p>Note that most ebook'
' formats are not capable of supporting all the'
' metadata in calibre.</p>'),
'jacket':_('<h3>Book Jacket</h3>%s')%HELP['jacket'],
'remove_jacket':_('<h3>Remove Book Jacket</h3>%s')%HELP['remove_jacket'],
}
self.l = l = QGridLayout()
@ -52,13 +59,18 @@ class Polish(QDialog):
l.addWidget(la, 0, 0, 1, 2)
count = 0
self.actions = OrderedDict([
self.all_actions = OrderedDict([
('subset', _('Subset all embedded fonts')),
('metadata', _('Update metadata in book files')),
('jacket', _('Add metadata as a "book jacket" page')),
('remove_jacket', _('Remove a previously inserted book jacket')),
])
for name, text in self.actions.iteritems():
prefs = gprefs.get('polishing_settings', {})
for name, text in self.all_actions.iteritems():
count += 1
x = QCheckBox(text, self)
x.setChecked(prefs.get(name, False))
x.stateChanged.connect(partial(self.option_toggled, name))
l.addWidget(x, count, 0, 1, 1)
setattr(self, 'opt_'+name, x)
la = QLabel(' <a href="#%s">%s</a>'%(name, _('About')))
@ -80,28 +92,100 @@ class Polish(QDialog):
l.addWidget(la, 0, 2, count+1, 1)
l.setColumnStretch(2, 1)
self.show_reports = sr = QCheckBox(_('Show &report'), self)
sr.setChecked(gprefs.get('polish_show_reports', True))
sr.setToolTip(textwrap.fill(_('Show a report of all the actions performed'
' after polishing is completed')))
l.addWidget(sr, count+1, 0, 1, 1)
self.bb = bb = QDialogButtonBox(QDialogButtonBox.Ok|QDialogButtonBox.Cancel)
bb.accepted.connect(self.accept)
bb.rejected.connect(self.reject)
l.addWidget(bb, count+1, 0, 1, -1)
self.save_button = sb = bb.addButton(_('&Save Settings'), bb.ActionRole)
sb.clicked.connect(self.save_settings)
self.load_button = lb = bb.addButton(_('&Load Settings'), bb.ActionRole)
self.load_menu = QMenu(lb)
lb.setMenu(self.load_menu)
self.all_button = b = bb.addButton(_('Select &all'), bb.ActionRole)
b.clicked.connect(partial(self.select_all, True))
self.none_button = b = bb.addButton(_('Select &none'), bb.ActionRole)
b.clicked.connect(partial(self.select_all, False))
l.addWidget(bb, count+1, 1, 1, -1)
self.setup_load_button()
self.resize(QSize(800, 600))
self.resize(QSize(950, 600))
def select_all(self, enable):
for action in self.all_actions:
x = getattr(self, 'opt_'+action)
x.blockSignals(True)
x.setChecked(enable)
x.blockSignals(False)
def save_settings(self):
if not self.something_selected:
return error_dialog(self, _('No actions selected'),
_('You must select at least one action before saving'),
show=True)
name, ok = QInputDialog.getText(self, _('Choose name'),
_('Choose a name for these settings'))
if ok:
name = unicode(name).strip()
if name:
settings = {ac:getattr(self, 'opt_'+ac).isChecked() for ac in
self.all_actions}
saved = gprefs.get('polish_settings', {})
saved[name] = settings
gprefs.set('polish_settings', saved)
self.setup_load_button()
def setup_load_button(self):
saved = gprefs.get('polish_settings', {})
m = self.load_menu
m.clear()
self.__actions = []
for name in sorted(saved):
self.__actions.append(m.addAction(name, partial(self.load_settings,
name)))
self.load_button.setEnabled(bool(saved))
def load_settings(self, name):
saved = gprefs.get('polish_settings', {}).get(name, {})
for action in self.all_actions:
checked = saved.get(action, False)
x = getattr(self, 'opt_'+action)
x.blockSignals(True)
x.setChecked(checked)
x.blockSignals(False)
def option_toggled(self, name, state):
if state == Qt.Checked:
self.help_label.setText(self.help_text[name])
def help_link_activated(self, link):
link = unicode(link)[1:]
self.help_label.setText(self.help_text[link])
@property
def something_selected(self):
for action in self.all_actions:
if getattr(self, 'opt_'+action).isChecked():
return True
return False
def accept(self):
self.actions = ac = {}
saved_prefs = {}
gprefs['polish_show_reports'] = bool(self.show_reports.isChecked())
something = False
for action in self.actions:
ac[action] = bool(getattr(self, 'opt_'+action).isChecked())
for action in self.all_actions:
ac[action] = saved_prefs[action] = bool(getattr(self, 'opt_'+action).isChecked())
if ac[action]:
something = True
if not something:
return error_dialog(self, _('No actions selected'),
_('You must select at least one action, or click Cancel.'),
show=True)
gprefs['polishing_settings'] = saved_prefs
self.queue_files()
return super(Polish, self).accept()
@ -131,6 +215,7 @@ class Polish(QDialog):
self.do_book(num, book_id, self.book_id_map[book_id])
except:
self.pd.reject()
raise
else:
self.pd.set_value(num)
QTimer.singleShot(0, self.do_one)
@ -156,13 +241,107 @@ class Polish(QDialog):
desc = ngettext(_('Polish %s')%mi.title,
_('Polish book %(nums)s of %(tot)s (%(title)s)')%dict(
num=num, tot=len(self.book_id_map),
nums=num, tot=len(self.book_id_map),
title=mi.title), len(self.book_id_map))
if hasattr(self, 'pd'):
self.pd.set_msg(_('Queueing book %(nums)s of %(tot)s (%(title)s)')%dict(
num=num, tot=len(self.book_id_map), title=mi.title))
nums=num, tot=len(self.book_id_map), title=mi.title))
self.jobs.append((desc, data, book_id, base))
# }}}
class Report(QDialog): # {{{
def __init__(self, parent):
QDialog.__init__(self, parent)
self.gui = parent
self.setAttribute(Qt.WA_DeleteOnClose, False)
self.setWindowIcon(QIcon(I('polish.png')))
self.reports = []
self.l = l = QGridLayout()
self.setLayout(l)
self.view = v = QTextEdit(self)
v.setReadOnly(True)
l.addWidget(self.view, 0, 0, 1, 2)
self.backup_msg = la = QLabel('')
l.addWidget(la, 1, 0, 1, 2)
la.setVisible(False)
la.setWordWrap(True)
self.ign_msg = _('Ignore remaining %d reports')
self.ign = QCheckBox(self.ign_msg, self)
l.addWidget(self.ign, 2, 0)
bb = self.bb = QDialogButtonBox(QDialogButtonBox.Close)
bb.accepted.connect(self.accept)
bb.rejected.connect(self.reject)
b = self.log_button = bb.addButton(_('View full &log'), bb.ActionRole)
b.clicked.connect(self.view_log)
bb.button(bb.Close).setDefault(True)
l.addWidget(bb, 2, 1)
self.finished.connect(self.show_next, type=Qt.QueuedConnection)
self.resize(QSize(800, 600))
def setup_ign(self):
self.ign.setText(self.ign_msg%len(self.reports))
self.ign.setVisible(bool(self.reports))
self.ign.setChecked(False)
def __call__(self, *args):
self.reports.append(args)
self.setup_ign()
if not self.isVisible():
self.show_next()
def show_report(self, book_title, book_id, fmts, job, report):
from calibre.ebooks.markdown.markdown import markdown
self.current_log = job.details
self.setWindowTitle(_('Polishing of %s')%book_title)
self.view.setText(markdown('# %s\n\n'%book_title + report,
output_format='html4'))
self.bb.button(self.bb.Close).setFocus(Qt.OtherFocusReason)
self.backup_msg.setVisible(bool(fmts))
if fmts:
m = ngettext('The original file has been saved as %s.',
'The original files have been saved as %s.', len(fmts))%(
_(' and ').join('ORIGINAL_'+f for f in fmts)
)
self.backup_msg.setText(m + ' ' + _(
'If you polish again, the polishing will run on the originals.')%(
))
def view_log(self):
self.view.setPlainText(self.current_log)
self.view.verticalScrollBar().setValue(0)
def show_next(self, *args):
if not self.reports:
return
if not self.isVisible():
self.show()
self.show_report(*self.reports.pop(0))
self.setup_ign()
def accept(self):
if self.ign.isChecked():
self.reports = []
if self.reports:
self.show_next()
return
super(Report, self).accept()
def reject(self):
if self.ign.isChecked():
self.reports = []
if self.reports:
self.show_next()
return
super(Report, self).reject()
# }}}
class PolishAction(InterfaceAction):
@ -173,6 +352,7 @@ class PolishAction(InterfaceAction):
def genesis(self):
self.qaction.triggered.connect(self.polish_books)
self.report = Report(self.gui)
def location_selected(self, loc):
enabled = loc == 'library'
@ -213,21 +393,28 @@ class PolishAction(InterfaceAction):
return
d = Polish(self.gui.library_view.model().db, book_id_map, parent=self.gui)
if d.exec_() == d.Accepted and d.jobs:
for desc, data, book_id, base, files in reversed(d.jobs):
show_reports = bool(d.show_reports.isChecked())
for desc, data, book_id, base in reversed(d.jobs):
job = self.gui.job_manager.run_job(
Dispatcher(self.book_polished), 'gui_polish', args=(data,),
description=desc)
job.polish_args = (book_id, base, data['files'])
job.polish_args = (book_id, base, data['files'], show_reports)
if d.jobs:
self.gui.jobs_pointer.start()
self.gui.status_bar.show_message(
_('Start polishing of %d book(s)') % len(d.jobs), 2000)
def book_polished(self, job):
if job.failed:
self.gui.job_exception(job)
return
db = self.gui.current_db
book_id, base, files = job.polish_args
book_id, base, files, show_reports = job.polish_args
fmts = set()
for path in files:
fmt = path.rpartition('.')[-1].upper()
if tweaks['save_original_format']:
if tweaks['save_original_format_when_polishing']:
fmts.add(fmt)
db.save_original_format(book_id, fmt, notify=False)
with open(path, 'rb') as f:
db.add_format(book_id, fmt, f, index_is_id=True)
@ -239,6 +426,13 @@ class PolishAction(InterfaceAction):
os.rmdir(parent)
except:
pass
self.gui.tags_view.recount()
if self.gui.current_view() is self.gui.library_view:
current = self.gui.library_view.currentIndex()
if current.isValid():
self.gui.library_view.model().current_changed(current, QModelIndex())
if show_reports:
self.report(db.title(book_id, index_is_id=True), book_id, fmts, job, job.result)
if __name__ == '__main__':
app = QApplication([])

View File

@ -283,6 +283,7 @@ class SchedulerDialog(QDialog, Ui_Dialog):
urn = self.current_urn
if urn is not None:
self.initialize_detail_box(urn)
self.recipes.scrollTo(current)
def accept(self):
if not self.commit():

View File

@ -323,6 +323,8 @@ def run_gui(opts, args, actions, listener, app, gui_debug=None):
app = os.path.dirname(os.path.dirname(sys.frameworks_dir))
subprocess.Popen('sleep 3s; open '+app, shell=True)
else:
if iswindows and hasattr(winutil, 'prepare_for_restart'):
winutil.prepare_for_restart()
subprocess.Popen([e] + sys.argv[1:])
else:
if iswindows:

View File

@ -257,7 +257,7 @@ class ConditionEditor(QWidget): # {{{
'Zero is today. Dates in the past always match')
elif action == 'newer future days':
self.value_box.setValidator(QIntValidator(self.value_box))
tt = _('Enter the mimimum days in the future the item can be. '
tt = _('Enter the minimum days in the future the item can be. '
'Zero is today. Dates in the past never match')
else:
self.value_box.setInputMask('9999-99-99')

View File

@ -1,10 +1,10 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import, print_function)
store_version = 1 # Needed for dynamic plugin loading
store_version = 2 # Needed for dynamic plugin loading
__license__ = 'GPL 3'
__copyright__ = '2011-2012, Tomasz Długosz <tomek3d@gmail.com>'
__copyright__ = '2011-2013, Tomasz Długosz <tomek3d@gmail.com>'
__docformat__ = 'restructuredtext en'
import re
@ -51,7 +51,7 @@ class EmpikStore(BasicStoreConfig, StorePlugin):
if not id:
continue
cover_url = ''.join(data.xpath('.//div[@class="productBox-450Pic"]/a/img/@src'))
cover_url = ''.join(data.xpath('.//div[@class="productBox-450Pic"]/a/img/@data-original'))
title = ''.join(data.xpath('.//a[@class="productBox-450Title"]/text()'))
title = re.sub(r' \(ebook\)', '', title)
author = ''.join(data.xpath('.//div[@class="productBox-450Author"]/a/text()'))

View File

@ -1,10 +1,10 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import, print_function)
store_version = 1 # Needed for dynamic plugin loading
store_version = 2 # Needed for dynamic plugin loading
__license__ = 'GPL 3'
__copyright__ = '2011, Tomasz Długosz <tomek3d@gmail.com>'
__copyright__ = '2011-2013, Tomasz Długosz <tomek3d@gmail.com>'
__docformat__ = 'restructuredtext en'
import re
@ -61,8 +61,6 @@ class LegimiStore(BasicStoreConfig, StorePlugin):
cover_url = ''.join(data.xpath('.//img[1]/@src'))
title = ''.join(data.xpath('.//span[@class="bookListTitle ellipsis"]/text()'))
author = ''.join(data.xpath('.//span[@class="bookListAuthor ellipsis"]/text()'))
author = re.sub(',','',author)
author = re.sub(';',',',author)
price = ''.join(data.xpath('.//div[@class="bookListPrice"]/span/text()'))
formats = []
with closing(br.open(id.strip(), timeout=timeout/4)) as nf:

View File

@ -1,10 +1,10 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import, print_function)
store_version = 1 # Needed for dynamic plugin loading
store_version = 2 # Needed for dynamic plugin loading
__license__ = 'GPL 3'
__copyright__ = '2011-2012, Tomasz Długosz <tomek3d@gmail.com>'
__copyright__ = '2011-2013, Tomasz Długosz <tomek3d@gmail.com>'
__docformat__ = 'restructuredtext en'
import re
@ -66,6 +66,8 @@ class NextoStore(BasicStoreConfig, StorePlugin):
price = ''.join(data.xpath('.//strong[@class="nprice"]/text()'))
cover_url = ''.join(data.xpath('.//img[@class="cover"]/@src'))
cover_url = re.sub(r'%2F', '/', cover_url)
cover_url = re.sub(r'\widthMax=120&heightMax=200', 'widthMax=64&heightMax=64', cover_url)
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
title = re.sub(r' - ebook$', '', title)
formats = ', '.join(data.xpath('.//ul[@class="formats_available"]/li//b/text()'))
@ -80,7 +82,7 @@ class NextoStore(BasicStoreConfig, StorePlugin):
counter -= 1
s = SearchResult()
s.cover_url = cover_url
s.cover_url = 'http://www.nexto.pl' + cover_url
s.title = title.strip()
s.author = author.strip()
s.price = price

View File

@ -1,10 +1,10 @@
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import, print_function)
store_version = 1 # Needed for dynamic plugin loading
store_version = 2 # Needed for dynamic plugin loading
__license__ = 'GPL 3'
__copyright__ = '2011, Tomasz Długosz <tomek3d@gmail.com>'
__copyright__ = '2011-2013, Tomasz Długosz <tomek3d@gmail.com>'
__docformat__ = 'restructuredtext en'
import re
@ -55,10 +55,10 @@ class VirtualoStore(BasicStoreConfig, StorePlugin):
continue
price = ''.join(data.xpath('.//span[@class="price"]/text() | .//span[@class="price abbr"]/text()'))
cover_url = ''.join(data.xpath('.//div[@class="list_middle_left"]//a/img/@src'))
cover_url = ''.join(data.xpath('.//div[@class="list_middle_left"]//a//img/@src'))
title = ''.join(data.xpath('.//div[@class="list_title list_text_left"]/a/text()'))
author = ', '.join(data.xpath('.//div[@class="list_authors list_text_left"]/a/text()'))
formats = [ form.split('_')[-1].replace('.png', '') for form in data.xpath('.//div[@style="width:55%;float:left;text-align:left;height:18px;"]//img/@src')]
formats = [ form.split('_')[-1].replace('.png', '') for form in data.xpath('.//div[@style="width:55%;float:left;text-align:left;height:18px;"]//a/img/@src')]
nodrm = no_drm_pattern.search(''.join(data.xpath('.//div[@style="width:45%;float:right;text-align:right;height:18px;"]/div/div/text()')))
counter -= 1

View File

@ -1,7 +1,7 @@
__license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal <kovid at kovidgoyal.net>'
import traceback, os, sys, functools, collections, textwrap
import traceback, os, sys, functools, textwrap
from functools import partial
from threading import Thread
@ -48,39 +48,57 @@ class Worker(Thread):
self.exception = err
self.traceback = traceback.format_exc()
class History(collections.deque):
class History(list):
def __init__(self, action_back, action_forward):
self.action_back = action_back
self.action_forward = action_forward
collections.deque.__init__(self)
self.pos = 0
super(History, self).__init__(self)
self.insert_pos = 0
self.back_pos = None
self.forward_pos = None
self.set_actions()
def set_actions(self):
self.action_back.setDisabled(self.pos < 1)
self.action_forward.setDisabled(self.pos + 1 >= len(self))
self.action_back.setDisabled(self.back_pos is None)
self.action_forward.setDisabled(self.forward_pos is None)
def back(self, from_pos):
if self.pos - 1 < 0: return None
if self.pos == len(self):
self.append([])
self[self.pos] = from_pos
self.pos -= 1
# Back clicked
if self.back_pos is None:
return None
item = self[self.back_pos]
self.forward_pos = self.back_pos+1
if self.forward_pos >= len(self):
self.append(from_pos)
self.forward_pos = len(self) - 1
self.insert_pos = self.forward_pos
self.back_pos = None if self.back_pos == 0 else self.back_pos - 1
self.set_actions()
return self[self.pos]
return item
def forward(self):
if self.pos + 1 >= len(self): return None
self.pos += 1
def forward(self, from_pos):
if self.forward_pos is None:
return None
item = self[self.forward_pos]
self.back_pos = self.forward_pos - 1
if self.back_pos < 0: self.back_pos = None
self.insert_pos = self.back_pos or 0
self.forward_pos = None if self.forward_pos > len(self) - 2 else self.forward_pos + 1
self.set_actions()
return self[self.pos]
return item
def add(self, item):
while len(self) > self.pos+1:
self.pop()
self.append(item)
self.pos += 1
self[self.insert_pos:] = []
while self.insert_pos > 0 and self[self.insert_pos-1] == item:
self.insert_pos -= 1
self[self.insert_pos:] = []
self.insert(self.insert_pos, item)
# The next back must go to item
self.back_pos = self.insert_pos
self.insert_pos += 1
# There can be no forward
self.forward_pos = None
self.set_actions()
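The rewritten History above replaces the deque with a plain list plus explicit back/forward/insert positions. A standalone sketch of the same semantics, with the Qt action enable/disable wiring stripped out (behavior paraphrased from the diff, not the full calibre class):

```python
class History(list):
    # back_pos/forward_pos are indices into the list, or None when
    # that direction is unavailable.
    def __init__(self):
        super(History, self).__init__()
        self.insert_pos = 0
        self.back_pos = None
        self.forward_pos = None

    def back(self, from_pos):
        # Back clicked: record where we came from so forward can return.
        if self.back_pos is None:
            return None
        item = self[self.back_pos]
        self.forward_pos = self.back_pos + 1
        if self.forward_pos >= len(self):
            self.append(from_pos)
            self.forward_pos = len(self) - 1
        self.insert_pos = self.forward_pos
        self.back_pos = None if self.back_pos == 0 else self.back_pos - 1
        return item

    def forward(self, from_pos):
        if self.forward_pos is None:
            return None
        item = self[self.forward_pos]
        self.back_pos = self.forward_pos - 1
        if self.back_pos < 0:
            self.back_pos = None
        self.insert_pos = self.back_pos or 0
        self.forward_pos = (None if self.forward_pos > len(self) - 2
                            else self.forward_pos + 1)
        return item

    def add(self, item):
        # Truncate any forward history, collapse duplicates, insert.
        self[self.insert_pos:] = []
        while self.insert_pos > 0 and self[self.insert_pos - 1] == item:
            self.insert_pos -= 1
            self[self.insert_pos:] = []
        self.insert(self.insert_pos, item)
        self.back_pos = self.insert_pos   # the next back must go to item
        self.insert_pos += 1
        self.forward_pos = None           # there can be no forward

h = History()
h.add(10)          # user jumps away from page 10
print(h.back(25))  # back, currently at page 25 -> returns 10
print(h.forward(10))  # forward again -> returns 25
```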
class Metadata(QLabel):
@ -665,7 +683,7 @@ class EbookViewer(MainWindow, Ui_EbookViewer):
self.goto_page(num)
def forward(self, x):
pos = self.history.forward()
pos = self.history.forward(self.pos.value())
if pos is not None:
self.goto_page(pos)

View File

@ -255,17 +255,11 @@ class PocketBookPro912(PocketBook):
class iPhone(Device):
name = 'iPhone/iTouch'
name = 'iPhone/iPad/iPod Touch'
output_format = 'EPUB'
manufacturer = 'Apple'
id = 'iphone'
supports_color = True
output_profile = 'ipad'
class iPad(iPhone):
name = 'iPad'
id = 'ipad'
output_profile = 'ipad3'
class Android(Device):

View File

@ -6,8 +6,8 @@
<rect>
<x>0</x>
<y>0</y>
<width>400</width>
<height>300</height>
<width>530</width>
<height>316</height>
</rect>
</property>
<property name="windowTitle">
@ -23,7 +23,7 @@
<item>
<widget class="QLabel" name="label">
<property name="text">
<string>&lt;p&gt;If you use the &lt;a href=&quot;http://www.lexcycle.com/download&quot;&gt;Stanza&lt;/a&gt; e-book app on your iPhone/iTouch, you can access your calibre book collection directly on the device. To do this you have to turn on the calibre content server.</string>
<string>&lt;p&gt;If you use the &lt;a href=&quot;http://www.lexcycle.com/download&quot;&gt;Stanza&lt;/a&gt; or &lt;a href=&quot;http://marvinapp.com/&quot;&gt;Marvin&lt;/a&gt; e-book reading app on your Apple iDevice, you can access your calibre book collection wirelessly, directly on the device. To do this you have to turn on the calibre content server.</string>
</property>
<property name="wordWrap">
<bool>true</bool>
@ -70,7 +70,7 @@
<widget class="QLabel" name="instructions">
<property name="text">
<string>&lt;p&gt;Remember to leave calibre running as the server only runs as long as calibre is running.
&lt;p&gt;Stanza should see your calibre collection automatically. If not, try adding the URL http://myhostname:8080 as a new catalog in the Stanza reader on your iPhone. Here myhostname should be the fully qualified hostname or the IP address of the computer calibre is running on.</string>
&lt;p&gt;The reader app should see your calibre collection automatically. If not, try adding the URL http://myhostname:8080 as a new catalog in the reader on your iDevice. Here myhostname should be the fully qualified hostname or the IP address of the computer calibre is running on. See &lt;a href=&quot;http://manual.calibre-ebook.com/faq.html#how-do-i-use-app-with-my-ipad-iphone-ipod-touch&quot;&gt;the User Manual&lt;/a&gt; for more information.</string>
</property>
<property name="wordWrap">
<bool>true</bool>

View File

@ -250,8 +250,12 @@ class PostInstall:
f.write('# calibre Bash Shell Completion\n')
f.write(opts_and_exts('calibre', guiop, BOOK_EXTENSIONS))
f.write(opts_and_exts('lrf2lrs', lrf2lrsop, ['lrf']))
f.write(opts_and_exts('ebook-meta', metaop, list(meta_filetypes())))
f.write(opts_and_exts('ebook-polish', polish_op, [x.lower() for x in SUPPORTED]))
f.write(opts_and_exts('ebook-meta', metaop,
list(meta_filetypes()), cover_opts=['--cover', '-c'],
opf_opts=['--to-opf', '--from-opf']))
f.write(opts_and_exts('ebook-polish', polish_op,
[x.lower() for x in SUPPORTED], cover_opts=['--cover', '-c'],
opf_opts=['--opf', '-o']))
f.write(opts_and_exts('lrfviewer', lrfviewerop, ['lrf']))
f.write(opts_and_exts('ebook-viewer', viewer_op, input_formats))
f.write(opts_and_words('fetch-ebook-metadata', fem_op, []))
@ -478,11 +482,23 @@ def opts_and_words(name, op, words):
complete -F _'''%(opts, words) + fname + ' ' + name +"\n\n").encode('utf-8')
def opts_and_exts(name, op, exts):
def opts_and_exts(name, op, exts, cover_opts=('--cover',), opf_opts=()):
opts = ' '.join(options(op))
exts.extend([i.upper() for i in exts])
exts='|'.join(exts)
fname = name.replace('-', '_')
special_exts_template = '''\
%s )
_filedir %s
return 0
;;
'''
extras = []
for eopts, eexts in ((cover_opts, "${pics}"), (opf_opts, "'@(opf)'")):
for opt in eopts:
extras.append(special_exts_template%(opt, eexts))
extras = '\n'.join(extras)
return '_'+fname+'()'+\
'''
{
@ -490,33 +506,28 @@ def opts_and_exts(name, op, exts):
COMPREPLY=()
cur="${COMP_WORDS[COMP_CWORD]}"
prev="${COMP_WORDS[COMP_CWORD-1]}"
opts="%s"
opts="%(opts)s"
pics="@(jpg|jpeg|png|gif|bmp|JPG|JPEG|PNG|GIF|BMP)"
case "${prev}" in
--cover )
_filedir "${pics}"
return 0
;;
%(extras)s
esac
case "${cur}" in
--cover )
_filedir "${pics}"
return 0
;;
%(extras)s
-* )
COMPREPLY=( $(compgen -W "${opts}" -- ${cur}) )
return 0
;;
* )
_filedir '@(%s)'
_filedir '@(%(exts)s)'
return 0
;;
esac
}
complete -o filenames -F _'''%(opts,exts) + fname + ' ' + name +"\n\n"
complete -o filenames -F _'''%dict(
opts=opts, extras=extras, exts=exts) + fname + ' ' + name +"\n\n"
VIEWER = '''\

File diff suppressed because it is too large Load Diff

Some files were not shown because too many files have changed in this diff