sync with Kovid's branch

Tomasz Długosz 2012-11-15 21:17:39 +01:00
commit 3e4e6099a5
337 changed files with 39935 additions and 51331 deletions


@@ -35,3 +35,49 @@ nbproject/
.settings/
*.DS_Store
calibre_plugins/
recipes/.git
recipes/.gitignore
recipes/README
recipes/katalog_egazeciarz.recipe
recipes/tv_axnscifi.recipe
recipes/tv_comedycentral.recipe
recipes/tv_discoveryscience.recipe
recipes/tv_foxlife.recipe
recipes/tv_fox.recipe
recipes/tv_hbo.recipe
recipes/tv_kinopolska.recipe
recipes/tv_nationalgeographic.recipe
recipes/tv_polsat2.recipe
recipes/tv_polsat.recipe
recipes/tv_tv4.recipe
recipes/tv_tvn7.recipe
recipes/tv_tvn.recipe
recipes/tv_tvp1.recipe
recipes/tv_tvp2.recipe
recipes/tv_tvphd.recipe
recipes/tv_tvphistoria.recipe
recipes/tv_tvpkultura.recipe
recipes/tv_tvppolonia.recipe
recipes/tv_tvpuls.recipe
recipes/tv_viasathistory.recipe
recipes/icons/tv_axnscifi.png
recipes/icons/tv_comedycentral.png
recipes/icons/tv_discoveryscience.png
recipes/icons/tv_foxlife.png
recipes/icons/tv_fox.png
recipes/icons/tv_hbo.png
recipes/icons/tv_kinopolska.png
recipes/icons/tv_nationalgeographic.png
recipes/icons/tv_polsat2.png
recipes/icons/tv_polsat.png
recipes/icons/tv_tv4.png
recipes/icons/tv_tvn7.png
recipes/icons/tv_tvn.png
recipes/icons/tv_tvp1.png
recipes/icons/tv_tvp2.png
recipes/icons/tv_tvphd.png
recipes/icons/tv_tvphistoria.png
recipes/icons/tv_tvpkultura.png
recipes/icons/tv_tvppolonia.png
recipes/icons/tv_tvpuls.png
recipes/icons/tv_viasathistory.png


@@ -47,12 +47,6 @@ License: Apache 2.0
The full text of the Apache 2.0 license is available at:
http://www.apache.org/licenses/LICENSE-2.0
Files: src/sfntly/*
Copyright: Google Inc.
License: Apache 2.0
The full text of the Apache 2.0 license is available at:
http://www.apache.org/licenses/LICENSE-2.0
Files: resources/viewer/mathjax/*
Copyright: Unknown
License: Apache 2.0


@@ -19,6 +19,110 @@
# new recipes:
# - title:
- version: 0.9.6
date: 2012-11-10
new features:
- title: "Experimental support for subsetting fonts"
description: "Subsetting a font means reducing the font to contain only the glyphs for the text actually present in the book. This can easily halve the size of the font. calibre can now do this for all embedded fonts during a conversion. Turn it on via the 'Subset all embedded fonts' option under the Look & Feel section of the conversion dialog. calibre can subset both TrueType and OpenType fonts. Note that this code is very new and likely has bugs, so please check the output if you turn on subsetting. The conversion log will have info about the subsetting operations."
type: major
- title: "EPUB Input: Try to workaround EPUBs that have missing or damaged ZIP central directories. calibre should now be able to read/convert such an EPUB file, provided it does not suffer from further corruption."
- title: "Allow using identifiers in save to disk templates."
tickets: [1074623]
- title: "calibredb: Add an option to not notify the GUI"
- title: "Catalogs: Fix long tags causing catalog generation to fail on windows. Add the ability to cross-reference authors, i.e. to relist the authors for a book with multiple authors separately."
tickets: [1074931]
- title: "Edit metadata dialog: Add a clear tags button to remove all tags with a single click"
- title: "Add search to the font family chooser dialog"
bug fixes:
- title: "Windows: Fix a long standing bug in the device eject code that for some reason only manifested in 0.9.5."
tickets: [1075782]
- title: "Get Books: Fix Amazon stores, Google Books store and libri.de"
- title: "Kobo driver: More fixes for on device book matching, and list books as being on device even if the Kobo has not yet indexed them. Also some performance improvements."
tickets: [1069617]
- title: "EPUB Output: Remove duplicate id and name attributes to eliminate pointless noise from the various epub check utilities"
- title: "Ask for confirmation before removing plugins"
- title: "Fix bulk convert queueing dialog becoming very long if any of the books have a very long title."
tickets: [1076191]
- title: "Fix deleting custom column tags like data from the Tag browser not updating the last modified timestamp for affected books"
tickets: [1075476]
- title: "When updating a previously broken plugin, do not show an error message because the previous version of the plugin cannot be loaded"
- title: "Fix regression that broke the Template Editor"
improved recipes:
- Various updated Polish recipes
- London Review of Books
- Yemen Times
new recipes:
- title: "Various Polish news sources"
author: Artur Stachecki
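
What font subsetting means in practice: the converter collects the set of characters actually used in the book text and rebuilds each embedded font with only the glyphs for those characters. calibre ships its own subsetting code; purely as an illustration of the idea, here is a minimal sketch using the third-party fontTools library (the file names are placeholders):

from fontTools.ttLib import TTFont
from fontTools.subset import Subsetter

font = TTFont('original.ttf')          # placeholder input font
subsetter = Subsetter()
# keep only the glyphs needed for the characters that occur in the text
subsetter.populate(text=u'The quick brown fox')
subsetter.subset(font)
font.save('subsetted.ttf')             # typically far smaller than the original
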
- version: 0.9.5
date: 2012-11-02
new features:
- title: "Font embedding: Add support for the CSS 3 Fonts module, which means you can embed font families that have more that the usual four faces, with the full set of font-stretch and font-weight variations. Of course, whether the fonts actually show up on a reader will depend on the readers' support for CSS 3."
- title: "Sharing by email: Allow specifying an 'alias' or friendly name by which to identify each email recipient."
tickets: [1069076]
- title: "Embedding fonts: Allow adding ttf/otf font files to calibre directly to be used for embedding. That way the fonts do not have to be installed system wide. You can add a font to calibre via the 'Add fonts' button in the font chooser dialog for embedding fonts."
- title: "E-book viewer: Add the ability to rotate images to the popup image viewer."
tickets: [1073513]
- title: "Generate cover: Speedup searching the system for a font that can render special characters"
- title: "A new custom font scanner to locate all fonts on the system. Faster and less crash prone that fontconfig/freetype"
- title: "Font family chooser: Show the faces available for a family when clicking on the family"
bug fixes:
- title: "Get Books: Fix eHarlequin and Kobo stores."
tickets: [1072702]
- title: "Kobo driver: Fix a bug that could cause the on device book matching to fail in certain circumstances."
tickets: [1072437]
- title: "Kobo driver: When using a SD card do not delete shelves that contain on books on the card (there might be books in the shelf in the main memory)."
tickets: [1073792]
- title: "Workaround for bug in the windows API CreateHardLink function that breaks using calibre libraries on some networked filesystems."
- title: "Template editor: Use dummy metadata instead of blank/unknown values"
- title: "Windows: abort setting of title/author if any of the books' files are in use. Results in less surprising behavior than before, when the title/author would be changed, but the on disk location would not."
improved recipes:
- Financial Times UK
- Science AAAS
- The Atlantic
new recipes:
- title: "Pravda in english, italian and portuguese"
author: Darko Miletic
- title: "Delco Times"
author: Krittika Goyal
- version: 0.9.4
date: 2012-10-26


@@ -72,13 +72,21 @@ After installing Bazaar, you can get the |app| source code with the command::
bzr branch lp:calibre
On Windows you will need the complete path name, that will be something like :file:`C:\\Program Files\\Bazaar\\bzr.exe`. To update a branch
to the latest code, use the command::
On Windows you will need the complete path name, that will be something like :file:`C:\\Program Files\\Bazaar\\bzr.exe`.
To update a branch to the latest code, use the command::
bzr merge
The calibre repository is huge so the branch operation above takes along time (about an hour). If you want to get the code faster, the sourcecode for the latest release is always available as an
`archive <http://status.calibre-ebook.com/dist/src>`_.
|app| is a very large project with a very long source control history, so the
above can take a while (10mins to an hour depending on your internet speed).
If you want to get the code faster, the source code for the latest release is
always available as an `archive <http://status.calibre-ebook.com/dist/src>`_.
You can also use bzr to just download the source code, without the history,
using::
bzr branch --stacked lp:calibre
Submitting your changes to be included
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -109,7 +117,7 @@ Whenever you commit changes to your branch with the command::
bzr commit -m "Comment describing your change"
Kovid can merge it directly from your branch into the main |app| source tree. You should also keep an eye on the |app|
`development forum <http://www.mobileread.com/forums/forumdisplay.php?f=240>`. Before making major changes, you should
`development forum <http://www.mobileread.com/forums/forumdisplay.php?f=240>`_. Before making major changes, you should
discuss them in the forum or contact Kovid directly (his email address is all over the source code).
Windows development environment


@@ -327,9 +327,8 @@ You can browse your |app| collection on your Android device by using the
calibre content server, which makes your collection available over the net.
First perform the following steps in |app|
* Set the :guilabel:`Preferred Output Format` in |app| to EPUB (The output format can be set under :guilabel:`Preferences->Interface->Behavior`)
* Set the output profile to Tablet (this will work for phones as well), under :guilabel:`Preferences->Conversion->Common Options->Page Setup`
* Convert the books you want to read on your device to EPUB format by selecting them and clicking the Convert button.
* Set the :guilabel:`Preferred Output Format` in |app| to EPUB for normal Android devices or MOBI for Kindles (The output format can be set under :guilabel:`Preferences->Interface->Behavior`)
* Convert the books you want to read on your device to EPUB/MOBI format by selecting them and clicking the Convert button.
* Turn on the Content Server in |app|'s preferences and leave |app| running.
Now on your Android device, open the browser and browse to
@@ -650,17 +649,24 @@ If it still won't launch, start a command prompt (press the Windows key and R; th
Post any output you see in a help message on the `Forum <http://www.mobileread.com/forums/forumdisplay.php?f=166>`_.
|app| freezes when I click on anything?
|app| freezes/crashes occasionally?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are three possible things I know of, that can cause this:
* You recently connected an external monitor or TV to your computer. In this case, whenever |app| opens a new window like the edit metadata window or the conversion dialog, it appears on the second monitor where you dont notice it and so you think |app| has frozen. Disconnect your second monitor and restart calibre.
* You recently connected an external monitor or TV to your computer. In
this case, whenever |app| opens a new window like the edit metadata
window or the conversion dialog, it appears on the second monitor where
you don't notice it and so you think |app| has frozen. Disconnect your
second monitor and restart calibre.
* You are using a Wacom branded mouse. There is an incompatibility between Wacom mice and the graphics toolkit |app| uses. Try using a non-Wacom mouse.
* Sometimes if some software has installed lots of new files in your fonts folder, |app| can crash until it finishes indexing them. Just start |app|, then leave it alone for about 20 minutes, without clicking on anything. After that you should be able to use |app| as normal.
* You are using a Wacom branded mouse. There is an incompatibility between
Wacom mice and the graphics toolkit |app| uses. Try using a non-Wacom
mouse.
* If you use RoboForm, it is known to cause |app| to crash. Add |app| to
the blacklist of programs inside RoboForm to fix this. Or uninstall
RoboForm.
|app| is not starting on OS X?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -719,8 +725,8 @@ You can switch |app| to using a backed up library folder by simply clicking the
If you want to back up the |app| configuration/plugins, you have to back up the config directory. You can find this config directory via :guilabel:`Preferences->Miscellaneous`. Note that restoring configuration directories is not officially supported, but should work in most cases. Just copy the contents of the backup directory into the current configuration directory to restore.
How do I use purchased EPUB books with |app|?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
How do I use purchased EPUB books with |app| (or what do I do with .acsm files)?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Most purchased EPUB books have `DRM <http://drmfree.calibre-ebook.com/about#drm>`_. This prevents |app| from opening them. You can still use |app| to store and transfer them to your ebook reader. First, you must authorize your reader on a Windows machine with Adobe Digital Editions. Once this is done, EPUB books transferred with |app| will work fine on your reader. When you purchase an EPUB book from a website, you will get an ".acsm" file. This file should be opened with Adobe Digital Editions, which will then download the actual ".epub" ebook. The ebook file will be stored in the folder "My Digital Editions", from where you can add it to |app|.
I am getting a "Permission Denied" error?

recipes/autosport.recipe Normal file

@@ -0,0 +1,30 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__author__ = 'MrStefan <mrstefaan@gmail.com>'
'''
www.autosport.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
class autosport(BasicNewsRecipe):
title = u'Autosport'
__author__ = 'MrStefan <mrstefaan@gmail.com>'
language = 'en_GB'
description =u'Daily Formula 1 and motorsport news from the leading weekly motor racing magazine. The authority on Formula 1, F1, MotoGP, GP2, Champ Car, Le Mans...'
masthead_url='http://cdn.images.autosport.com/asdotcom.gif'
remove_empty_feeds= True
oldest_article = 1
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
keep_only_tags =[]
keep_only_tags.append(dict(name = 'h1', attrs = {'class' : 'news_headline'}))
keep_only_tags.append(dict(name = 'td', attrs = {'class' : 'news_article_author'}))
keep_only_tags.append(dict(name = 'td', attrs = {'class' : 'news_article_date'}))
keep_only_tags.append(dict(name = 'p'))
feeds = [(u'ALL NEWS', u'http://www.autosport.com/rss/allnews.xml')]
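
The keep_only_tags declarations above drive the article cleanup: only elements matching one of the listed name/attribute specs survive into the e-book. A minimal standalone sketch of that kind of filtering, using the third-party bs4 package for illustration (the recipes themselves use calibre's bundled BeautifulSoup; the HTML here is made up):

from bs4 import BeautifulSoup

html = ('<html><body><h1 class="news_headline">Headline</h1>'
        '<div class="ad">advert</div><p>Story text</p></body></html>')
soup = BeautifulSoup(html, 'html.parser')
keep = [dict(name='h1', attrs={'class': 'news_headline'}), dict(name='p')]
kept = []
for spec in keep:
    # collect every element matching this tag name / attribute spec
    kept.extend(soup.find_all(spec['name'], attrs=spec.get('attrs', {})))
print(''.join(str(tag) for tag in kept))  # headline and paragraph, no advert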

recipes/blognexto.recipe Normal file

@@ -0,0 +1,28 @@
from calibre.web.feeds.news import BasicNewsRecipe
class blognexto(BasicNewsRecipe):
title = 'BLOG.NEXTO.pl'
__author__ = 'MrStefan <mrstefaan@gmail.com>'
language = 'pl'
description ='o e-publikacjach prawie wszystko'
masthead_url='http://blog.nexto.pl/wp-content/uploads/2012/04/logo-blog-nexto.pl_.jpg'
remove_empty_feeds= True
oldest_article = 7
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
keep_only_tags =[]
keep_only_tags.append(dict(name = 'div', attrs = {'id' : 'content'}))
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'comment-cloud'}))
remove_tags.append(dict(name = 'p', attrs = {'class' : 'post-date1'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'fb-like'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'tags'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'postnavi'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'commments-box'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'respond'}))
feeds = [('Artykuly', 'http://feeds.feedburner.com/blognexto')]

recipes/brewiarz.recipe Normal file

@@ -0,0 +1,140 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
from calibre.web.feeds.news import BasicNewsRecipe
import datetime, re
class brewiarz(BasicNewsRecipe):
title = u'Brewiarz'
__author__ = 'Artur Stachecki <artur.stachecki@gmail.com>'
language = 'pl'
description = u'Serwis poświęcony Liturgii Godzin (brewiarzowi) - formie codziennej modlitwy Kościoła katolickiego.'
masthead_url = 'http://brewiarz.pl/images/logo2.gif'
max_articles_per_feed = 100
remove_javascript = True
no_stylesheets = True
publication_type = 'newspaper'
next_days = 1
def parse_index(self):
dec2rom_dict = {"01": "i", "02": "ii", "03": "iii", "04": "iv",
"05": "v", "06": "vi", "07": "vii", "08": "viii",
"09": "ix", "10": "x", "11": "xi", "12": "xii"}
weekday_dict = {"Sunday": "Niedziela", "Monday": "Poniedziałek", "Tuesday": "Wtorek",
"Wednesday": "Środa", "Thursday": "Czwartek", "Friday": "Piątek", "Saturday": "Sobota"}
now = datetime.datetime.now()
feeds = []
for i in range(0, self.next_days):
url_date = now + datetime.timedelta(days=i)
url_date_month = url_date.strftime("%m")
url_date_month_roman = dec2rom_dict[url_date_month]
url_date_day = url_date.strftime("%d")
url_date_year = url_date.strftime("%Y")[2:]
url_date_weekday = url_date.strftime("%A")
url_date_weekday_pl = weekday_dict[url_date_weekday]
url = "http://brewiarz.pl/" + url_date_month_roman + "_" + url_date_year + "/" + url_date_day + url_date_month + "/index.php3"
articles = self.parse_pages(url)
if articles:
title = url_date_weekday_pl + " " + url_date_day + "." + url_date_month + "." + url_date_year
feeds.append((title, articles))
else:
sectors = self.get_sectors(url)
for subpage in sectors:
title = url_date_weekday_pl + " " + url_date_day + "." + url_date_month + "." + url_date_year + " - " + subpage.string
url = "http://brewiarz.pl/" + url_date_month_roman + "_" + url_date_year + "/" + url_date_day + url_date_month + "/" + subpage['href']
print(url)
articles = self.parse_pages(url)
if articles:
feeds.append((title, articles))
return feeds
def get_sectors(self, url):
sectors = []
soup = self.index_to_soup(url)
sectors_table = soup.find(name='table', attrs={'width': '490'})
sector_links = sectors_table.findAll(name='a')
for sector_links_modified in sector_links:
link_parent_text = sector_links_modified.findParent(name='div').text
if link_parent_text:
sector_links_modified.text = link_parent_text.text
sectors.append(sector_links_modified)
return sectors
def parse_pages(self, url):
current_articles = []
soup = self.index_to_soup(url)
www = soup.find(attrs={'class': 'www'})
if www:
box_title = www.find(text='Teksty LG')
article_box_parent = box_title.findParent('ul')
article_box_sibling = article_box_parent.findNextSibling('ul')
for li in article_box_sibling.findAll('li'):
link = li.find(name='a')
ol = link.findNextSibling(name='ol')
if ol:
sublinks = ol.findAll(name='a')
for sublink in sublinks:
link_title = self.tag_to_string(link) + " - " + self.tag_to_string(sublink)
link_url_print = re.sub('php3', 'php3?kr=_druk&wr=lg&', sublink['href'])
link_url = url[:-10] + link_url_print
current_articles.append({'title': link_title,
'url': link_url, 'description': '', 'date': ''})
else:
if link.findParent(name = 'ol'):
continue
else:
link_title = self.tag_to_string(link)
link_url_print = re.sub('php3', 'php3?kr=_druk&wr=lg&', link['href'])
link_url = url[:-10] + link_url_print
current_articles.append({'title': link_title,
'url': link_url, 'description': '', 'date': ''})
return current_articles
else:
return None
def preprocess_html(self, soup):
footer = soup.find(name='a', attrs={'href': 'http://brewiarz.pl'})
footer_parent = footer.findParent('div')
footer_parent.extract()
header = soup.find(text='http://brewiarz.pl')
header_parent = header.findParent('div')
header_parent.extract()
subheader = soup.find(text='Kolor szat:').findParent('div')
subheader.extract()
color = soup.find('b')
color.extract()
cleaned = self.strip_tags(soup)
div = cleaned.findAll(name='div')
div[1].extract()
div[2].extract()
div[3].extract()
return cleaned
def strip_tags(self, soup_dirty):
VALID_TAGS = ['p', 'div', 'br', 'b', 'a', 'title', 'head', 'html', 'body']
for tag in soup_dirty.findAll(True):
if tag.name not in VALID_TAGS:
for i, x in enumerate(tag.parent.contents):
if x == tag:
break
else:
print "Can't find", tag, "in", tag.parent
continue
for r in reversed(tag.contents):
tag.parent.insert(i, r)
tag.extract()
return soup_dirty
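
parse_index above builds one feed per day by encoding the date into brewiarz.pl's URL scheme: a Roman-numeral month, a two-digit year, and a DDMM segment. A standalone illustration of just the URL construction (the date is arbitrary):

import datetime

dec2rom = {"01": "i", "02": "ii", "03": "iii", "04": "iv", "05": "v", "06": "vi",
           "07": "vii", "08": "viii", "09": "ix", "10": "x", "11": "xi", "12": "xii"}
d = datetime.date(2012, 11, 15)
url = ("http://brewiarz.pl/" + dec2rom[d.strftime("%m")] + "_" + d.strftime("%Y")[2:]
       + "/" + d.strftime("%d") + d.strftime("%m") + "/index.php3")
print(url)  # -> http://brewiarz.pl/xi_12/1511/index.php3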


@@ -6,7 +6,6 @@ class Dobreprogramy_pl(BasicNewsRecipe):
__author__ = 'fenuks'
__licence__ ='GPL v3'
category = 'IT'
language = 'pl'
masthead_url='http://static.dpcdn.pl/css/Black/Images/header_logo_napis_fullVersion.png'
cover_url = 'http://userlogos.org/files/logos/Karmody/dobreprogramy_01.png'
description = u'Aktualności i blogi z dobreprogramy.pl'
@@ -29,4 +28,4 @@ class Dobreprogramy_pl(BasicNewsRecipe):
for a in soup('a'):
if a.has_key('href') and 'http://' not in a['href'] and 'https://' not in a['href']:
a['href']=self.index + a['href']
return soup
return soup


@@ -2,7 +2,9 @@ import re
from calibre.web.feeds.news import BasicNewsRecipe
class FocusRecipe(BasicNewsRecipe):
__license__ = 'GPL v3'
__author__ = u'intromatyk <intromatyk@gmail.com>'
language = 'pl'
@@ -12,10 +14,10 @@ class FocusRecipe(BasicNewsRecipe):
publisher = u'Gruner + Jahr Polska'
category = u'News'
description = u'Newspaper'
category='magazine'
cover_url=''
remove_empty_feeds= True
no_stylesheets=True
category = 'magazine'
cover_url = ''
remove_empty_feeds = True
no_stylesheets = True
oldest_article = 7
max_articles_per_feed = 100000
recursions = 0
@@ -27,15 +29,15 @@ class FocusRecipe(BasicNewsRecipe):
simultaneous_downloads = 5
r = re.compile('.*(?P<url>http:\/\/(www.focus.pl)|(rss.feedsportal.com\/c)\/.*\.html?).*')
keep_only_tags =[]
keep_only_tags.append(dict(name = 'div', attrs = {'id' : 'cll'}))
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'ulm noprint'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'txb'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'h2'}))
remove_tags.append(dict(name = 'ul', attrs = {'class' : 'txu'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'ulc'}))
keep_only_tags = []
keep_only_tags.append(dict(name='div', attrs={'id': 'cll'}))
remove_tags = []
remove_tags.append(dict(name='div', attrs={'class': 'ulm noprint'}))
remove_tags.append(dict(name='div', attrs={'class': 'txb'}))
remove_tags.append(dict(name='div', attrs={'class': 'h2'}))
remove_tags.append(dict(name='ul', attrs={'class': 'txu'}))
remove_tags.append(dict(name='div', attrs={'class': 'ulc'}))
extra_css = '''
body {font-family: verdana, arial, helvetica, geneva, sans-serif ;}
@@ -44,18 +46,17 @@ class FocusRecipe(BasicNewsRecipe):
p.lead {font-weight: bold; text-align: left;}
.authordate {font-size: small; color: #696969;}
.fot{font-size: x-small; color: #666666;}
'''
'''
feeds = [
('Nauka', 'http://focus.pl.feedsportal.com/c/32992/f/532693/index.rss'),
('Historia', 'http://focus.pl.feedsportal.com/c/32992/f/532694/index.rss'),
('Cywilizacja', 'http://focus.pl.feedsportal.com/c/32992/f/532695/index.rss'),
('Sport', 'http://focus.pl.feedsportal.com/c/32992/f/532696/index.rss'),
('Technika', 'http://focus.pl.feedsportal.com/c/32992/f/532697/index.rss'),
('Przyroda', 'http://focus.pl.feedsportal.com/c/32992/f/532698/index.rss'),
('Technologie', 'http://focus.pl.feedsportal.com/c/32992/f/532699/index.rss'),
]
feeds = [
('Nauka', 'http://www.focus.pl/nauka/rss/'),
('Historia', 'http://www.focus.pl/historia/rss/'),
('Cywilizacja', 'http://www.focus.pl/cywilizacja/rss/'),
('Sport', 'http://www.focus.pl/sport/rss/'),
('Technika', 'http://www.focus.pl/technika/rss/'),
('Przyroda', 'http://www.focus.pl/przyroda/rss/'),
('Technologie', 'http://www.focus.pl/gadzety/rss/')
]
def skip_ad_pages(self, soup):
if ('advertisement' in soup.find('title').string.lower()):
@@ -65,20 +66,20 @@ class FocusRecipe(BasicNewsRecipe):
return None
def get_cover_url(self):
soup=self.index_to_soup('http://www.focus.pl/magazyn/')
tag=soup.find(name='div', attrs={'class':'clr fl'})
soup = self.index_to_soup('http://www.focus.pl/magazyn/')
tag = soup.find(name='div', attrs={'class': 'clr fl'})
if tag:
self.cover_url='http://www.focus.pl/' + tag.a['href']
self.cover_url = 'http://www.focus.pl/' + tag.a['href']
return getattr(self, 'cover_url', self.cover_url)
def print_version(self, url):
if url.count ('focus.pl.feedsportal.com'):
if url.count('focus.pl.feedsportal.com'):
u = url.find('focus0Bpl')
u = 'http://www.focus.pl/' + url[u + 11:]
u = u.replace('0C', '/')
u = u.replace('A', '')
u = u.replace ('0E','-')
u = u.replace('0E', '-')
u = u.replace('/nc/1//story01.htm', '/do-druku/1')
else:
u = url.replace('/nc/1','/do-druku/1')
return u
else:
u = url.replace('/nc/1', '/do-druku/1')
return u
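
The print_version method above decodes feedsportal redirect URLs back into direct focus.pl links by undoing feedsportal's character escaping (0C encodes '/', 0E encodes '-'). A worked example on a hypothetical redirect URL:

url = ('http://focus.pl.feedsportal.com/c/32992/f/532693/s/1234cdef/'
       'focus0Bpl0Cnauka0Cjakis0Eartykul0Cnc0C10C/story01.htm')  # hypothetical
u = url.find('focus0Bpl')
u = 'http://www.focus.pl/' + url[u + 11:]  # skip past the encoded 'focus.pl/' prefix
u = u.replace('0C', '/')                   # 0C -> '/'
u = u.replace('A', '')                     # stray 'A' characters are filler
u = u.replace('0E', '-')                   # 0E -> '-'
u = u.replace('/nc/1//story01.htm', '/do-druku/1')
print(u)  # -> http://www.focus.pl/nauka/jakis-artykul/do-druku/1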


@@ -0,0 +1,103 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = 'teepel <teepel44@gmail.com> based on GW from fenuks'
'''
krakow.gazeta.pl
'''
from calibre.web.feeds.news import BasicNewsRecipe
import re
class gw_krakow(BasicNewsRecipe):
title = u'Gazeta.pl Kraków'
__author__ = 'teepel <teepel44@gmail.com> based on GW from fenuks'
language = 'pl'
description =u'Wiadomości z Krakowa na portalu Gazeta.pl.'
category='newspaper'
publication_type = 'newspaper'
masthead_url='http://bi.gazeta.pl/im/5/8528/m8528105.gif'
INDEX='http://krakow.gazeta.pl/'
remove_empty_feeds= True
oldest_article = 1
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
keep_only_tags =[]
keep_only_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article'}))
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_likes'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_tools'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'rel'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_share'}))
remove_tags.append(dict(name = 'u1', attrs = {'id' : 'articleToolbar'}))
remove_tags.append(dict(name = 'li', attrs = {'class' : 'atComments'}))
remove_tags.append(dict(name = 'li', attrs = {'class' : 'atLicense'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'banP4'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'article_toolbar'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_tags'}))
remove_tags.append(dict(name = 'p', attrs = {'class' : 'txt_upl'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'gazeta_article_related_new'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'gazetaVideoPlayer'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_miniatures'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_buttons'}))
remove_tags_after = [dict(name = 'div', attrs = {'id' : 'gazeta_article_share'})]
feeds = [(u'Wiadomości', u'http://rss.gazeta.pl/pub/rss/krakow.xml')]
def skip_ad_pages(self, soup):
tag=soup.find(name='a', attrs={'class':'btn'})
if tag:
new_soup=self.index_to_soup(tag['href'], raw=True)
return new_soup
def append_page(self, soup, appendtag):
loop=False
tag = soup.find('div', attrs={'id':'Str'})
if appendtag.find('div', attrs={'id':'Str'}):
nexturl=tag.findAll('a')
appendtag.find('div', attrs={'id':'Str'}).extract()
loop=True
if appendtag.find(id='source'):
appendtag.find(id='source').extract()
while loop:
loop=False
for link in nexturl:
if u'następne' in link.string:
url= self.INDEX + link['href']
soup2 = self.index_to_soup(url)
pagetext = soup2.find(id='artykul')
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
tag = soup2.find('div', attrs={'id':'Str'})
nexturl=tag.findAll('a')
loop=True
def gallery_article(self, appendtag):
tag=appendtag.find(id='container_gal')
if tag:
nexturl=appendtag.find(id='gal_btn_next').a['href']
appendtag.find(id='gal_navi').extract()
while nexturl:
soup2=self.index_to_soup(nexturl)
pagetext=soup2.find(id='container_gal')
nexturl=pagetext.find(id='gal_btn_next')
if nexturl:
nexturl=nexturl.a['href']
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
rem=appendtag.find(id='gal_navi')
if rem:
rem.extract()
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
if soup.find(id='container_gal'):
self.gallery_article(soup.body)
return soup
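
append_page above stitches multi-page articles together: while a pagination block (div#Str) is present, it follows the link labelled 'następne' ('next') and appends each page's article body to the first page. A condensed sketch of the same loop (hypothetical helper name; BeautifulSoup-3-style API as used by these recipes):

def stitch_pages(recipe, soup):
    # hypothetical condensed rewrite of the append_page pattern above
    body = soup.body
    pager = soup.find('div', attrs={'id': 'Str'})
    while pager is not None:
        nextlink = None
        for link in pager.findAll('a'):
            if link.string and u'następne' in link.string:
                nextlink = link
                break
        if nextlink is None:
            break
        soup = recipe.index_to_soup(recipe.INDEX + nextlink['href'])
        body.append(soup.find(id='artykul'))  # append the next page's article body
        pager = soup.find('div', attrs={'id': 'Str'})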


@@ -0,0 +1,100 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__author__ = 'teepel <teepel44@gmail.com> based on GW from fenuks'
'''
warszawa.gazeta.pl
'''
from calibre.web.feeds.news import BasicNewsRecipe
import re
class gw_wawa(BasicNewsRecipe):
title = u'Gazeta.pl Warszawa'
__author__ = 'teepel <teepel44@gmail.com> based on GW from fenuks'
language = 'pl'
description ='Wiadomości z Warszawy na portalu Gazeta.pl.'
category='newspaper'
publication_type = 'newspaper'
masthead_url='http://bi.gazeta.pl/im/3/4089/m4089863.gif'
INDEX='http://warszawa.gazeta.pl/'
remove_empty_feeds= True
oldest_article = 1
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
keep_only_tags =[]
keep_only_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article'}))
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_likes'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_tools'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'rel'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_share'}))
remove_tags.append(dict(name = 'u1', attrs = {'id' : 'articleToolbar'}))
remove_tags.append(dict(name = 'li', attrs = {'class' : 'atComments'}))
remove_tags.append(dict(name = 'li', attrs = {'class' : 'atLicense'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'banP4'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'article_toolbar'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_tags'}))
remove_tags.append(dict(name = 'p', attrs = {'class' : 'txt_upl'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'gazeta_article_related_new'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'gazetaVideoPlayer'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'gazeta_article_miniatures'}))
feeds = [(u'Wiadomości', u'http://rss.gazeta.pl/pub/rss/warszawa.xml')]
def skip_ad_pages(self, soup):
tag=soup.find(name='a', attrs={'class':'btn'})
if tag:
new_soup=self.index_to_soup(tag['href'], raw=True)
return new_soup
def append_page(self, soup, appendtag):
loop=False
tag = soup.find('div', attrs={'id':'Str'})
if appendtag.find('div', attrs={'id':'Str'}):
nexturl=tag.findAll('a')
appendtag.find('div', attrs={'id':'Str'}).extract()
loop=True
if appendtag.find(id='source'):
appendtag.find(id='source').extract()
while loop:
loop=False
for link in nexturl:
if u'następne' in link.string:
url= self.INDEX + link['href']
soup2 = self.index_to_soup(url)
pagetext = soup2.find(id='artykul')
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
tag = soup2.find('div', attrs={'id':'Str'})
nexturl=tag.findAll('a')
loop=True
def gallery_article(self, appendtag):
tag=appendtag.find(id='container_gal')
if tag:
nexturl=appendtag.find(id='gal_btn_next').a['href']
appendtag.find(id='gal_navi').extract()
while nexturl:
soup2=self.index_to_soup(nexturl)
pagetext=soup2.find(id='container_gal')
nexturl=pagetext.find(id='gal_btn_next')
if nexturl:
nexturl=nexturl.a['href']
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
rem=appendtag.find(id='gal_navi')
if rem:
rem.extract()
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
if soup.find(id='container_gal'):
self.gallery_article(soup.body)
return soup


@@ -1,104 +1,107 @@
# -*- coding: utf-8 -*-
from calibre.web.feeds.news import BasicNewsRecipe
class Gazeta_Wyborcza(BasicNewsRecipe):
title = u'Gazeta Wyborcza'
__author__ = 'fenuks'
language = 'pl'
description ='news from gazeta.pl'
category='newspaper'
title = u'Gazeta.pl'
__author__ = 'fenuks, Artur Stachecki'
language = 'pl'
description = 'news from gazeta.pl'
category = 'newspaper'
publication_type = 'newspaper'
masthead_url='http://bi.gazeta.pl/im/5/10285/z10285445AA.jpg'
INDEX='http://wyborcza.pl'
remove_empty_feeds= True
masthead_url = 'http://bi.gazeta.pl/im/5/10285/z10285445AA.jpg'
INDEX = 'http://wyborcza.pl'
remove_empty_feeds = True
oldest_article = 3
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
ignore_duplicate_articles = {'title', 'url'}
keep_only_tags = dict(id=['gazeta_article', 'article'])
remove_tags_after = dict(id='gazeta_article_share')
remove_tags = [dict(attrs={'class':['artReadMore', 'gazeta_article_related_new', 'txt_upl']}), dict(id=['gazeta_article_likes', 'gazeta_article_tools', 'rel', 'gazeta_article_tags', 'gazeta_article_share', 'gazeta_article_brand', 'gazeta_article_miniatures'])]
feeds = [(u'Kraj', u'http://rss.feedsportal.com/c/32739/f/530266/index.rss'), (u'\u015awiat', u'http://rss.feedsportal.com/c/32739/f/530270/index.rss'),
(u'Wyborcza.biz', u'http://wyborcza.biz/pub/rss/wyborcza_biz_wiadomosci.htm'),
(u'Komentarze', u'http://rss.feedsportal.com/c/32739/f/530312/index.rss'),
(u'Kultura', u'http://rss.gazeta.pl/pub/rss/gazetawyborcza_kultura.xml'),
(u'Nauka', u'http://rss.feedsportal.com/c/32739/f/530269/index.rss'),
(u'Opinie', u'http://rss.gazeta.pl/pub/rss/opinie.xml'),
(u'Gazeta \u015awi\u0105teczna', u'http://rss.feedsportal.com/c/32739/f/530431/index.rss'),
#(u'Du\u017cy Format', u'http://rss.feedsportal.com/c/32739/f/530265/index.rss'),
(u'Witamy w Polsce', u'http://rss.feedsportal.com/c/32739/f/530476/index.rss'),
(u'M\u0119ska Muzyka', u'http://rss.feedsportal.com/c/32739/f/530337/index.rss'),
(u'Lata Lec\u0105', u'http://rss.feedsportal.com/c/32739/f/530326/index.rss'),
(u'Solidarni z Tybetem', u'http://rss.feedsportal.com/c/32739/f/530461/index.rss'),
(u'W pon. - \u017bakowski', u'http://rss.feedsportal.com/c/32739/f/530491/index.rss'),
(u'We wt. - Kolenda-Zalewska', u'http://rss.feedsportal.com/c/32739/f/530310/index.rss'),
(u'\u015aroda w \u015brod\u0119', u'http://rss.feedsportal.com/c/32739/f/530428/index.rss'),
(u'W pi\u0105tek - Olejnik', u'http://rss.feedsportal.com/c/32739/f/530364/index.rss')
]
remove_javascript = True
no_stylesheets = True
remove_tags_before = dict(id='k0')
remove_tags_after = dict(id='banP4')
remove_tags = [dict(name='div', attrs={'class':'rel_box'}), dict(attrs={'class':['date', 'zdjP', 'zdjM', 'pollCont', 'rel_video', 'brand', 'txt_upl']}), dict(name='div', attrs={'id':'footer'})]
feeds = [(u'Kraj', u'http://rss.feedsportal.com/c/32739/f/530266/index.rss'), (u'\u015awiat', u'http://rss.feedsportal.com/c/32739/f/530270/index.rss'),
(u'Wyborcza.biz', u'http://wyborcza.biz/pub/rss/wyborcza_biz_wiadomosci.htm'),
(u'Komentarze', u'http://rss.feedsportal.com/c/32739/f/530312/index.rss'),
(u'Kultura', u'http://rss.gazeta.pl/pub/rss/gazetawyborcza_kultura.xml'),
(u'Nauka', u'http://rss.feedsportal.com/c/32739/f/530269/index.rss'), (u'Opinie', u'http://rss.gazeta.pl/pub/rss/opinie.xml'), (u'Gazeta \u015awi\u0105teczna', u'http://rss.feedsportal.com/c/32739/f/530431/index.rss'), (u'Du\u017cy Format', u'http://rss.feedsportal.com/c/32739/f/530265/index.rss'), (u'Witamy w Polsce', u'http://rss.feedsportal.com/c/32739/f/530476/index.rss'), (u'M\u0119ska Muzyka', u'http://rss.feedsportal.com/c/32739/f/530337/index.rss'), (u'Lata Lec\u0105', u'http://rss.feedsportal.com/c/32739/f/530326/index.rss'), (u'Solidarni z Tybetem', u'http://rss.feedsportal.com/c/32739/f/530461/index.rss'), (u'W pon. - \u017bakowski', u'http://rss.feedsportal.com/c/32739/f/530491/index.rss'), (u'We wt. - Kolenda-Zalewska', u'http://rss.feedsportal.com/c/32739/f/530310/index.rss'), (u'\u015aroda w \u015brod\u0119', u'http://rss.feedsportal.com/c/32739/f/530428/index.rss'), (u'W pi\u0105tek - Olejnik', u'http://rss.feedsportal.com/c/32739/f/530364/index.rss'), (u'Nekrologi', u'http://rss.feedsportal.com/c/32739/f/530358/index.rss')
]
def skip_ad_pages(self, soup):
tag=soup.find(name='a', attrs={'class':'btn'})
if tag:
new_soup=self.index_to_soup(tag['href'], raw=True)
tag = soup.find(name='a', attrs={'class': 'btn'})
if tag:
new_soup = self.index_to_soup(tag['href'], raw=True)
return new_soup
def append_page(self, soup, appendtag):
loop=False
tag = soup.find('div', attrs={'id':'Str'})
if appendtag.find('div', attrs={'id':'Str'}):
nexturl=tag.findAll('a')
appendtag.find('div', attrs={'id':'Str'}).extract()
loop=True
loop = False
tag = soup.find('div', attrs={'id': 'Str'})
if appendtag.find('div', attrs={'id': 'Str'}):
nexturl = tag.findAll('a')
appendtag.find('div', attrs={'id': 'Str'}).extract()
loop = True
if appendtag.find(id='source'):
appendtag.find(id='source').extract()
while loop:
loop=False
loop = False
for link in nexturl:
if u'następne' in link.string:
url= self.INDEX + link['href']
url = self.INDEX + link['href']
soup2 = self.index_to_soup(url)
pagetext = soup2.find(id='artykul')
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
tag = soup2.find('div', attrs={'id':'Str'})
nexturl=tag.findAll('a')
loop=True
tag = soup2.find('div', attrs={'id': 'Str'})
nexturl = tag.findAll('a')
loop = True
def gallery_article(self, appendtag):
tag=appendtag.find(id='container_gal')
tag = appendtag.find(id='container_gal')
if tag:
nexturl=appendtag.find(id='gal_btn_next').a['href']
nexturl = appendtag.find(id='gal_btn_next').a['href']
appendtag.find(id='gal_navi').extract()
while nexturl:
soup2=self.index_to_soup(nexturl)
pagetext=soup2.find(id='container_gal')
nexturl=pagetext.find(id='gal_btn_next')
soup2 = self.index_to_soup(nexturl)
pagetext = soup2.find(id='container_gal')
nexturl = pagetext.find(id='gal_btn_next')
if nexturl:
nexturl=nexturl.a['href']
nexturl = nexturl.a['href']
pos = len(appendtag.contents)
appendtag.insert(pos, pagetext)
rem=appendtag.find(id='gal_navi')
rem = appendtag.find(id='gal_navi')
if rem:
rem.extract()
def preprocess_html(self, soup):
self.append_page(soup, soup.body)
if soup.find(id='container_gal'):
self.gallery_article(soup.body)
return soup
if soup.find(attrs={'class': 'piano_btn_1'}):
return None
else:
self.append_page(soup, soup.body)
if soup.find(id='container_gal'):
self.gallery_article(soup.body)
return soup
def print_version(self, url):
if 'http://wyborcza.biz/biznes/' not in url:
return url
if url.count('rss.feedsportal.com'):
u = url.find('wyborcza0Bpl')
u = 'http://www.wyborcza.pl/' + url[u + 11:]
u = u.replace('0C', '/')
u = u.replace('A', '')
u = u.replace('0E', '-')
u = u.replace('0H', ',')
u = u.replace('0I', '_')
u = u.replace('0B', '.')
u = u.replace('/1,', '/2029020,')
u = u.replace('/story01.htm', '')
print(u)
return u
elif 'http://wyborcza.pl/1' in url:
return url.replace('http://wyborcza.pl/1', 'http://wyborcza.pl/2029020')
else:
return url.replace('http://wyborcza.biz/biznes/1', 'http://wyborcza.biz/biznes/2029020')
return url.replace('http://wyborcza.biz/biznes/1', 'http://wyborcza.biz/biznes/2029020')
def get_cover_url(self):
soup = self.index_to_soup('http://wyborcza.pl/0,76762,3751429.html')
cover=soup.find(id='GWmini2')
soup = self.index_to_soup('http://wyborcza.pl/'+ cover.contents[3].a['href'])
self.cover_url='http://wyborcza.pl' + soup.img['src']
cover = soup.find(id='GWmini2')
soup = self.index_to_soup('http://wyborcza.pl/' + cover.contents[3].a['href'])
self.cover_url = 'http://wyborcza.pl' + soup.img['src']
return getattr(self, 'cover_url', self.cover_url)

BIN recipes/icons/autosport.png Normal file (415 B)
BIN recipes/icons/blognexto.png Normal file (699 B)
BIN recipes/icons/brewiarz.png Normal file (982 B)
BIN unnamed icon (802 B)
BIN unnamed icon (802 B)
BIN unnamed icon (802 B)
BIN unnamed icon (221 B -> 802 B)
BIN unnamed icon (1.1 KiB)
BIN unnamed icon (698 B)
BIN recipes/icons/pravda_en.png Normal file (538 B)
BIN recipes/icons/pravda_it.png Normal file (538 B)
BIN unnamed icon (538 B)
BIN recipes/icons/pravda_ru.png Normal file (538 B)
BIN unnamed icon (965 B)
BIN unnamed icon (820 B)
BIN unnamed icon (330 B)
BIN recipes/icons/satkurier.png Normal file (1.2 KiB)
BIN recipes/icons/wprost.png Normal file (1.7 KiB)

recipes/kerrang.recipe Normal file

@@ -0,0 +1,34 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
from calibre.web.feeds.news import BasicNewsRecipe
class kerrang(BasicNewsRecipe):
title = u'Kerrang!'
__author__ = 'Artur Stachecki <artur.stachecki@gmail.com>'
language = 'en_GB'
description = u'UK-based magazine devoted to rock music published by Bauer Media Group'
oldest_article = 7
masthead_url = 'http://images.kerrang.com/design/kerrang/kerrangsite/logo.gif'
max_articles_per_feed = 100
simultaneous_downloads = 5
remove_javascript = True
no_stylesheets = True
use_embedded_content = False
recursions = 0
keep_only_tags = []
keep_only_tags.append(dict(attrs = {'class' : ['headz', 'blktxt']}))
extra_css = ''' img { display: block; margin-right: auto;}
h1 {text-align: left; font-size: 22px;}'''
feeds = [(u'News', u'http://www.kerrang.com/blog/rss.xml')]
def preprocess_html(self, soup):
for alink in soup.findAll('a'):
if alink.string is not None:
tstr = alink.string
alink.replaceWith(tstr)
return soup
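
The preprocess_html above flattens hyperlinks: every <a> tag whose content is a plain string is replaced by that string, so articles read cleanly without link markup (the l'equipe recipe below uses the same pattern). A tiny standalone demonstration with the third-party bs4 package and made-up HTML:

from bs4 import BeautifulSoup

soup = BeautifulSoup('<p>Read <a href="/x">the full story</a> now.</p>', 'html.parser')
for alink in soup.find_all('a'):
    if alink.string is not None:
        alink.replace_with(alink.string)
print(soup)  # -> <p>Read the full story now.</p>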

recipes/lequipe.recipe Normal file

@@ -0,0 +1,45 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
from calibre.web.feeds.news import BasicNewsRecipe
class leequipe(BasicNewsRecipe):
title = u'l\'equipe'
__author__ = 'Artur Stachecki <artur.stachecki@gmail.com>'
language = 'fr'
description = u'Retrouvez tout le sport en direct sur le site de L\'EQUIPE et suivez l\'actualité du football, rugby, basket, cyclisme, f1, volley, hand, tous les résultats sportifs'
oldest_article = 1
masthead_url = 'http://static.lequipe.fr/v6/img/logo-lequipe.png'
max_articles_per_feed = 100
simultaneous_downloads = 5
remove_javascript = True
no_stylesheets = True
use_embedded_content = False
recursions = 0
keep_only_tags = []
keep_only_tags.append(dict(attrs={'id': ['article']}))
remove_tags = []
remove_tags.append(dict(attrs={'id': ['partage', 'ensavoirplus', 'bloc_bas_breve', 'commentaires', 'tools']}))
remove_tags.append(dict(attrs={'class': ['partage_bis', 'date']}))
feeds = [(u'Football', u'http://www.lequipe.fr/rss/actu_rss_Football.xml'),
(u'Auto-Moto', u'http://www.lequipe.fr/rss/actu_rss_Auto-Moto.xml'),
(u'Tennis', u'http://www.lequipe.fr/rss/actu_rss_Tennis.xml'),
(u'Golf', u'http://www.lequipe.fr/rss/actu_rss_Golf.xml'),
(u'Rugby', u'http://www.lequipe.fr/rss/actu_rss_Rugby.xml'),
(u'Basket', u'http://www.lequipe.fr/rss/actu_rss_Basket.xml'),
(u'Hand', u'http://www.lequipe.fr/rss/actu_rss_Hand.xml'),
(u'Cyclisme', u'http://www.lequipe.fr/rss/actu_rss_Cyclisme.xml'),
(u'Autres Sports', u'http://pipes.yahoo.com/pipes/pipe.run?_id=2039f7f4f350c70c5e4e8633aa1b37cd&_render=rss')
]
def preprocess_html(self, soup):
for alink in soup.findAll('a'):
if alink.string is not None:
tstr = alink.string
alink.replaceWith(tstr)
return soup


@@ -40,6 +40,6 @@ class LondonReviewOfBooks(BasicNewsRecipe):
soup = self.index_to_soup('http://www.lrb.co.uk/')
cover_item = soup.find('p',attrs={'class':'cover'})
if cover_item:
cover_url = 'http://www.lrb.co.uk' + cover_item.a.img['src']
cover_url = cover_item.a.img['src']
return cover_url


@@ -0,0 +1,36 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__author__ = 'teepel <teepel44@gmail.com>'
'''
http://www.mateusz.pl/czytania
'''
from calibre.web.feeds.news import BasicNewsRecipe
class czytania_mateusz(BasicNewsRecipe):
title = u'Czytania na ka\u017cdy dzie\u0144'
__author__ = 'teepel <teepel44@gmail.com>'
description = u'Codzienne czytania z jednego z najstarszych polskich serwisów katolickich.'
language = 'pl'
INDEX='http://www.mateusz.pl/czytania'
oldest_article = 1
remove_empty_feeds= True
no_stylesheets=True
auto_cleanup = True
remove_javascript = True
simultaneous_downloads = 2
max_articles_per_feed = 100
feeds = [(u'Czytania', u'http://mateusz.pl/rss/czytania/')]
remove_tags =[]
remove_tags.append(dict(name = 'p', attrs = {'class' : 'top'}))
#thanks t3d
def get_article_url(self, article):
link = article.get('link')
if 'kmt.pl' not in link:
return link
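
A behavior note on get_article_url above: when it returns nothing (as for the kmt.pl links), calibre treats the article URL as missing and skips that feed entry, which is how the unwanted cross-posts are dropped.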


@@ -4,7 +4,7 @@ from calibre.web.feeds.news import BasicNewsRecipe
class FocusRecipe(BasicNewsRecipe):
__license__ = 'GPL v3'
__author__ = u'intromatyk <intromatyk@gmail.com>'
__author__ = u'Artur Stachecki <artur.stachecki@gmail.com>'
language = 'pl'
version = 1


@@ -0,0 +1,61 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
from calibre.web.feeds.news import BasicNewsRecipe
class naszdziennik(BasicNewsRecipe):
title = u'Nasz Dziennik'
__author__ = 'Artur Stachecki <artur.stachecki@gmail.com>'
language = 'pl'
description =u'Nasz Dziennik - Ogólnopolska gazeta codzienna. Podejmuje tematykę dotyczącą życia społecznego, kulturalnego, politycznego i religijnego. Propaguje wartości chrześcijańskie oraz tradycję i kulturę polską.'
masthead_url='http://www.naszdziennik.pl/images/logo-male.png'
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets = True
keep_only_tags =[dict(attrs = {'id' : 'article'})]
# parse_index is defined here; it must return a list of feeds together with their articles
def parse_index(self):
# the address from which the articles are parsed
soup = self.index_to_soup('http://www.naszdziennik.pl/news')
# declare an empty list of feeds
feeds = []
# declare an empty dictionary of articles
articles = {}
# declare an empty list of sections
sections = []
# declare the first section as an empty string
section = ''
# a for loop that examines every "news-article" tag in turn
for item in soup.findAll(attrs = {'class' : 'news-article'}) :
# inside the "news-article" tag we look for the first h4 tag
section = item.find('h4')
# assign the tag's text content to the section variable
section = self.tag_to_string(section)
# check whether the articles dictionary already has a key for this section
# if it does not exist:
if not articles.has_key(section) :
# add the new section to the list of sections
sections.append(section)
# declare the new section in the articles dictionary, keyed by the section name, with an empty list as its value
articles[section] = []
# move on to the "title-datetime" tag
article_title_datetime = item.find(attrs = {'class' : 'title-datetime'})
# find the first link inside the title-datetime tag
article_a = article_title_datetime.find('a')
# and build from it an absolute link to the actual article
article_url = 'http://naszdziennik.pl' + article_a['href']
# the text between the <a> tags will be used as the title
article_title = self.tag_to_string(article_a)
# and the date will be the text of the first h4 tag found inside the title-datetime tag
article_date = self.tag_to_string(article_title_datetime.find('h4'))
# add the collected pieces to the articles list declared in line 44
articles[section].append( { 'title' : article_title, 'url' : article_url, 'date' : article_date })
# once all the articles have been added, add the sections to the feeds list, using the lists of articles held in the dictionary
for section in sections:
feeds.append((section, articles[section]))
# return the list of feeds; calibre takes care of parsing it
return feeds
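
For reference, the value parse_index returns is a list of (section title, article list) pairs, where each article is a dictionary. Its shape, with made-up values:

feeds = [
    (u'Polska', [
        {'title': u'Some headline',
         'url': 'http://naszdziennik.pl/polska/12345',
         'date': u'15.11.2012',
         'description': ''},
    ]),
]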


@@ -22,9 +22,9 @@ class NewYorker(BasicNewsRecipe):
masthead_url = 'http://www.newyorker.com/css/i/hed/logo.gif'
extra_css = """
body {font-family: "Times New Roman",Times,serif}
.articleauthor{color: #9F9F9F;
.articleauthor{color: #9F9F9F;
font-family: Arial, sans-serif;
font-size: small;
font-size: small;
text-transform: uppercase}
.rubric,.dd,h6#credit{color: #CD0021;
font-family: Arial, sans-serif;
@@ -63,11 +63,11 @@ class NewYorker(BasicNewsRecipe):
return url.strip()
def get_cover_url(self):
cover_url = None
soup = self.index_to_soup('http://www.newyorker.com/magazine/toc/')
cover_item = soup.find('img',attrs={'id':'inThisIssuePhoto'})
cover_url = "http://www.newyorker.com/images/covers/1925/1925_02_21_p233.jpg"
soup = self.index_to_soup('http://www.newyorker.com/magazine?intcid=magazine')
cover_item = soup.find('div',attrs={'id':'media-count-1'})
if cover_item:
cover_url = 'http://www.newyorker.com' + cover_item['src'].strip()
cover_url = 'http://www.newyorker.com' + cover_item.div.img['src'].strip()
return cover_url
def preprocess_html(self, soup):

recipes/pravda_en.recipe Normal file

@@ -0,0 +1,53 @@
__license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>'
'''
english.pravda.ru
'''
from calibre.web.feeds.news import BasicNewsRecipe
class Pravda_eng(BasicNewsRecipe):
title = 'Pravda in English'
__author__ = 'Darko Miletic'
description = 'News from Russia and rest of the world'
publisher = 'PRAVDA.Ru'
category = 'news, politics, Russia'
oldest_article = 2
max_articles_per_feed = 200
no_stylesheets = True
encoding = 'utf8'
use_embedded_content = False
language = 'en_RU'
remove_empty_feeds = True
publication_type = 'newspaper'
masthead_url = 'http://english.pravda.ru/pix/logo.gif'
extra_css = """
body{font-family: Arial,sans-serif }
img{margin-bottom: 0.4em; display:block}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
remove_attributes=['lang', 'style']
keep_only_tags = [dict(name='div', attrs={'id':'article'})]
feeds = [
(u'World' , u'http://english.pravda.ru/world/export-articles.xml' )
,(u'Russia' , u'http://english.pravda.ru/russia/export-articles.xml' )
,(u'Society' , u'http://english.pravda.ru/society/export-articles.xml' )
,(u'Incidents', u'http://english.pravda.ru/hotspots/export-articles.xml' )
,(u'Opinion' , u'http://english.pravda.ru/opinion/export-articles.xml' )
,(u'Science' , u'http://english.pravda.ru/science/export-articles.xml' )
,(u'Business' , u'http://english.pravda.ru/business/export-articles.xml' )
,(u'Economics', u'http://english.pravda.ru/russia/economics/export-articles.xml')
,(u'Politics' , u'http://english.pravda.ru/russia/politics/export-articles.xml' )
]
def print_version(self, url):
return url + '?mode=print'

recipes/pravda_it.recipe Normal file

@@ -0,0 +1,52 @@
__license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>'
'''
italia.pravda.ru
'''
from calibre.web.feeds.news import BasicNewsRecipe
class Pravda_ita(BasicNewsRecipe):
title = 'Pravda in Italiano'
__author__ = 'Darko Miletic'
description = 'News from Russia and rest of the world'
publisher = 'PRAVDA.Ru'
category = 'news, politics, Russia'
oldest_article = 2
max_articles_per_feed = 200
no_stylesheets = True
encoding = 'utf8'
use_embedded_content = False
language = 'it'
remove_empty_feeds = True
publication_type = 'newspaper'
masthead_url = 'http://italia.pravda.ru/pix/logo.gif'
extra_css = """
body{font-family: Arial,sans-serif }
img{margin-bottom: 0.4em; display:block}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
remove_attributes=['lang', 'style']
keep_only_tags = [dict(name='div', attrs={'id':'article'})]
feeds = [
(u'Dal mondo' , u'http://italia.pravda.ru/world/export-articles.xml' )
,(u'Russia' , u'http://italia.pravda.ru/russia/export-articles.xml' )
,(u'Societa' , u'http://italia.pravda.ru/society/export-articles.xml' )
,(u'Avvenimenti', u'http://italia.pravda.ru/hotspots/export-articles.xml' )
,(u'Opinioni' , u'http://italia.pravda.ru/opinion/export-articles.xml' )
,(u'Scienza' , u'http://italia.pravda.ru/science/export-articles.xml' )
,(u'Economia' , u'http://italia.pravda.ru/russia/economics/export-articles.xml')
,(u'Politica' , u'http://italia.pravda.ru/russia/politics/export-articles.xml' )
]
def print_version(self, url):
return url + '?mode=print'

recipes/pravda_por.recipe Normal file

@@ -0,0 +1,51 @@
__license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>'
'''
port.pravda.ru
'''
from calibre.web.feeds.news import BasicNewsRecipe
class Pravda_port(BasicNewsRecipe):
title = u'Pravda em português'
__author__ = 'Darko Miletic'
description = 'News from Russia and rest of the world'
publisher = 'PRAVDA.Ru'
category = 'news, politics, Russia'
oldest_article = 2
max_articles_per_feed = 200
no_stylesheets = True
encoding = 'utf8'
use_embedded_content = False
language = 'pt'
remove_empty_feeds = True
publication_type = 'newspaper'
masthead_url = 'http://port.pravda.ru/pix/logo.gif'
extra_css = """
body{font-family: Arial,sans-serif }
img{margin-bottom: 0.4em; display:block}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
remove_attributes=['lang', 'style']
keep_only_tags = [dict(name='div', attrs={'id':'article'})]
feeds = [
(u'Mundo' , u'http://port.pravda.ru/mundo/export-articles.xml' )
,(u'Russia' , u'http://port.pravda.ru/russa/export-articles.xml' )
,(u'Sociedade' , u'http://port.pravda.ru/sociedade/export-articles.xml' )
,(u'Cultura' , u'http://port.pravda.ru/sociedade/cultura/export-articles.xml')
,(u'Ciencia' , u'http://port.pravda.ru/science/export-articles.xml' )
,(u'Desporto' , u'http://port.pravda.ru/desporto/export-articles.xml' )
,(u'CPLP' , u'http://port.pravda.ru/cplp/export-articles.xml' )
]
def print_version(self, url):
return url + '?mode=print'

recipes/pravda_ru.recipe Normal file

@@ -0,0 +1,50 @@
__license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>'
'''
www.pravda.ru
'''
from calibre.web.feeds.news import BasicNewsRecipe
class Pravda_ru(BasicNewsRecipe):
title = u'Правда'
__author__ = 'Darko Miletic'
description = u'Правда.Ру: Аналитика и новости'
publisher = 'PRAVDA.Ru'
category = 'news, politics, Russia'
oldest_article = 2
max_articles_per_feed = 200
no_stylesheets = True
encoding = 'utf8'
use_embedded_content = False
language = 'ru'
remove_empty_feeds = True
publication_type = 'newspaper'
masthead_url = 'http://www.pravda.ru/pix/logo.gif'
extra_css = """
body{font-family: Arial,sans-serif }
img{margin-bottom: 0.4em; display:block}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
remove_attributes=['lang', 'style']
keep_only_tags = [dict(name='div', attrs={'id':'article'})]
feeds = [
(u'Мир' , u'http://www.pravda.ru/world/export.xml' )
,(u'Религия' , u'http://www.pravda.ru/faith/export.xml' )
,(u'Общество' , u'http://www.pravda.ru/society/export.xml' )
,(u'Происшествия', u'http://www.pravda.ru/accidents/export.xml')
,(u'Наука' , u'http://www.pravda.ru/science/export.xml' )
,(u'Экономика' , u'http://www.pravda.ru/economics/export.xml')
,(u'Политика' , u'http://www.pravda.ru/politics/export.xml' )
]
def print_version(self, url):
return url + '?mode=print'


@@ -0,0 +1,32 @@
import re
from calibre.web.feeds.news import BasicNewsRecipe
class RedVoltaireRecipe(BasicNewsRecipe):
title = u'Red Voltaire'
__author__ = 'atordo'
description = u'Red de prensa no alineada, especializada en el an\u00e1lisis de las relaciones internacionales'
oldest_article = 7
max_articles_per_feed = 30
auto_cleanup = False
no_stylesheets = True
language = 'es'
use_embedded_content = False
remove_javascript = True
cover_url = u'http://www.voltairenet.org/squelettes/elements/images/logo-voltairenet-org.png'
masthead_url = u'http://www.voltairenet.org/squelettes/elements/images/logo-voltairenet-org.png'
preprocess_regexps = [
(re.compile(r'<title>(?P<titulo>.+)</title>.+<span class="updated" title=".+"><time', re.IGNORECASE|re.DOTALL)
,lambda match:'</title></head><body><h1>'+match.group('titulo')+'</h1><time')
,(re.compile(r'<time datetime=.+pubdate>. (?P<fecha>.+)</time>.+<!------------------- COLONNE TEXTE ------------------->', re.IGNORECASE|re.DOTALL)
,lambda match:'<small>'+match.group('fecha')+'</small>')
,(re.compile(r'<aside>.+', re.IGNORECASE|re.DOTALL)
,lambda match:'</body></html>')
]
extra_css = '''
img{margin-bottom:0.4em; display:block; margin-left:auto; margin-right:auto}
'''
feeds = [u'http://www.voltairenet.org/spip.php?page=backend&id_secteur=1110&lang=es']
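
The preprocess_regexps above are (compiled pattern, replacement function) pairs that calibre applies to the raw page source before it is parsed; here they carve the page down to the title, date and article text. A toy demonstration of the mechanism on made-up HTML:

import re

regexps = [(re.compile(r'<aside>.+', re.IGNORECASE | re.DOTALL),
            lambda match: '</body></html>')]
raw = '<html><body><h1>Titular</h1><aside>sidebar junk</aside></body></html>'
for pattern, func in regexps:
    raw = pattern.sub(func, raw)
print(raw)  # -> <html><body><h1>Titular</h1></body></html>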


@@ -0,0 +1,28 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__author__ = 'MrStefan <mrstefaan@gmail.com>'
'''
www.rushisaband.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
class rushisaband(BasicNewsRecipe):
title = u'Rushisaband'
__author__ = 'MrStefan <mrstefaan@gmail.com>'
language = 'en_GB'
description =u'A blog devoted to the band RUSH and its members, Neil Peart, Geddy Lee and Alex Lifeson'
remove_empty_feeds= True
oldest_article = 7
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
keep_only_tags =[]
keep_only_tags.append(dict(name = 'h4'))
keep_only_tags.append(dict(name = 'h5'))
keep_only_tags.append(dict(name = 'p'))
feeds = [(u'Rush is a Band', u'http://feeds2.feedburner.com/rushisaband/blog')]

View File

@ -0,0 +1,41 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__author__ = 'teepel <teepel44@gmail.com>'
'''
http://www.rynekinfrastruktury.pl
'''
from calibre.web.feeds.news import BasicNewsRecipe
class prawica_recipe(BasicNewsRecipe):
title = u'Rynek Infrastruktury'
__author__ = 'teepel <teepel44@gmail.com>'
language = 'pl'
description =u'Portal "Rynek Infrastruktury" to źródło informacji o kluczowych elementach polskiej gospodarki: drogach, kolei, lotniskach, portach, telekomunikacji, energetyce, prawie i polityce, wzmocnione eksperckimi komentarzami kluczowych analityków.'
remove_empty_feeds= True
oldest_article = 1
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
feeds = [
(u'Drogi', u'http://www.rynekinfrastruktury.pl/rss/41'),
(u'Lotniska', u'http://www.rynekinfrastruktury.pl/rss/42'),
(u'Kolej', u'http://www.rynekinfrastruktury.pl/rss/37'),
(u'Energetyka', u'http://www.rynekinfrastruktury.pl/rss/30'),
(u'Telekomunikacja', u'http://www.rynekinfrastruktury.pl/rss/31'),
(u'Porty', u'http://www.rynekinfrastruktury.pl/rss/32'),
(u'Prawo i polityka', u'http://www.rynekinfrastruktury.pl/rss/47'),
(u'Komentarze', u'http://www.rynekinfrastruktury.pl/rss/38'),
]
keep_only_tags =[]
keep_only_tags.append(dict(name = 'div', attrs = {'class' : 'articleContent'}))
remove_tags =[]
remove_tags.append(dict(name = 'span', attrs = {'class' : 'date'}))
def print_version(self, url):
return url.replace('http://www.rynekinfrastruktury.pl/artykul/', 'http://www.rynekinfrastruktury.pl/artykul/drukuj/')
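print_version() points calibre at the site's print layout; a quick illustration with a hypothetical article id:

url = 'http://www.rynekinfrastruktury.pl/artykul/12345.html'  # hypothetical
print(url.replace('http://www.rynekinfrastruktury.pl/artykul/',
                  'http://www.rynekinfrastruktury.pl/artykul/drukuj/'))
# -> http://www.rynekinfrastruktury.pl/artykul/drukuj/12345.html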

View File

@ -0,0 +1,40 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__author__ = 'teepel <teepel44@gmail.com>'
'''
rynek-kolejowy.pl
'''
from calibre.web.feeds.news import BasicNewsRecipe
class rynek_kolejowy(BasicNewsRecipe):
title = u'Rynek Kolejowy'
__author__ = 'teepel <teepel44@gmail.com>'
language = 'pl'
description =u'Rynek Kolejowy - kalendarium wydarzeń branży kolejowej, konferencje, sympozja, targi kolejowe, krajowe i zagraniczne.'
masthead_url='http://p.wnp.pl/images/i/partners/rynek_kolejowy.gif'
remove_empty_feeds= True
oldest_article = 1
max_articles_per_feed = 100
remove_javascript=True
no_stylesheets=True
keep_only_tags =[]
keep_only_tags.append(dict(name = 'div', attrs = {'id' : 'mainContent'}))
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'right no-print'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'font-size'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'no-print'}))
extra_css = '''.wiadomosc_title{ font-size: 1.4em; font-weight: bold; }'''
feeds = [(u'Wiadomości', u'http://www.rynek-kolejowy.pl/rss/rss.php')]
def print_version(self, url):
segment = url.split('/')
urlPart = segment[3]
return 'http://www.rynek-kolejowy.pl/drukuj.php?id=' + urlPart
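Here segment[3] is the path component right after the host, which this recipe assumes is the numeric article id; with a hypothetical URL:

url = 'http://www.rynek-kolejowy.pl/12345/tytul-artykulu.html'  # hypothetical
print('http://www.rynek-kolejowy.pl/drukuj.php?id=' + url.split('/')[3])
# -> http://www.rynek-kolejowy.pl/drukuj.php?id=12345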

View File

@ -34,16 +34,20 @@ class RzeczpospolitaRecipe(BasicNewsRecipe):
keep_only_tags.append(dict(name = 'div', attrs = {'id' : 'story'}))
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'articleLeftBox'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'socialNewTools'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'socialTools'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'articleToolBoxTop'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'clr'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'recommendations'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'editorPicks'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'editorPicks'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'editorPicks editorPicksFirst'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'articleCopyrightText'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'articleCopyrightButton'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'articleToolBoxBottom'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'more'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'addRecommendation'}))
remove_tags.append(dict(name = 'h3', attrs = {'id' : 'tags'}))
extra_css = '''
body {font-family: verdana, arial, helvetica, geneva, sans-serif ;}
@ -67,3 +71,4 @@ class RzeczpospolitaRecipe(BasicNewsRecipe):
return start + '/' + index + '?print=tak'

recipes/satkurier.recipe (new file)
View File

@ -0,0 +1,47 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
from calibre.web.feeds.news import BasicNewsRecipe
class SATKurier(BasicNewsRecipe):
title = u'SATKurier.pl'
__author__ = 'Artur Stachecki <artur.stachecki@gmail.com>'
language = 'pl'
description = u'Największy i najstarszy serwis poświęcony\
telewizji cyfrowej, przygotowywany przez wydawcę\
miesięcznika SAT Kurier. Bieżące wydarzenia\
z rynku mediów i nowych technologii.'
oldest_article = 7
masthead_url = 'http://satkurier.pl/img/header_sk_logo.gif'
max_articles_per_feed = 100
simultaneous_downloads = 5
remove_javascript = True
no_stylesheets = True
keep_only_tags = []
keep_only_tags.append(dict(name='div', attrs={'id': ['single_news', 'content']}))
remove_tags = []
remove_tags.append(dict(attrs={'id': ['news_info', 'comments']}))
remove_tags.append(dict(attrs={'href': '#czytaj'}))
remove_tags.append(dict(attrs={'align': 'center'}))
remove_tags.append(dict(attrs={'class': ['date', 'category', 'right mini-add-comment', 'socialLinks', 'commentlist']}))
remove_tags_after = [(dict(id='entry'))]
feeds = [(u'Najnowsze wiadomości', u'http://feeds.feedburner.com/satkurierpl?format=xml'),
(u'Sport w telewizji', u'http://feeds.feedburner.com/satkurier/sport?format=xml'),
(u'Blog', u'http://feeds.feedburner.com/satkurier/blog?format=xml')]
def preprocess_html(self, soup):
image = soup.find(attrs={'id': 'news_mini_photo'})
if image:
image.extract()
header = soup.find('h1')
header.replaceWith(header.prettify() + image.prettify())
for alink in soup.findAll('a'):
if alink.string is not None:
tstr = alink.string
alink.replaceWith(tstr)
return soup
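The loop at the end of preprocess_html flattens hyperlinks to plain text, which reads better on e-ink devices. A minimal standalone illustration using the BeautifulSoup 3 API that calibre recipes of this era rely on:

from calibre.ebooks.BeautifulSoup import BeautifulSoup

soup = BeautifulSoup('<p>Read <a href="/x">more</a> now</p>')
for alink in soup.findAll('a'):
    if alink.string is not None:
        alink.replaceWith(alink.string)  # keep the link text, drop the tag
print(soup)  # -> <p>Read more now</p>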

View File

@ -1,34 +1,50 @@
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.utils.magick import Image
class tvn24(BasicNewsRecipe):
title = u'TVN24'
oldest_article = 7
max_articles_per_feed = 100
__author__ = 'fenuks'
__author__ = 'fenuks, Artur Stachecki'
description = u'Sport, Biznes, Gospodarka, Informacje, Wiadomości Zawsze aktualne wiadomości z Polski i ze świata'
category = 'news'
language = 'pl'
#masthead_url= 'http://www.tvn24.pl/_d/topmenu/logo2.gif'
cover_url= 'http://www.userlogos.org/files/logos/Struna/TVN24.jpg'
extra_css = 'ul {list-style:none;} \
li {list-style:none; float: left; margin: 0 0.15em;} \
h2 {font-size: medium} \
.date60m {float: left; margin: 0 10px 0 5px;}'
masthead_url= 'http://www.tvn24.pl/_d/topmenu/logo2.gif'
cover_url= 'http://www.tvn24.pl/_d/topmenu/logo2.gif'
extra_css= 'ul {list-style: none; padding: 0; margin: 0;} li {float: left;margin: 0 0.15em;}'
remove_empty_feeds = True
remove_javascript = True
no_stylesheets = True
use_embedded_content = False
ignore_duplicate_articles = {'title', 'url'}
keep_only_tags=[dict(name='h1', attrs={'class':['size30 mt10 pb10', 'size38 mt10 pb15']}), dict(name='figure', attrs={'class':'articleMainPhoto articleMainPhotoWide'}), dict(name='article', attrs={'class':['mb20', 'mb20 textArticleDefault']}), dict(name='ul', attrs={'class':'newsItem'})]
remove_tags = [dict(name='aside', attrs={'class':['innerArticleModule onRight cols externalContent', 'innerArticleModule center']}), dict(name='div', attrs={'class':['thumbsGallery', 'articleTools', 'article right rd7', 'heading', 'quizContent']}), dict(name='a', attrs={'class':'watchMaterial text'}), dict(name='section', attrs={'class':['quiz toCenter', 'quiz toRight']})]
feeds = [(u'Najnowsze', u'http://www.tvn24.pl/najnowsze.xml'),
(u'Polska', u'www.tvn24.pl/polska.xml'), (u'\u015awiat', u'http://www.tvn24.pl/swiat.xml'), (u'Sport', u'http://www.tvn24.pl/sport.xml'), (u'Biznes', u'http://www.tvn24.pl/biznes.xml'), (u'Meteo', u'http://www.tvn24.pl/meteo.xml'), (u'Micha\u0142ki', u'http://www.tvn24.pl/michalki.xml'), (u'Kultura', u'http://www.tvn24.pl/kultura.xml')]
keep_only_tags=[
# dict(name='h1', attrs={'class':'size38 mt20 pb20'}),
dict(name='div', attrs={'class':'mainContainer'}),
# dict(name='p'),
# dict(attrs={'class':['size18 mt10 mb15', 'bold topicSize1', 'fromUsers content', 'textArticleDefault']})
]
remove_tags=[
dict(attrs={'class':['commentsInfo', 'textSize', 'related newsNews align-right', 'box', 'watchMaterial text', 'related galleryGallery align-center', 'advert block-alignment-right', 'userActions', 'socialBookmarks', 'im yourArticle fl', 'dynamicButton addComment fl', 'innerArticleModule onRight cols externalContent', 'thumbsGallery', 'relatedObject customBlockquote align-right', 'lead', 'mainRightColumn', 'articleDateContainer borderGreyBottom', 'socialMediaContainer onRight loaded', 'quizContent', 'twitter', 'facebook', 'googlePlus', 'share', 'voteResult', 'reportTitleBar bgBlue_v4 mb15', 'innerVideoModule center']}),
dict(name='article', attrs={'class':['singleArtPhotoCenter', 'singleArtPhotoRight', 'singleArtPhotoLeft']}),
dict(name='section', attrs={'id':['forum', 'innerArticle', 'quiz toCenter', 'mb20']}),
dict(name='div', attrs={'class':'socialMediaContainer big p20 mb20 borderGrey loaded'})
]
remove_tags_after=[dict(name='li', attrs={'class':'share'})]
feeds = [(u'Najnowsze', u'http://www.tvn24.pl/najnowsze.xml'), ]
#(u'Polska', u'www.tvn24.pl/polska.xml'), (u'\u015awiat', u'http://www.tvn24.pl/swiat.xml'), (u'Sport', u'http://www.tvn24.pl/sport.xml'), (u'Biznes', u'http://www.tvn24.pl/biznes.xml'), (u'Meteo', u'http://www.tvn24.pl/meteo.xml'), (u'Micha\u0142ki', u'http://www.tvn24.pl/michalki.xml'), (u'Kultura', u'http://www.tvn24.pl/kultura.xml')]
def preprocess_html(self, soup):
for item in soup.findAll(style=True):
del item['style']
tag = soup.find(name='ul', attrs={'class':'newsItem'})
if tag:
tag.name='div'
tag.li.name='div'
for alink in soup.findAll('a'):
if alink.string is not None:
tstr = alink.string
alink.replaceWith(tstr)
return soup
def postprocess_html(self, soup, first):
#process all the images
for tag in soup.findAll(lambda tag: tag.name.lower()=='img' and tag.has_key('src')):
iurl = tag['src']
img = Image()
img.open(iurl)
if img < 0:
raise RuntimeError('Out of memory')
img.type = "GrayscaleType"
img.save(iurl)
return soup
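The postprocessing pass converts every downloaded image to grayscale through calibre's ImageMagick wrapper (by the time postprocess_html runs, the img src attributes point at local files). The `if img < 0` guard is a holdover from C-style APIs and effectively never fires under Python 2's mixed-type comparison rules; the real work is the three lines around it, sketched standalone with a hypothetical file name:

from calibre.utils.magick import Image

img = Image()
img.open('article_image.jpg')  # hypothetical local path
img.type = 'GrayscaleType'     # ImageMagick's grayscale image class
img.save('article_image.jpg')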

View File

@ -3,6 +3,8 @@
__license__ = 'GPL v3'
__copyright__ = '2010, matek09, matek09@gmail.com'
__copyright__ = 'Modified 2011, Mariusz Wolek <mariusz_dot_wolek @ gmail dot com>'
__copyright__ = 'Modified 2012, Artur Stachecki <artur.stachecki@gmail.com>'
from calibre.web.feeds.news import BasicNewsRecipe
import re
@ -11,7 +13,7 @@ class Wprost(BasicNewsRecipe):
EDITION = 0
FIND_LAST_FULL_ISSUE = True
EXCLUDE_LOCKED = True
ICO_BLOCKED = 'http://www.wprost.pl/G/icons/ico_blocked.gif'
ICO_BLOCKED = 'http://www.wprost.pl/G/layout2/ico_blocked.png'
title = u'Wprost'
__author__ = 'matek09'
@ -20,6 +22,7 @@ class Wprost(BasicNewsRecipe):
no_stylesheets = True
language = 'pl'
remove_javascript = True
recursions = 0
remove_tags_before = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
remove_tags_after = dict(dict(name = 'div', attrs = {'id' : 'print-layer'}))
@ -35,13 +38,15 @@ class Wprost(BasicNewsRecipe):
(re.compile(r'\<td\>\<tr\>\<\/table\>'), lambda match: ''),
(re.compile(r'\<table .*?\>'), lambda match: ''),
(re.compile(r'\<tr>'), lambda match: ''),
(re.compile(r'\<td .*?\>'), lambda match: '')]
(re.compile(r'\<td .*?\>'), lambda match: ''),
(re.compile(r'\<div id="footer"\>.*?\</footer\>'), lambda match: '')]
remove_tags =[]
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def element-date'}))
remove_tags.append(dict(name = 'div', attrs = {'class' : 'def silver'}))
remove_tags.append(dict(name = 'div', attrs = {'id' : 'content-main-column-right'}))
extra_css = '''
.div-header {font-size: x-small; font-weight: bold}
'''
@ -59,27 +64,26 @@ class Wprost(BasicNewsRecipe):
a = 0
if self.FIND_LAST_FULL_ISSUE:
ico_blocked = soup.findAll('img', attrs={'src' : self.ICO_BLOCKED})
a = ico_blocked[-1].findNext('a', attrs={'title' : re.compile('Zobacz spis tre.ci')})
a = ico_blocked[-1].findNext('a', attrs={'title' : re.compile(r'Spis *', re.IGNORECASE | re.DOTALL)})
else:
a = soup.find('a', attrs={'title' : re.compile('Zobacz spis tre.ci')})
a = soup.find('a', attrs={'title' : re.compile(r'Spis *', re.IGNORECASE | re.DOTALL)})
self.EDITION = a['href'].replace('/tygodnik/?I=', '')
self.cover_url = a.img['src']
self.EDITION_SHORT = a['href'].replace('/tygodnik/?I=15', '')
self.cover_url = a.img['src']
def parse_index(self):
self.find_last_issue()
soup = self.index_to_soup('http://www.wprost.pl/tygodnik/?I=' + self.EDITION)
feeds = []
for main_block in soup.findAll(attrs={'class':'main-block-s3 s3-head head-red3'}):
for main_block in soup.findAll(attrs={'id': 'content-main-column-element-content'}):
articles = list(self.find_articles(main_block))
if len(articles) > 0:
section = self.tag_to_string(main_block)
section = self.tag_to_string(main_block.find('h3'))
feeds.append((section, articles))
return feeds
def find_articles(self, main_block):
for a in main_block.findAllNext( attrs={'style':['','padding-top: 15px;']}):
for a in main_block.findAll('a'):
if a.name in "td":
break
if self.EXCLUDE_LOCKED & self.is_blocked(a):
@ -91,3 +95,4 @@ class Wprost(BasicNewsRecipe):
'description' : ''
}

View File

@ -1,5 +1,4 @@
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, Tag
class YemenTimesRecipe(BasicNewsRecipe):
__license__ = 'GPL v3'
@ -13,7 +12,7 @@ class YemenTimesRecipe(BasicNewsRecipe):
category = u'News, Opinion, Yemen'
description = u'Award winning weekly from Yemen, promoting press freedom, professional journalism and the defense of human rights.'
oldest_article = 7
oldest_article = 10
max_articles_per_feed = 100
use_embedded_content = False
encoding = 'utf-8'
@ -21,27 +20,13 @@ class YemenTimesRecipe(BasicNewsRecipe):
remove_empty_feeds = True
no_stylesheets = True
remove_javascript = True
auto_cleanup = True
keep_only_tags = []
keep_only_tags.append(dict(name = 'div', attrs = {'id': 'ctl00_ContentPlaceHolder1_MAINNEWS0_Panel1',
'class': 'DMAIN2'}))
remove_attributes = ['style']
INDEX = 'http://www.yementimes.com/'
feeds = []
feeds.append((u'Our Viewpoint', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=6&pnm=OUR%20VIEWPOINT'))
feeds.append((u'Local News', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=3&pnm=Local%20news'))
feeds.append((u'Their News', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=80&pnm=Their%20News'))
feeds.append((u'Report', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=8&pnm=report'))
feeds.append((u'Health', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=51&pnm=health'))
feeds.append((u'Interview', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=77&pnm=interview'))
feeds.append((u'Opinion', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=7&pnm=opinion'))
feeds.append((u'Business', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=5&pnm=business'))
feeds.append((u'Op-Ed', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=81&pnm=Op-Ed'))
feeds.append((u'Culture', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=75&pnm=Culture'))
feeds.append((u'Readers View', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=4&pnm=Readers%20View'))
feeds.append((u'Variety', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=9&pnm=Variety'))
feeds.append((u'Education', u'http://www.yementimes.com/DEFAULTSUB.ASPX?pnc=57&pnm=Education'))
feeds = [
('News',
'http://www.yementimes.com/?tpl=1341'),
]
extra_css = '''
body {font-family:verdana, arial, helvetica, geneva, sans-serif;}
@ -53,73 +38,4 @@ class YemenTimesRecipe(BasicNewsRecipe):
conversion_options = {'comments': description, 'tags': category, 'language': 'en',
'publisher': publisher, 'linearize_tables': True}
def get_browser(self):
br = BasicNewsRecipe.get_browser()
br.set_handle_gzip(True)
return br
def parse_index(self):
answer = []
for feed_title, feed in self.feeds:
soup = self.index_to_soup(feed)
newsbox = soup.find('div', 'newsbox')
main = newsbox.findNextSibling('table')
articles = []
for li in main.findAll('li'):
title = self.tag_to_string(li.a)
url = self.INDEX + li.a['href']
articles.append({'title': title, 'date': None, 'url': url, 'description': '<br/>&nbsp;'})
answer.append((feed_title, articles))
return answer
def preprocess_html(self, soup):
freshSoup = self.getFreshSoup(soup)
headline = soup.find('div', attrs = {'id': 'DVMTIT'})
if headline:
div = headline.findNext('div', attrs = {'id': 'DVTOP'})
img = None
if div:
img = div.find('img')
headline.name = 'h1'
freshSoup.body.append(headline)
if img is not None:
freshSoup.body.append(img)
byline = soup.find('div', attrs = {'id': 'DVTIT'})
if byline:
date_el = byline.find('span')
if date_el:
pub_date = self.tag_to_string(date_el)
date = Tag(soup, 'div', attrs = [('class', 'yemen_date')])
date.append(pub_date)
date_el.extract()
raw = '<br/>'.join(['%s' % (part) for part in byline.findAll(text = True)])
author = BeautifulSoup('<div class="yemen_byline">' + raw + '</div>')
if date is not None:
freshSoup.body.append(date)
freshSoup.body.append(author)
story = soup.find('div', attrs = {'id': 'DVDET'})
if story:
for table in story.findAll('table'):
if table.find('img'):
table['class'] = 'yemen_caption'
freshSoup.body.append(story)
return freshSoup
def getFreshSoup(self, oldSoup):
freshSoup = BeautifulSoup('<html><head><title></title></head><body></body></html>')
if oldSoup.head.title:
freshSoup.head.title.append(self.tag_to_string(oldSoup.head.title))
return freshSoup

Binary file not shown.

View File

@ -11,7 +11,6 @@ let g:syntastic_cpp_include_dirs = [
\'/usr/include/freetype2',
\'/usr/include/fontconfig',
\'src/qtcurve/common', 'src/qtcurve',
\'src/sfntly/src', 'src/sfntly/src/sample',
\'/usr/include/ImageMagick',
\]
let g:syntastic_c_include_dirs = g:syntastic_cpp_include_dirs

View File

@ -6,7 +6,7 @@ __license__ = 'GPL v3'
__copyright__ = '2009, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import os, socket, struct, subprocess
import os, socket, struct, subprocess, sys
from distutils.spawn import find_executable
from PyQt4 import pyqtconfig
@ -16,6 +16,7 @@ from setup import isosx, iswindows, islinux
OSX_SDK = '/Developer/SDKs/MacOSX10.5.sdk'
os.environ['MACOSX_DEPLOYMENT_TARGET'] = '10.5'
is64bit = sys.maxsize > 2**32
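# (sys.maxsize is 2**63 - 1 on a 64-bit Python, so this detects a 64-bit
# interpreter and hence a 64-bit build.)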
NMAKE = RC = msvc = MT = win_inc = win_lib = win_ddk = win_ddk_lib_dirs = None
if iswindows:

View File

@ -18,8 +18,7 @@ from setup.build_environment import (chmlib_inc_dirs,
msvc, MT, win_inc, win_lib, win_ddk, magick_inc_dirs, magick_lib_dirs,
magick_libs, chmlib_lib_dirs, sqlite_inc_dirs, icu_inc_dirs,
icu_lib_dirs, win_ddk_lib_dirs, ft_libs, ft_lib_dirs, ft_inc_dirs,
zlib_libs, zlib_lib_dirs, zlib_inc_dirs)
from setup.sfntly import SfntlyBuilderMixin
zlib_libs, zlib_lib_dirs, zlib_inc_dirs, is64bit)
MT
isunix = islinux or isosx or isbsd
@ -63,26 +62,8 @@ if isosx:
icu_libs = ['icucore']
icu_cflags = ['-DU_DISABLE_RENAMING'] # Needed to use system libicucore.dylib
class SfntlyExtension(Extension, SfntlyBuilderMixin):
def __init__(self, *args, **kwargs):
Extension.__init__(self, *args, **kwargs)
SfntlyBuilderMixin.__init__(self)
def preflight(self, *args, **kwargs):
self(*args, **kwargs)
extensions = [
SfntlyExtension('sfntly',
['calibre/utils/fonts/sfntly.cpp'],
headers= ['calibre/utils/fonts/sfntly.h'],
libraries=icu_libs,
lib_dirs=icu_lib_dirs,
inc_dirs=icu_inc_dirs,
cflags=icu_cflags
),
Extension('speedup',
['calibre/utils/speedup.c'],
),
@ -297,6 +278,8 @@ if iswindows:
ldflags = '/DLL /nologo /INCREMENTAL:NO /NODEFAULTLIB:libcmt.lib'.split()
#cflags = '/c /nologo /Ox /MD /W3 /EHsc /Zi'.split()
#ldflags = '/DLL /nologo /INCREMENTAL:NO /DEBUG'.split()
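# Assumption: /GS- disables MSVC's buffer-security checks, presumably to
# avoid pulling the security-cookie runtime support into the 64-bit build.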
if is64bit:
cflags.append('/GS-')
for p in win_inc:
cflags.append('-I'+p)

View File

@ -301,7 +301,7 @@ class LinuxFreeze(Command):
export MAGICK_CONFIGURE_PATH=$lib/{1}/config
export MAGICK_CODER_MODULE_PATH=$lib/{1}/modules-Q16/coders
export MAGICK_CODER_FILTER_PATH=$lib/{1}/modules-Q16/filters
$base/bin/{0} "$@"
exec $base/bin/{0} "$@"
''')
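# exec replaces the wrapper shell with the real binary, so signals and the
# exit status propagate to the caller and no extra shell process lingers.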
dest = self.j(self.obj_dir, bname+'.o')

View File

@ -6,13 +6,11 @@ __license__ = 'GPL v3'
__copyright__ = '2009, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import os, shutil, subprocess, re
import os, shutil, subprocess
from setup import Command, __appname__, __version__
from setup.installer import VMInstaller
SIGNTOOL = r'C:\cygwin\home\kovid\sign.bat'
class Win(Command):
description = 'Build windows binary installers'
@ -38,11 +36,7 @@ class Win32(VMInstaller):
def sign_msi(self):
print ('Signing installers ...')
raw = open(self.VM).read()
vmx = re.search(r'''launch_vmware\(['"](.+?)['"]''', raw).group(1)
subprocess.check_call(['vmrun', '-T', 'ws', '-gu', 'kovid', '-gp',
"et tu brutus", 'runProgramInGuest', vmx, 'cmd.exe', '/C',
r'C:\cygwin\home\kovid\sign.bat'])
subprocess.check_call(['ssh', self.VM_NAME, '~/sign.sh'], shell=False)
def download_installer(self):
installer = self.installer()

View File

@ -13,12 +13,11 @@ from setup import (Command, modules, functions, basenames, __version__,
from setup.build_environment import msvc, MT, RC
from setup.installer.windows.wix import WixMixIn
ICU_DIR = r'Q:\icu'
OPENSSL_DIR = r'Q:\openssl'
QT_DIR = 'Q:\\Qt\\4.8.2'
ICU_DIR = os.environ.get('ICU_DIR', r'Q:\icu')
OPENSSL_DIR = os.environ.get('OPENSSL_DIR', r'Q:\openssl')
QT_DIR = os.environ.get('QT_DIR', 'Q:\\Qt\\4.8.2')
QT_DLLS = ['Core', 'Gui', 'Network', 'Svg', 'WebKit', 'Xml', 'XmlPatterns']
QTCURVE = r'C:\plugins\styles'
LIBUNRAR = 'C:\\Program Files\\UnrarDLL\\unrar.dll'
LIBUNRAR = os.environ.get('UNRARDLL', 'C:\\Program Files\\UnrarDLL\\unrar.dll')
SW = r'C:\cygwin\home\kovid\sw'
IMAGEMAGICK = os.path.join(SW, 'build', 'ImageMagick-6.7.6',
'VisualMagick', 'bin')
@ -262,8 +261,8 @@ class Win32Freeze(Command, WixMixIn):
print
print 'Adding third party dependencies'
print '\tAdding unrar'
shutil.copyfile(LIBUNRAR,
os.path.join(self.dll_dir, os.path.basename(LIBUNRAR)))
shutil.copyfile(LIBUNRAR, os.path.join(self.dll_dir,
os.path.basename(LIBUNRAR).replace('64', '')))
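# Presumably a 64-bit build points UNRARDLL at unrar64.dll; stripping the
# '64' means the frozen tree always ships a plain unrar.dll.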
print '\tAdding misc binary deps'
bindir = os.path.join(SW, 'bin')
@ -278,8 +277,10 @@ class Win32Freeze(Command, WixMixIn):
if not ok: continue
dest = self.dll_dir
shutil.copy2(f, dest)
for x in ('zlib1.dll', 'libxml2.dll'):
shutil.copy2(self.j(bindir, x+'.manifest'), self.dll_dir)
for x in ('zlib1.dll', 'libxml2.dll', 'libxslt.dll', 'libexslt.dll'):
msrc = self.j(bindir, x+'.manifest')
if os.path.exists(msrc):
shutil.copy2(msrc, self.dll_dir)
# Copy ImageMagick
for pat in ('*.dll', '*.xml'):

View File

@ -4,16 +4,98 @@ Notes on setting up the windows development environment
Overview
----------
calibre and all its dependencies are compiled using Visual Studio 2008. All the
following instructions must be run in a visual studio command prompt (the
various commands use unix notation, so if you want to use them directly, you
have to setup cygwin).

calibre contains build scripts to automate the building of the calibre
installer. These scripts make certain assumptions about where dependencies are
installed. Your best bet is to set up a VM and replicate the paths mentioned
below exactly.
Microsoft Visual Studio and Windows SDK
----------------------------------------
You have to use Visual Studio 2008 as that is the version Python 2.x works
with.
You need Visual Studio 2008 Express Edition for 32-bit builds and Professional
for 64-bit builds.
1) Install Visual Studio
2) Install Visual Studio SP1 from http://www.microsoft.com/en-us/download/details.aspx?id=10986
(First check if the version of VS 2008 you have is not already SP1)
3) Install The Windows SDK. You need to install a version that is built for VS
2008. Get it from here: http://www.microsoft.com/en-us/download/details.aspx?id=3138
4) If you are building 64bit, edit the properties of the Visual Studio command
prompt shortcut to pass "amd64" instead of "x86" to the vsvars.bat file so that
it uses the 64 bit tools.
I've read that it is possible to use the 64-bit compiler that comes with the
Windows SDK with VS 2008 Express Edition, but I can't be bothered figuring it
out. Just use the Professional Edition.
Cygwin
------------
This is needed for automation of the build process, and the ease of use of the
unix shell (bash).
Install vim, rsync, openssh, unzip, wget and make at a minimum.
After installing python run::
python setup/vcvars.py && echo 'source ~/.vcvars' >> ~/.bash_profile
To allow you to use the visual studio tools in the cygwin shell.
The following is only needed for automation (setting up ssh access to the
windows machine).
In order to make debug builds (.pdb files) and to sign files, you have to be
able to log in as the normal user account with ssh. To do this, follow these
steps:
* Setup a password for your user account
* Follow the steps here:
http://pcsupport.about.com/od/windows7/ht/auto-logon-windows-7.htm or
http://pcsupport.about.com/od/windowsxp/ht/auto-logon-xp.htm to allow the
machine to bootup without having to enter the password
* First clean out any existing cygwin ssh setup with::
net stop sshd
cygrunsrv -R sshd
net user sshd /DELETE
net user cyg_server /DELETE (delete any other cygwin user accounts; you can
list them with net user)
rm -R /etc/ssh*
mkpasswd -cl > /etc/passwd
mkgroup --local > /etc/group
* Assign the necessary rights to the normal user account::
editrights.exe -a SeAssignPrimaryTokenPrivilege -u kovid
editrights.exe -a SeCreateTokenPrivilege -u kovid
editrights.exe -a SeTcbPrivilege -u kovid
editrights.exe -a SeServiceLogonRight -u kovid
* Run::
ssh-host-config
And answer (yes) to all questions. If it asks whether you want to use a
different user name, specify the name of your user account and enter the
username and password (it asks on Win 7, not on Win XP).
* On Windows XP, I also had to run::
passwd -R
to allow sshd to use my normal user account even with public key
authentication. See http://cygwin.com/cygwin-ug-net/ntsec.html for
details. On Windows 7 this wasn't necessary for some reason.
* Start sshd with::
net start sshd
* See http://www.kgx.net.nz/2010/03/cygwin-sshd-and-windows-7/ for details
Pass port 22 through Windows firewall. Create ~/.ssh/authorized_keys
Basic dependencies
--------------------
Install MS Visual Studio 2008, cmake, python and WiX (WiX is used to generate
the .msi installer).
You have to set the CMAKE_PREFIX_PATH environment variable to
C:\cygwin\home\kovid\sw
@ -21,10 +103,16 @@ This is where all dependencies will be installed.
Add C:\Python27\Scripts and C:\Python27 to PATH
Edit mimetypes.py in C:\Python27\Lib and set _winreg = None to prevent reading
of mimetypes from the windows registry.
Python packages
------------------

Install setuptools from http://pypi.python.org/pypi/setuptools. If there are no
windows binaries already compiled for the version of python you are using then
download the source and run the following command in the folder where the
source has been unpacked::
python setup.py install
@ -32,10 +120,9 @@ Run the following command to install python dependencies::
easy_install --always-unzip -U mechanize pyreadline python-dateutil dnspython cssutils clientform pycrypto cssselect
Install BeautifulSoup 3.0.x manually into site-packages (3.1.x parses broken HTML very poorly)
Install pywin32 and edit win32com\__init__.py setting _frozen = True and
__gen_path__ to a temp dir (otherwise it tries to set it to a dir in the
install tree which leads to permission errors).
Note that you should use::
import tempfile
@ -43,42 +130,58 @@ Note that you should use::
tempfile.gettempdir(), "gen_py",
"%d.%d" % (sys.version_info[0], sys.version_info[1]))
Use gettempdir instead of the win32 api method as gettempdir returns a temp dir
that is guaranteed to actually work.
Also edit win32com\client\gencache.py and change the except IOError on line 57
to catch all exceptions.
SQLite
---------
Put sqlite3*.h from the sqlite windows amalgamation in ~/sw/include
APSW
-----
Download source from http://code.google.com/p/apsw/downloads/list and run in a
visual studio prompt::

python setup.py fetch --all --missing-checksum-ok build --enable-all-extensions install test
OpenSSL
--------
First install ActiveState Perl if you don't already have perl in windows.
Then, get nasm.exe from
http://www.nasm.us/pub/nasm/releasebuilds/2.05/nasm-2.05-win32.zip and put it
somewhere on your PATH (I chose ~/sw/bin).

Download and untar the openssl tarball and follow the instructions in
INSTALL.(W32|W64). To install, use prefix q:\openssl.
For 32-bit::
perl Configure VC-WIN32 no-asm enable-static-engine --prefix=Q:/openssl
ms\do_ms.bat
nmake -f ms\ntdll.mak
nmake -f ms\ntdll.mak test
nmake -f ms\ntdll.mak install
For 64-bit::
perl Configure VC-WIN64A no-asm enable-static-engine --prefix=C:/cygwin/home/kovid/sw/private/openssl
ms\do_win64a
nmake -f ms\ntdll.mak
nmake -f ms\ntdll.mak test
nmake -f ms\ntdll.mak install
Qt
--------
Download Qt sourcecode (.zip) from: http://qt-project.org/downloads
Extract Qt sourcecode to C:\Qt\4.x.x.
Qt uses its own routine to locate and load "system libraries" including the
openssl libraries needed for "Get Books". This means that we have to apply the
following patch to have Qt load the openssl libraries bundled with calibre:
--- src/corelib/plugin/qsystemlibrary.cpp 2011-02-22 05:04:00.000000000 -0700
@ -97,7 +200,7 @@ Now, run configure and make::
-no-plugin-manifests is needed so that loading the plugins does not fail looking for the CRT assembly
./configure.exe -ltcg -opensource -release -qt-zlib -qt-libmng -qt-libpng -qt-libtiff -qt-libjpeg -release -platform win32-msvc2008 -no-qt3support -webkit -xmlpatterns -no-phonon -no-style-plastique -no-style-cleanlooks -no-style-motif -no-style-cde -no-declarative -no-scripttools -no-audio-backend -no-multimedia -no-dbus -no-openvg -no-opengl -no-qt3support -confirm-license -nomake examples -nomake demos -nomake docs -nomake tools -no-plugin-manifests -openssl -I $OPENSSL_DIR/include -L $OPENSSL_DIR/lib && nmake
Add the path to the bin folder inside the Qt dir to your system PATH.
@ -106,9 +209,7 @@ SIP
Available from: http://www.riverbankcomputing.co.uk/software/sip/download ::
python configure.py -p win32-msvc2008 && nmake && nmake install
PyQt4
----------
@ -119,15 +220,6 @@ Compiling instructions::
nmake
nmake install
ICU
-------
@ -151,71 +243,63 @@ Optionally run make check
Libunrar
----------
Get the source from http://www.rarlab.com/rar_add.htm

Open UnrarDll.vcproj, change build type to release.
If building 64 bit, change Win32 to x64.

Build the Solution, find the dll in the build subdir. As best as I can tell,
the vcproj already defines the SILENT preprocessor directive, but you should
test this.

.. http://www.rarlab.com/rar/UnRARDLL.exe install and add C:\Program Files\UnrarDLL to PATH
TODO: 64-bit: check that SILENT is defined and that the ctypes bindings
actually work.
zlib
------
Build with::

nmake -f win32/Makefile.msc
nmake -f win32/Makefile.msc test

cp zlib1.dll* ../../bin
cp zlib.lib zdll.* ../../lib
cp zconf.h zlib.h ../../include
jpeg-8
-------
Get the source code from: http://sourceforge.net/projects/libjpeg-turbo/files/
Run::
chmod +x cmakescripts/* && cd build
cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DWITH_JPEG8=1 ..
nmake
cp sharedlib/jpeg8.dll* ~/sw/bin/
cp sharedlib/jpeg.lib ~/sw/lib/
cp jconfig.h ../jerror.h ../jpeglib.h ../jmorecfg.h ~/sw/include
libpng
---------
Download the libpng .zip source file from:
http://www.libpng.org/pub/png/libpng.html

Run::

mkdir build && cd build
cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DZLIB_INCLUDE_DIR=C:/cygwin/home/kovid/sw/include -DZLIB_LIBRARY=C:/cygwin/home/kovid/sw/lib/zdll.lib ..
nmake
cp libpng*.dll ~/sw/bin/
cp libpng*.lib ~/sw/lib/
cp pnglibconf.h ../png.h ../pngconf.h ~/sw/include/
freetype
-----------
Get the .zip source from: http://download.savannah.gnu.org/releases/freetype/
Edit *all copies* of the file ftoption.h and add to generate a .lib
and a correct dll
@ -225,38 +309,123 @@ and a correct dll
VS 2008 .sln file is present, open it

* If you are doing x64 build, click the Win32 dropdown, select
  Configuration manager->Active solution platform -> New -> x64

* Change active build type to release multithreaded

* Project->Properties->Configuration Properties change configuration type
  to dll and build solution

cp "`find . -name *.dll`" ~/sw/bin/
cp "`find . -name freetype.lib`" ~/sw/lib/

Now change configuration back to static for .lib and build solution

cp "`find . -name freetype*MT.lib`" ~/sw/lib/

cp -rf include/* ~/sw/include/
TODO: Test if this bloody thing actually works on 64 bit (apparently freetype
assumes sizeof(long) == sizeof(ptr), which is not true on Win64; see for
example: http://forum.openscenegraph.org/viewtopic.php?t=2880).
expat
--------
Get from: http://sourceforge.net/projects/expat/files/expat/
Apparently expat requires stdint.h which VS 2008 does not have. So we get our
own.
Run::
cd lib
wget http://msinttypes.googlecode.com/svn/trunk/stdint.h
mkdir build && cd build
cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release ..
nmake
cp expat.dll ~/sw/bin/ && cp expat.lib ~/sw/lib/
cp ../lib/expat.h ../lib/expat_external.h ~/sw/include
libiconv
----------
Run::
mkdir vs2008 && cd vs2008
Then follow these instructions:
http://www.codeproject.com/Articles/302012/How-to-Build-libiconv-with-Microsoft-Visual-Studio
Change the type to Release and config to x64 or Win32 and Build solution and
then::
cp "`find . -name *.dll`" ~/sw/bin/
cp "`find . -name *.dll.manifest`" ~/sw/bin/
cp "`find . -name *.lib`" ~/sw/lib/iconv.lib
cp "`find . -name iconv.h`" ~/sw/include/
Information for using a static version of libiconv is at the link above.
libxml2
-------------
Get it from: ftp://xmlsoft.org/libxml2/
Run::
cd win32
cscript.exe configure.js include=C:/cygwin/home/kovid/sw/include lib=C:/cygwin/home/kovid/sw/lib prefix=C:/cygwin/home/kovid/sw zlib=yes iconv=yes
nmake /f Makefile.msvc
mkdir -p ~/sw/include/libxml2/libxml
cp include/libxml/*.h ~/sw/include/libxml2/libxml/
find . -type f \( -name "*.dll" -o -name "*.dll.manifest" \) -exec cp "{}" ~/sw/bin/ \;
find . -name libxml2.lib -exec cp "{}" ~/sw/lib/ \;
libxslt
---------
Get it from: ftp://xmlsoft.org/libxml2/
Run::
cd win32
cscript.exe configure.js include=C:/cygwin/home/kovid/sw/include include=C:/cygwin/home/kovid/sw/include/libxml2 lib=C:/cygwin/home/kovid/sw/lib prefix=C:/cygwin/home/kovid/sw zlib=yes iconv=yes
nmake /f Makefile.msvc
mkdir -p ~/sw/include/libxslt ~/sw/include/libexslt
cp libxslt/*.h ~/sw/include/libxslt/
cp libexslt/*.h ~/sw/include/libexslt/
find . -type f \( -name "*.dll" -o -name "*.dll.manifest" \) -exec cp "{}" ~/sw/bin/ \;
find . -name lib*xslt.lib -exec cp "{}" ~/sw/lib/ \;
lxml
------
Get the source from: http://pypi.python.org/pypi/lxml
Add the following to the top of setupoptions.py::
if option == 'cflags':
    return ['-IC:/cygwin/home/kovid/sw/include/libxml2',
            '-IC:/cygwin/home/kovid/sw/include']
else:
    return ['-LC:/cygwin/home/kovid/sw/lib']
Then, edit src/lxml/includes/etree_defs.h and change the section starting with
#ifndef LIBXML2_NEW_BUFFER
to
#ifdef LIBXML2_NEW_BUFFER
# define xmlBufContent(buf) xmlBufferContent(buf)
# define xmlBufLength(buf) xmlBufferLength(buf)
#endif
Run::
python setup.py install
Python Imaging Library
------------------------
Install as normal using installer at http://www.lfd.uci.edu/~gohlke/pythonlibs/
Test it on the target system with
calibre-debug -c "from PIL import _imaging, _imagingmath, _imagingft, _imagingcms"
kdewin32-msvc
----------------
@ -352,6 +521,8 @@ cp -r build/lib.win32-*/* /cygdrive/c/Python27/Lib/site-packages/
easylzma
----------
This is only needed to build the portable installer.
Get it from http://lloyd.github.com/easylzma/ (use the trunk version)
Run cmake and build the Visual Studio solution (generates CLI tools and dll and

File diff suppressed because it is too large.

View File

@ -9,14 +9,14 @@ msgstr ""
"Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-"
"devel@lists.alioth.debian.org>\n"
"POT-Creation-Date: 2011-11-25 14:01+0000\n"
"PO-Revision-Date: 2012-04-18 13:08+0000\n"
"Last-Translator: Asier Iturralde Sarasola <Unknown>\n"
"PO-Revision-Date: 2012-10-29 14:16+0000\n"
"Last-Translator: gorkaazk <gorkaazkarate@euskalerria.org>\n"
"Language-Team: Euskara <itzulpena@comtropos.com>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Launchpad-Export-Date: 2012-04-19 04:36+0000\n"
"X-Generator: Launchpad (build 15108)\n"
"X-Launchpad-Export-Date: 2012-10-30 04:44+0000\n"
"X-Generator: Launchpad (build 16206)\n"
"Language: eu\n"
#. name for aaa
@ -73,7 +73,7 @@ msgstr "Anambé"
#. name for aao
msgid "Arabic; Algerian Saharan"
msgstr ""
msgstr "Arabiera, Aljeriako Saharakoa"
#. name for aap
msgid "Arára; Pará"
@ -181,31 +181,32 @@ msgstr "Abazera"
#. name for abr
msgid "Abron"
msgstr ""
msgstr "Abron; (bono hizkuntza, Ghana)"
#. name for abs
msgid "Malay; Ambonese"
msgstr ""
msgstr "Malaysiera; (\"ambonese\" hizkuntza)"
#. name for abt
msgid "Ambulas"
msgstr ""
msgstr "Ambulas hizkuntza"
#. name for abu
msgid "Abure"
msgstr ""
"Abure hizkuntza (edo abonwa; edo akaplass) (Niger, Kongo, Boli-kosta)"
#. name for abv
msgid "Arabic; Baharna"
msgstr ""
msgstr "Arabiera; baharna"
#. name for abw
msgid "Pal"
msgstr ""
msgstr "Pal hizkuntza (Papua)"
#. name for abx
msgid "Inabaknon"
msgstr ""
msgstr "Inabaknon hizkuntza (edo abaknon, Filipina uharteak)"
#. name for aby
msgid "Aneme Wake"

View File

@ -1,93 +0,0 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import shlex, os
from glob import glob
from setup import iswindows
class Group(object):
def __init__(self, name, base, build_base, cflags):
self.name = name
self.cflags = cflags
self.headers = frozenset(glob(os.path.join(base, '*.h')))
self.src_files = glob(os.path.join(base, '*.cc'))
self.bdir = os.path.abspath(os.path.join(build_base, name))
if not os.path.exists(self.bdir):
os.makedirs(self.bdir)
self.objects = [os.path.join(self.bdir,
os.path.basename(x).rpartition('.')[0] + ('.obj' if iswindows else
'.o')) for x in self.src_files]
def __call__(self, compiler, linker, builder, all_headers):
for src, obj in zip(self.src_files, self.objects):
if builder.newer(obj, [src] + list(all_headers)):
sinc = ['/Tp'+src] if iswindows else ['-c', src]
oinc = ['/Fo'+obj] if iswindows else ['-o', obj]
cmd = [compiler] + self.cflags + sinc + oinc
builder.info(' '.join(cmd))
builder.check_call(cmd)
class SfntlyBuilderMixin(object):
def __init__(self):
self.sfntly_cflags = [
'-DSFNTLY_NO_EXCEPTION',
'-DSFNTLY_EXPERIMENTAL',
]
if iswindows:
self.sfntly_cflags += [
'-D_UNICODE', '-DUNICODE',
] + shlex.split('/W4 /WX /Gm- /Gy /GR-')
self.cflags += ['-DWIN32']
else:
# Possibly add -fno-inline (slower, but more robust)
self.sfntly_cflags += [
'-Werror',
'-fno-exceptions',
]
if len(self.libraries) > 1:
self.libraries = ['icuuc']
if not iswindows:
self.libraries += ['pthread']
def __call__(self, obj_dir, compiler, linker, builder, cflags, ldflags):
self.sfntly_build_dir = os.path.join(obj_dir, 'sfntly')
if '/Ox' in cflags:
cflags.remove('/Ox')
if '-O3' in cflags:
cflags.remove('-O3')
if '/W3' in cflags:
cflags.remove('/W3')
if '-ggdb' not in cflags:
cflags.insert(0, '/O2' if iswindows else '-O2')
groups = []
all_headers = set()
all_objects = []
src_dir = self.absolutize([os.path.join('sfntly', 'src')])[0]
inc_dirs = [src_dir]
self.inc_dirs += inc_dirs
inc_flags = builder.inc_dirs_to_cflags(self.inc_dirs)
for loc in ('', 'port', 'data', 'math', 'table', 'table/bitmap',
'table/core', 'table/truetype'):
path = os.path.join(src_dir, 'sfntly', *loc.split('/'))
gr = Group(loc, path, self.sfntly_build_dir, cflags+
inc_flags+self.sfntly_cflags+self.cflags)
groups.append(gr)
all_headers |= gr.headers
all_objects.extend(gr.objects)
for group in groups:
group(compiler, linker, builder, all_headers)
self.extra_objs = all_objects

setup/vcvars.py (new file)
View File

@ -0,0 +1,82 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import os, sys, subprocess
from distutils.msvc9compiler import find_vcvarsall, get_build_version
plat = 'amd64' if sys.maxsize > 2**32 else 'x86'
def remove_dups(variable):
old_list = variable.split(os.pathsep)
new_list = []
for i in old_list:
if i not in new_list:
new_list.append(i)
return os.pathsep.join(new_list)
def query_process(cmd):
result = {}
popen = subprocess.Popen(cmd, stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
try:
stdout, stderr = popen.communicate()
if popen.wait() != 0:
raise RuntimeError(stderr.decode("mbcs"))
stdout = stdout.decode("mbcs")
for line in stdout.splitlines():
if '=' not in line:
continue
line = line.strip()
key, value = line.split('=', 1)
key = key.lower()
if key == 'path':
if value.endswith(os.pathsep):
value = value[:-1]
value = remove_dups(value)
result[key] = value
finally:
popen.stdout.close()
popen.stderr.close()
return result
def query_vcvarsall():
vcvarsall = find_vcvarsall(get_build_version())
return query_process('"%s" %s & set' % (vcvarsall, plat))
env = query_vcvarsall()
paths = env['path'].split(';')
lib = env['lib']
include = env['include']
libpath = env['libpath']
def unix(paths):
up = []
for p in paths:
prefix, p = p.replace(os.sep, '/').partition('/')[0::2]
up.append('/cygdrive/%s/%s'%(prefix[0].lower(), p))
return ':'.join(up)
raw = '''\
#!/bin/sh
export PATH="%s:$PATH"
export LIB="%s"
export INCLUDE="%s"
export LIBPATH="%s"
'''%(unix(paths), lib, include, libpath)
with open(os.path.expanduser('~/.vcvars'), 'wb') as f:
f.write(raw.encode('utf-8'))
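For illustration, unix() rewrites Windows paths into Cygwin mount-point form (this assumes the script runs under a native Windows Python, where os.sep is a backslash):

# unix(['C:\\Python27', 'Q:\\icu\\bin'])
# -> '/cygdrive/c/Python27:/cygdrive/q/icu/bin'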

View File

@ -4,7 +4,7 @@ __license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en'
__appname__ = u'calibre'
numeric_version = (0, 9, 4)
numeric_version = (0, 9, 6)
__version__ = u'.'.join(map(unicode, numeric_version))
__author__ = u"Kovid Goyal <kovid@kovidgoyal.net>"
@ -91,7 +91,6 @@ class Plugins(collections.Mapping):
'speedup',
'freetype',
'woff',
'sfntly',
]
if iswindows:
plugins.extend(['winutil', 'wpd', 'winfonts'])

View File

@ -212,7 +212,7 @@ def main(args=sys.argv):
return
if len(args) > 1 and args[1] in ('-f', '--subset-font'):
from calibre.utils.fonts.subset import main
from calibre.utils.fonts.sfnt.subset import main
main(['subset-font']+args[2:])
return

View File

@ -18,7 +18,7 @@ from calibre.ebooks.metadata import author_to_author_sort
class Book(Book_):
def __init__(self, prefix, lpath, title=None, authors=None, mime=None, date=None, ContentType=None,
thumbnail_name=None, size=0, other=None):
thumbnail_name=None, size=None, other=None):
# debug_print('Book::__init__ - title=', title)
show_debug = title is not None and title.lower().find("xxxxx") >= 0
if show_debug:
@ -57,7 +57,7 @@ class Book(Book_):
except:
self.datetime = time.gmtime()
self.contentID = None
self.contentID = None
self.current_shelves = []
self.kobo_collections = []
@ -65,7 +65,8 @@ class Book(Book_):
self.thumbnail = ImageWrapper(thumbnail_name)
if show_debug:
debug_print("Book::__init__ - self=", self)
debug_print("Book::__init__ end - self=", self)
debug_print("Book::__init__ end - title=", title, 'authors=', authors)
class ImageWrapper(object):

View File

@ -517,7 +517,7 @@ class KOBO(USBMS):
lpath = lpath[1:]
#print "path: " + lpath
book = self.book_class(prefix, lpath, other=info)
if book.size is None:
if book.size is None or book.size == 0:
book.size = os.stat(self.normalize_path(path)).st_size
b = booklists[blist].add_book(book, replace_metadata=True)
if b:
@ -667,6 +667,7 @@ class KOBO(USBMS):
[_('Unknown')])
size = os.stat(cls.normalize_path(os.path.join(prefix, lpath))).st_size
book = cls.book_class(prefix, lpath, title, authors, mime, date, ContentType, ImageID, size=size, other=mi)
return book
def get_device_paths(self):
@ -1430,6 +1431,7 @@ class KOBOTOUCH(KOBO):
idx = bl_cache.get(lpath, None)
if idx is not None:# and not (accessibility == 1 and isdownloaded == 'false'):
if show_debug:
self.debug_index = idx
debug_print("KoboTouch:update_booklist - idx=%d"%idx)
debug_print('KoboTouch:update_booklist - bl[idx].device_collections=', bl[idx].device_collections)
debug_print('KoboTouch:update_booklist - playlist_map=', playlist_map)
@ -1440,8 +1442,9 @@ class KOBOTOUCH(KOBO):
debug_print('KoboTouch:update_booklist - the authors=', bl[idx].authors)
debug_print('KoboTouch:update_booklist - application_id=', bl[idx].application_id)
bl_cache[lpath] = None
if bl[idx].title_sort is not None:
bl[idx].title = bl[idx].title_sort
# removed to allow recognizing of ePub with an UUID inside
# if bl[idx].title_sort is not None:
# bl[idx].title = bl[idx].title_sort
if ImageID is not None:
imagename = self.imagefilename_from_imageID(ImageID)
if imagename is not None:
@ -1463,13 +1466,13 @@ class KOBOTOUCH(KOBO):
bl[idx].device_collections = playlist_map.get(lpath,[])
bl[idx].current_shelves = bookshelves
bl[idx].kobo_collections = kobo_collections
changed = True
if show_debug:
debug_print('KoboTouch:update_booklist - updated bl[idx].device_collections=', bl[idx].device_collections)
debug_print('KoboTouch:update_booklist - playlist_map=', playlist_map, 'changed=', changed)
# debug_print('KoboTouch:update_booklist - book=', bl[idx])
debug_print("KoboTouch:update_booklist - book class=%s"%bl[idx].__class__)
debug_print("KoboTouch:update_booklist - book title=%s"%bl[idx].title)
else:
if show_debug:
debug_print('KoboTouch:update_booklist - idx is none')
@ -1493,7 +1496,7 @@ class KOBOTOUCH(KOBO):
if show_debug:
debug_print('KoboTouch:update_booklist - class:', book.__class__)
# debug_print(' resolution:', book.__class__.__mro__)
debug_print(" contentid:'%s'"%book.contentID)
debug_print(" contentid: '%s'"%book.contentID)
debug_print(" title:'%s'"%book.title)
debug_print(" the book:", book)
debug_print(" author_sort:'%s'"%book.author_sort)
@ -1511,6 +1514,7 @@ class KOBOTOUCH(KOBO):
changed = True
if show_debug:
debug_print(' book.device_collections', book.device_collections)
debug_print(' book.title', book.title)
except: # Probably a path encoding error
import traceback
traceback.print_exc()
@ -1533,6 +1537,7 @@ class KOBOTOUCH(KOBO):
# debug_print("KoboTouch:get_bookshelvesforbook - count bookshelves=" + unicode(count_bookshelves))
return bookshelves
self.debug_index = 0
import sqlite3 as sqlite
with closing(sqlite.connect(
self.normalize_path(self._main_prefix +
@ -1634,8 +1639,11 @@ class KOBOTOUCH(KOBO):
# Do the operation in reverse order so indices remain valid
for idx in sorted(bl_cache.itervalues(), reverse=True):
if idx is not None:
need_sync = True
del bl[idx]
if not os.path.exists(self.normalize_path(os.path.join(prefix, bl[idx].lpath))):
need_sync = True
del bl[idx]
# else:
# debug_print("KoboTouch:books - Book in mtadata.calibre, on file system but not database - bl[idx].title:'%s'"%bl[idx].title)
#print "count found in cache: %d, count of files in metadata: %d, need_sync: %s" % \
# (len(bl_cache), len(bl), need_sync)
@ -1649,6 +1657,7 @@ class KOBOTOUCH(KOBO):
USBMS.sync_booklists(self, (None, bl, None))
else:
USBMS.sync_booklists(self, (bl, None, None))
debug_print("KoboTouch:books - have done sync_booklists")
self.report_progress(1.0, _('Getting list of books on device...'))
debug_print("KoboTouch:books - end - oncard='%s'"%oncard)
@ -1893,7 +1902,7 @@ class KOBOTOUCH(KOBO):
# debug_print("KoboTouch:update_device_database_collections - self.bookshelvelist=", self.bookshelvelist)
# Process any collections that exist
for category, books in collections.items():
debug_print("KoboTouch:update_device_database_collections - category='%s'"%category)
debug_print("KoboTouch:update_device_database_collections - category='%s' books=%d"%(category, len(books)))
if create_bookshelves and not (category in supportedcategories or category in readstatuslist or category in accessibilitylist):
self.check_for_bookshelf(connection, category)
# if category in self.bookshelvelist:
@ -1905,9 +1914,11 @@ class KOBOTOUCH(KOBO):
debug_print(' Title="%s"'%book.title, 'category="%s"'%category)
# debug_print(book)
debug_print(' class=%s'%book.__class__)
# debug_print(' resolution:', book.__class__.__mro__)
# debug_print(' subclasses:', book.__class__.__subclasses__())
debug_print(' book.contentID="%s"'%book.contentID)
debug_print(' book.application_id="%s"'%book.application_id)
if book.application_id is None:
continue
category_added = False
@ -1923,7 +1934,7 @@ class KOBOTOUCH(KOBO):
if category not in book.device_collections:
if show_debug:
debug_print(' Setting bookshelf on device')
self.set_bookshelf(connection, book.contentID, category)
self.set_bookshelf(connection, book, category)
category_added = True
elif category in readstatuslist.keys():
# Manage ReadStatus
@ -1955,12 +1966,10 @@ class KOBOTOUCH(KOBO):
else: # No collections
# Since no collections exist the ReadStatus needs to be reset to 0 (Unread)
debug_print("No Collections - reseting ReadStatus")
if oncard == 'carda':
debug_print("Booklists=", booklists)
if self.dbversion < 53:
self.reset_readstatus(connection, oncard)
if self.dbversion >= 14:
debug_print("No Collections - reseting FavouritesIndex")
debug_print("No Collections - resetting FavouritesIndex")
self.reset_favouritesindex(connection, oncard)
if self.supports_bookshelves():
@ -2188,16 +2197,23 @@ class KOBOTOUCH(KOBO):
return bookshelves
def set_bookshelf(self, connection, ContentID, bookshelf):
show_debug = self.is_debugging_title(ContentID)
def set_bookshelf(self, connection, book, shelfName):
show_debug = self.is_debugging_title(book.title)
if show_debug:
debug_print('KoboTouch:set_bookshelf ContentID=' + ContentID)
test_query = 'SELECT 1 FROM ShelfContent WHERE ShelfName = ? and ContentId = ?'
test_values = (bookshelf, ContentID, )
debug_print('KoboTouch:set_bookshelf book.ContentID="%s"'%book.contentID)
debug_print('KoboTouch:set_bookshelf book.current_shelves="%s"'%book.current_shelves)
if shelfName in book.current_shelves:
if show_debug:
debug_print(' book already on shelf.')
return
test_query = 'SELECT _IsDeleted FROM ShelfContent WHERE ShelfName = ? and ContentId = ?'
test_values = (shelfName, book.contentID, )
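# _IsDeleted is fetched so that a shelf row the device has soft-deleted can
# be flipped back to "false" by the update query below, instead of a
# duplicate row being inserted.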
addquery = 'INSERT INTO ShelfContent ("ShelfName","ContentId","DateModified","_IsDeleted","_IsSynced") VALUES (?, ?, ?, "false", "false")'
add_values = (bookshelf, ContentID, time.strftime(self.TIMESTAMP_STRING, time.gmtime()), )
add_values = (shelfName, book.contentID, time.strftime(self.TIMESTAMP_STRING, time.gmtime()), )
updatequery = 'UPDATE ShelfContent SET _IsDeleted = "false" WHERE ShelfName = ? and ContentId = ?'
update_values = (bookshelf, ContentID, )
update_values = (shelfName, book.contentID, )
cursor = connection.cursor()
cursor.execute(test_query, test_values)
@ -2207,9 +2223,9 @@ class KOBOTOUCH(KOBO):
debug_print(' Did not find a record - adding')
cursor.execute(addquery, add_values)
connection.commit()
else:
elif result[0] == 'true':
if show_debug:
debug_print(' Found a record - updating')
debug_print(' Found a record - updating - result=', result)
cursor.execute(updatequery, update_values)
connection.commit()
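
The probe/insert/undelete sequence above is effectively an upsert against the device's ShelfContent table, with book.current_shelves acting as a cheap cache that skips the database round trip entirely. A minimal standalone sketch of the same pattern, using the table and column names from the queries above (the timestamp format string is an assumption):

import sqlite3, time

TIMESTAMP_STRING = '%Y-%m-%dT%H:%M:%SZ'  # assumed; mirrors the driver's format

def set_bookshelf(connection, content_id, shelf_name):
    # Probe first: _IsDeleted distinguishes "never on this shelf" from
    # "was on the shelf but removed on the device".
    cursor = connection.cursor()
    cursor.execute('SELECT _IsDeleted FROM ShelfContent '
                   'WHERE ShelfName = ? AND ContentId = ?',
                   (shelf_name, content_id))
    result = cursor.fetchone()
    if result is None:
        # No record at all: insert a fresh, undeleted, unsynced row.
        cursor.execute('INSERT INTO ShelfContent '
                       '("ShelfName","ContentId","DateModified","_IsDeleted","_IsSynced") '
                       'VALUES (?, ?, ?, "false", "false")',
                       (shelf_name, content_id,
                        time.strftime(TIMESTAMP_STRING, time.gmtime())))
    elif result[0] == 'true':
        # Row exists but was deleted on the device: undelete it.
        cursor.execute('UPDATE ShelfContent SET _IsDeleted = "false" '
                       'WHERE ShelfName = ? AND ContentId = ?',
                       (shelf_name, content_id))
    connection.commit()

connection = sqlite3.connect('KoboReader.sqlite')  # the database on the device
set_bookshelf(connection, 'file:///mnt/onboard/book.epub', 'Favourites')
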

View File

@ -12,6 +12,7 @@ import os
import cStringIO
from calibre.constants import isosx
from calibre.devices.usbms.driver import USBMS
class NOOK(USBMS):
@ -84,6 +85,8 @@ class NOOK_COLOR(NOOK):
description = _('Communicate with the Nook Color, TSR and Tablet eBook readers.')
PRODUCT_ID = [0x002, 0x003, 0x004]
if isosx:
PRODUCT_ID.append(0x005) # Nook HD+
BCD = [0x216]
WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = ['EBOOK_DISK', 'NOOK_TABLET',

View File

@ -14,6 +14,7 @@ device. This class handles device detection.
import os, subprocess, time, re, sys, glob
from itertools import repeat
from calibre import prints, as_unicode
from calibre.devices.interface import DevicePlugin
from calibre.devices.errors import DeviceError
from calibre.devices.usbms.deviceconfig import DeviceConfig
@ -901,8 +902,11 @@ class Device(DeviceConfig, DevicePlugin):
for d in drives:
try:
winutil.eject_drive(bytes(d)[0])
except:
pass
except Exception as e:
try:
prints("Eject failed:", as_unicode(e))
except:
pass
t = Thread(target=do_it, args=[drives])
t.daemon = True

View File

@ -133,6 +133,7 @@ def add_pipeline_options(parser, plumber):
[
'base_font_size', 'disable_font_rescaling',
'font_size_mapping', 'embed_font_family',
'subset_embedded_fonts',
'line_height', 'minimum_line_height',
'linearize_tables',
'extra_css', 'filter_css',

View File

@ -150,8 +150,15 @@ class EPUBInput(InputFormatPlugin):
from calibre import walk
from calibre.ebooks import DRMError
from calibre.ebooks.metadata.opf2 import OPF
zf = ZipFile(stream)
zf.extractall(os.getcwdu())
try:
zf = ZipFile(stream)
zf.extractall(os.getcwdu())
except:
log.exception('EPUB appears to be an invalid ZIP file, trying a'
' more forgiving ZIP parser')
from calibre.utils.localunzip import extractall
stream.seek(0)
extractall(stream)
encfile = os.path.abspath(os.path.join('META-INF', 'encryption.xml'))
opf = self.find_opf()
if opf is None:
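
The fallback works because a ZIP archive can be walked from the front via its local file headers even when the central directory at the end (the part that is usually truncated or damaged) is unreadable. The following is not calibre's localunzip implementation, just a rough sketch of the idea; entries that defer their sizes to a trailing data descriptor are skipped for brevity:

import struct, zlib

SIG = b'PK\x03\x04'  # local file header signature

def iter_local_files(data):
    # Scan for local file headers instead of trusting the central
    # directory. Layout after the 4-byte signature: 5 shorts, 3 longs,
    # then the name and extra-field lengths.
    pos = 0
    while True:
        pos = data.find(SIG, pos)
        if pos < 0:
            return
        (ver, flags, method, mtime, mdate, crc, csize, usize,
         nlen, elen) = struct.unpack('<HHHHHIIIHH', data[pos+4:pos+30])
        name = data[pos+30:pos+30+nlen].decode('utf-8', 'replace')
        start = pos + 30 + nlen + elen
        if flags & 0x08:  # sizes live in a data descriptor; skip for brevity
            pos = start
            continue
        payload = data[start:start+csize]
        if method == 8:  # deflate
            payload = zlib.decompress(payload, -15)
        yield name, payload
        pos = start + csize

for name, raw in iter_local_files(open('broken.epub', 'rb').read()):
    print('%s %d' % (name, len(raw)))
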

View File

@ -144,6 +144,22 @@ class EPUBOutput(OutputFormatPlugin):
for u in XPath('//h:u')(root):
u.tag = 'span'
u.set('style', 'text-decoration:underline')
seen_ids, seen_names = set(), set()
for x in XPath('//*[@id or @name]')(root):
eid, name = x.get('id', None), x.get('name', None)
if eid:
if eid in seen_ids:
del x.attrib['id']
else:
seen_ids.add(eid)
if name:
if name in seen_names:
del x.attrib['name']
else:
seen_names.add(name)
# }}}
def convert(self, oeb, output_path, input_plugin, opts, log):
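
This new pass guards against books in which several elements share the same anchor id or name, which is invalid markup that can trip up strict EPUB renderers; the first occurrence keeps the attribute and later duplicates lose it. The same first-wins logic on a toy lxml tree (sample markup is made up):

from lxml import etree

root = etree.fromstring(
    '<div><a id="ch1"/><a id="ch1"/><a name="fn1"/><a name="fn1"/></div>')
seen_ids, seen_names = set(), set()
for x in root.xpath('//*[@id or @name]'):
    eid, name = x.get('id'), x.get('name')
    if eid:
        if eid in seen_ids:
            del x.attrib['id']  # keep only the first occurrence
        else:
            seen_ids.add(eid)
    if name:
        if name in seen_names:
            del x.attrib['name']
        else:
            seen_names.add(name)
print(etree.tostring(root))
# <div><a id="ch1"/><a/><a name="fn1"/><a/></div>
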

View File

@ -204,6 +204,15 @@ OptionRecommendation(name='embed_font_family',
'with some output formats, principally EPUB and AZW3.')
),
OptionRecommendation(name='subset_embedded_fonts',
recommended_value=False, level=OptionRecommendation.LOW,
help=_(
'Subset all embedded fonts. Every embedded font is reduced '
'to contain only the glyphs used in this document. This decreases '
'the size of the font files. Useful if you are embedding a '
'particularly large font with lots of unused glyphs.')
),
OptionRecommendation(name='linearize_tables',
recommended_value=False, level=OptionRecommendation.LOW,
help=_('Some badly designed documents use tables to control the '
@ -1112,6 +1121,10 @@ OptionRecommendation(name='search_replace',
RemoveFakeMargins()(self.oeb, self.log, self.opts)
RemoveAdobeMargins()(self.oeb, self.log, self.opts)
if self.opts.subset_embedded_fonts:
from calibre.ebooks.oeb.transforms.subset import SubsetFonts
SubsetFonts()(self.oeb, self.log, self.opts)
pr(0.9)
self.flush()
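
Registering subset_embedded_fonts with the plumber exposes it everywhere recommendations flow: the Look & Feel conversion page (wired up later in this commit) and, assuming calibre's usual underscore-to-hyphen option mapping, an ebook-convert --subset-embedded-fonts flag. A sketch of setting it programmatically (file paths are hypothetical):

from calibre.ebooks.conversion.plumber import Plumber, OptionRecommendation
from calibre.utils.logging import default_log

plumber = Plumber('book.epub', 'book.subset.epub', default_log)
# merge_ui_recommendations is the same channel the GUI uses to feed
# its settings into the pipeline.
plumber.merge_ui_recommendations([
    ('subset_embedded_fonts', True, OptionRecommendation.HIGH)])
plumber.run()
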

View File

@ -10,6 +10,7 @@ from cStringIO import StringIO
from contextlib import closing
from calibre.utils.zipfile import ZipFile, BadZipfile, safe_replace
from calibre.utils.localunzip import LocalZipFile
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
from calibre.ebooks.metadata import MetaInformation
from calibre.ebooks.metadata.opf2 import OPF
@ -105,10 +106,13 @@ class OCFReader(OCF):
class OCFZipReader(OCFReader):
def __init__(self, stream, mode='r', root=None):
try:
self.archive = ZipFile(stream, mode=mode)
except BadZipfile:
raise EPubException("not a ZIP .epub OCF container")
if isinstance(stream, (LocalZipFile, ZipFile)):
self.archive = stream
else:
try:
self.archive = ZipFile(stream, mode=mode)
except BadZipfile:
raise EPubException("not a ZIP .epub OCF container")
self.root = root
if self.root is None:
name = getattr(stream, 'name', False)
@ -119,8 +123,18 @@ class OCFZipReader(OCFReader):
super(OCFZipReader, self).__init__()
def open(self, name, mode='r'):
if isinstance(self.archive, LocalZipFile):
return self.archive.open(name)
return StringIO(self.archive.read(name))
def get_zip_reader(stream, root=None):
try:
zf = ZipFile(stream, mode='r')
except:
stream.seek(0)
zf = LocalZipFile(stream)
return OCFZipReader(zf, root=root)
class OCFDirReader(OCFReader):
def __init__(self, path):
self.root = path
@ -184,7 +198,12 @@ def render_cover(opf, opf_path, zf, reader=None):
def get_cover(opf, opf_path, stream, reader=None):
raster_cover = opf.raster_cover
stream.seek(0)
zf = ZipFile(stream)
try:
zf = ZipFile(stream)
except:
stream.seek(0)
zf = LocalZipFile(stream)
if raster_cover:
base = posixpath.dirname(opf_path)
cpath = posixpath.normpath(posixpath.join(base, raster_cover))
@ -207,7 +226,7 @@ def get_cover(opf, opf_path, stream, reader=None):
def get_metadata(stream, extract_cover=True):
""" Return metadata as a :class:`Metadata` object """
stream.seek(0)
reader = OCFZipReader(stream)
reader = get_zip_reader(stream)
mi = reader.opf.to_book_metadata()
if extract_cover:
try:
@ -232,7 +251,7 @@ def _write_new_cover(new_cdata, cpath):
def set_metadata(stream, mi, apply_null=False, update_timestamp=False):
stream.seek(0)
reader = OCFZipReader(stream, root=os.getcwdu())
reader = get_zip_reader(stream, root=os.getcwdu())
raster_cover = reader.opf.raster_cover
mi = MetaInformation(mi)
new_cdata = None
@ -283,7 +302,11 @@ def set_metadata(stream, mi, apply_null=False, update_timestamp=False):
reader.opf.timestamp = mi.timestamp
newopf = StringIO(reader.opf.render())
safe_replace(stream, reader.container[OPF.MIMETYPE], newopf,
if isinstance(reader.archive, LocalZipFile):
reader.archive.safe_replace(reader.container[OPF.MIMETYPE], newopf,
extra_replacements=replacements)
else:
safe_replace(stream, reader.container[OPF.MIMETYPE], newopf,
extra_replacements=replacements)
try:
if cpath is not None:
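
get_zip_reader centralizes the fallback so that every entry point (get_metadata, set_metadata, get_cover) copes with damaged archives the same way: strict ZipFile first, LocalZipFile on any failure. Reading metadata stays a one-liner; a sketch, with a hypothetical path:

from calibre.ebooks.metadata.epub import get_zip_reader

def read_epub_title(path):
    with open(path, 'rb') as stream:
        reader = get_zip_reader(stream)  # ZipFile or LocalZipFile underneath
        return reader.opf.to_book_metadata().title

print(read_epub_title('damaged.epub'))
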

View File

@ -71,7 +71,7 @@ class PagedDisplay
this.margin_side = margin_side
this.margin_bottom = margin_bottom
layout: () ->
layout: (is_single_page=false) ->
# start_time = new Date().getTime()
body_style = window.getComputedStyle(document.body)
bs = document.body.style
@ -151,6 +151,8 @@ class PagedDisplay
has_svg = document.getElementsByTagName('svg').length > 0
only_img = document.getElementsByTagName('img').length == 1 and document.getElementsByTagName('div').length < 3 and document.getElementsByTagName('p').length < 2
this.is_full_screen_layout = (only_img or has_svg) and single_screen and document.body.scrollWidth > document.body.clientWidth
if is_single_page
this.is_full_screen_layout = true
this.in_paged_mode = true
this.current_margin_side = sm

View File

@ -126,6 +126,7 @@ class EbookIterator(BookmarksMixin):
self.spine = []
Spiny = partial(SpineItem, read_anchor_map=read_anchor_map,
run_char_count=run_char_count)
is_comic = plumber.input_fmt.lower() in {'cbc', 'cbz', 'cbr', 'cb7'}
for i in ordered:
spath = i.path
mt = None
@ -135,6 +136,8 @@ class EbookIterator(BookmarksMixin):
mt = guess_type(spath)[0]
try:
self.spine.append(Spiny(spath, mime_type=mt))
if is_comic:
self.spine[-1].is_single_page = True
except:
self.log.warn('Missing spine item:', repr(spath))

View File

@ -53,6 +53,7 @@ class SpineItem(unicode):
if mime_type is None:
mime_type = guess_type(obj)[0]
obj.mime_type = mime_type
obj.is_single_page = None
return obj
class IndexEntry(object):

View File

@ -0,0 +1,284 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:fdm=marker:ai
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
from collections import defaultdict
from calibre.ebooks.oeb.base import urlnormalize
from calibre.utils.fonts.sfnt.subset import subset, NoGlyphs, UnsupportedFont
class SubsetFonts(object):
'''
Subset all embedded fonts. Must be run after CSS flattening, as it requires
CSS normalization and flattening to work.
'''
def __call__(self, oeb, log, opts):
self.oeb, self.log, self.opts = oeb, log, opts
self.find_embedded_fonts()
if not self.embedded_fonts:
self.log.debug('No embedded fonts found')
return
self.find_style_rules()
self.find_font_usage()
totals = [0, 0]
def remove(font):
totals[1] += len(font['item'].data)
self.oeb.manifest.remove(font['item'])
font['rule'].parentStyleSheet.deleteRule(font['rule'])
for font in self.embedded_fonts:
if not font['chars']:
self.log('The font %s is unused. Removing it.'%font['src'])
remove(font)
continue
try:
raw, old_stats, new_stats = subset(font['item'].data, font['chars'])
except NoGlyphs:
self.log('The font %s has no used glyphs. Removing it.'%font['src'])
remove(font)
continue
except UnsupportedFont as e:
self.log.warn('The font %s is unsupported for subsetting. %s'%(
font['src'], e))
sz = len(font['item'].data)
totals[0] += sz
totals[1] += sz
else:
font['item'].data = raw
nlen = sum(new_stats.itervalues())
olen = sum(old_stats.itervalues())
self.log('Decreased the font %s to %.1f%% of its original size'%
(font['src'], nlen/olen * 100))
totals[0] += nlen
totals[1] += olen
font['item'].unload_data_from_memory()
if totals[0]:
self.log('Reduced total font size to %.1f%% of original'%
(totals[0]/totals[1] * 100))
def get_font_properties(self, rule, default=None):
'''
Given a CSS rule, extract normalized font properties from
it. Note that the shorthand font property should already have been expanded
by the CSS flattening code.
'''
props = {}
s = rule.style
for q in ('font-family', 'src', 'font-weight', 'font-stretch',
'font-style'):
g = 'uri' if q == 'src' else 'value'
try:
val = s.getProperty(q).propertyValue[0]
val = getattr(val, g)
if q == 'font-family':
val = [x.value for x in s.getProperty(q).propertyValue]
if val and val[0] == 'inherit':
val = None
except (IndexError, KeyError, AttributeError, TypeError, ValueError):
val = None if q in {'src', 'font-family'} else default
if q in {'font-weight', 'font-stretch', 'font-style'}:
val = unicode(val).lower() if (val or val == 0) else val
if val == 'inherit':
val = default
if q == 'font-weight':
val = {'normal':'400', 'bold':'700'}.get(val, val)
if val not in {'100', '200', '300', '400', '500', '600', '700',
'800', '900', 'bolder', 'lighter'}:
val = default
if val == 'normal': val = '400'
elif q == 'font-style':
if val not in {'normal', 'italic', 'oblique'}:
val = default
elif q == 'font-stretch':
if val not in { 'normal', 'ultra-condensed', 'extra-condensed',
'condensed', 'semi-condensed', 'semi-expanded',
'expanded', 'extra-expanded', 'ultra-expanded'}:
val = default
props[q] = val
return props
def find_embedded_fonts(self):
'''
Find all @font-face rules and extract the relevant info from them.
'''
self.embedded_fonts = []
for item in self.oeb.manifest:
if not hasattr(item.data, 'cssRules'): continue
for i, rule in enumerate(item.data.cssRules):
if rule.type != rule.FONT_FACE_RULE:
continue
props = self.get_font_properties(rule, default='normal')
if not props['font-family'] or not props['src']:
continue
path = item.abshref(props['src'])
ff = self.oeb.manifest.hrefs.get(urlnormalize(path), None)
if not ff:
continue
props['item'] = ff
if props['font-weight'] in {'bolder', 'lighter'}:
props['font-weight'] = '400'
props['weight'] = int(props['font-weight'])
props['chars'] = set()
props['rule'] = rule
self.embedded_fonts.append(props)
def find_style_rules(self):
'''
Extract all font related style information from all stylesheets into a
dict mapping classes to font properties specified by that class. All
the heavy lifting has already been done by the CSS flattening code.
'''
rules = defaultdict(dict)
for item in self.oeb.manifest:
if not hasattr(item.data, 'cssRules'): continue
for i, rule in enumerate(item.data.cssRules):
if rule.type != rule.STYLE_RULE:
continue
props = {k:v for k,v in
self.get_font_properties(rule).iteritems() if v}
if not props:
continue
for sel in rule.selectorList:
sel = sel.selectorText
if sel and sel.startswith('.'):
# We don't care about pseudo-selectors as the worst that
# can happen is some extra characters will remain in
# the font
sel = sel.partition(':')[0]
rules[sel[1:]].update(props)
self.style_rules = dict(rules)
def find_font_usage(self):
for item in self.oeb.manifest:
if not hasattr(item.data, 'xpath'): continue
for body in item.data.xpath('//*[local-name()="body"]'):
base = {'font-family':['serif'], 'font-weight': '400',
'font-style':'normal', 'font-stretch':'normal'}
self.find_usage_in(body, base)
def elem_style(self, cls, inherited_style):
'''
Find the effective style for the given element.
'''
classes = cls.split()
style = inherited_style.copy()
for cls in classes:
style.update(self.style_rules.get(cls, {}))
wt = style.get('font-weight', None)
pwt = inherited_style.get('font-weight', '400')
if wt == 'bolder':
style['font-weight'] = {
'100':'400',
'200':'400',
'300':'400',
'400':'700',
'500':'700',
}.get(pwt, '900')
elif wt == 'lighter':
style['font-weight'] = {
'600':'400', '700':'400',
'800':'700', '900':'700'}.get(pwt, '100')
return style
def used_font(self, style):
'''
Given a style, find the embedded font that matches it. Returns None if
no match is found (can happen if no family matches).
'''
ff = style.get('font-family', [])
lnames = {x.lower() for x in ff}
matching_set = []
# Filter on font-family
for ef in self.embedded_fonts:
flnames = {x.lower() for x in ef.get('font-family', [])}
if not lnames.intersection(flnames):
continue
matching_set.append(ef)
if not matching_set:
return None
# Filter on font-stretch
widths = {x:i for i, x in enumerate(( 'ultra-condensed',
'extra-condensed', 'condensed', 'semi-condensed', 'normal',
'semi-expanded', 'expanded', 'extra-expanded', 'ultra-expanded'
))}
width = widths[style.get('font-stretch', 'normal')]
for f in matching_set:
f['width'] = widths[f.get('font-stretch', 'normal')]  # the font's own stretch, not the style's
min_dist = min(abs(width-f['width']) for f in matching_set)
nearest = [f for f in matching_set if abs(width-f['width']) ==
min_dist]
if width <= 4:
lmatches = [f for f in nearest if f['width'] <= width]
else:
lmatches = [f for f in nearest if f['width'] >= width]
matching_set = (lmatches or nearest)
# Filter on font-style
fs = style.get('font-style', 'normal')
order = {
'oblique':['oblique', 'italic', 'normal'],
'normal':['normal', 'oblique', 'italic']
}.get(fs, ['italic', 'oblique', 'normal'])
for q in order:
matches = [f for f in matching_set if f.get('font-style', 'normal')
== q]
if matches:
matching_set = matches
break
# Filter on font weight
fw = int(style.get('font-weight', '400'))
if fw == 400:
q = [400, 500, 300, 200, 100, 600, 700, 800, 900]
elif fw == 500:
q = [500, 400, 300, 200, 100, 600, 700, 800, 900]
elif fw < 400:
q = [fw] + list(xrange(fw-100, -100, -100)) + list(xrange(fw+100,
1000, 100))
else:
q = [fw] + list(xrange(fw+100, 1000, 100)) + list(xrange(fw-100,
-100, -100))
for wt in q:
matches = [f for f in matching_set if f['weight'] == wt]
if matches:
return matches[0]
def find_chars(self, elem):
ans = set()
if elem.text:
ans |= set(elem.text)
for child in elem:
if child.tail:
ans |= set(child.tail)
return ans
def find_usage_in(self, elem, inherited_style):
style = self.elem_style(elem.get('class', ''), inherited_style)
for child in elem:
self.find_usage_in(child, style)
font = self.used_font(style)
if font:
chars = self.find_chars(elem)
if chars:
font['chars'] |= chars
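
With the stop and step arguments fixed as above, the weight fallback follows the CSS font-matching convention: a request below 400 prefers lighter faces before darker ones, and vice versa. A quick check of the candidate order, plus the one-line way to run the transform standalone (the oeb, log and opts objects are assumed to come from the conversion pipeline):

# Candidate weights for a request of 300: the weight itself, then all
# lighter weights descending, then all heavier weights ascending.
fw = 300
q = [fw] + list(xrange(fw-100, -100, -100)) + list(xrange(fw+100, 1000, 100))
print(q)  # [300, 200, 100, 0, 400, 500, 600, 700, 800, 900]

from calibre.ebooks.oeb.transforms.subset import SubsetFonts
SubsetFonts()(oeb, log, opts)  # oeb must already be CSS-flattened
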

View File

@ -120,17 +120,19 @@ class ShareConnMenu(QMenu): # {{{
for account in keys:
formats, auto, default = opts.accounts[account]
subject = opts.subjects.get(account, '')
alias = opts.aliases.get(account, '')
dest = 'mail:'+account+';'+formats+';'+subject
action1 = DeviceAction(dest, False, False, I('mail.png'),
account)
alias or account)
action2 = DeviceAction(dest, True, False, I('mail.png'),
account + ' ' + _('(delete from library)'))
(alias or account) + ' ' + _('(delete from library)'))
self.email_to_menu.addAction(action1)
self.email_to_and_delete_menu.addAction(action2)
map(self.memory.append, (action1, action2))
if default:
ac = DeviceAction(dest, False, False,
I('mail.png'), _('Email to') + ' ' +account)
I('mail.png'), _('Email to') + ' ' +(alias or
account))
self.addAction(ac)
self.email_actions.append(ac)
ac.a_s.connect(sync_menu.action_triggered)

View File

@ -239,10 +239,11 @@ class PluginWidget(QWidget,Ui_Form):
def initialize(self, name, db):
'''
CheckBoxControls (c_type: check_box):
['generate_titles','generate_series','generate_genres',
'generate_recently_added','generate_descriptions','include_hr']
['cross_reference_authors',
'generate_titles','generate_series','generate_genres',
'generate_recently_added','generate_descriptions',
'include_hr']
ComboBoxControls (c_type: combo_box):
['exclude_source_field','header_note_source_field',
'merge_source_field']

View File

@ -305,7 +305,7 @@ The default pattern \[.+\]|\+ excludes tags of the form [tag], e.g., [Test book]
<string>Other options</string>
</property>
<layout class="QGridLayout" name="gridLayout_3">
<item row="2" column="1">
<item row="3" column="1">
<layout class="QHBoxLayout" name="merge_with_comments_hl">
<item>
<widget class="QComboBox" name="merge_source_field">
@ -372,7 +372,7 @@ The default pattern \[.+\]|\+ excludes tags of the form [tag], e.g., [Test book]
</item>
</layout>
</item>
<item row="2" column="0">
<item row="3" column="0">
<widget class="QLabel" name="label_9">
<property name="minimumSize">
<size>
@ -397,7 +397,7 @@ The default pattern \[.+\]|\+ excludes tags of the form [tag], e.g., [Test book]
</property>
</widget>
</item>
<item row="0" column="0">
<item row="1" column="0">
<widget class="QLabel" name="label_4">
<property name="minimumSize">
<size>
@ -413,7 +413,7 @@ The default pattern \[.+\]|\+ excludes tags of the form [tag], e.g., [Test book]
</property>
</widget>
</item>
<item row="0" column="1">
<item row="1" column="1">
<layout class="QHBoxLayout" name="replace_cover_hl">
<item>
<widget class="QRadioButton" name="generate_new_cover">
@ -447,7 +447,7 @@ The default pattern \[.+\]|\+ excludes tags of the form [tag], e.g., [Test book]
</item>
</layout>
</item>
<item row="1" column="0">
<item row="2" column="0">
<widget class="QLabel" name="label_3">
<property name="text">
<string>E&amp;xtra Description note:</string>
@ -460,7 +460,7 @@ The default pattern \[.+\]|\+ excludes tags of the form [tag], e.g., [Test book]
</property>
</widget>
</item>
<item row="1" column="1">
<item row="2" column="1">
<layout class="QHBoxLayout" name="horizontalLayout">
<item>
<widget class="QComboBox" name="header_note_source_field">
@ -561,6 +561,27 @@ The default pattern \[.+\]|\+ excludes tags of the form [tag], e.g., [Test book]
</item>
</layout>
</item>
<item row="0" column="0">
<widget class="QLabel" name="label_2">
<property name="text">
<string>Author cross-references:</string>
</property>
<property name="alignment">
<set>Qt::AlignRight|Qt::AlignTrailing|Qt::AlignVCenter</set>
</property>
</widget>
</item>
<item row="0" column="1">
<layout class="QHBoxLayout" name="cross_references_hl">
<item>
<widget class="QCheckBox" name="cross_reference_authors">
<property name="text">
<string>For books with multiple authors, list each author separately</string>
</property>
</widget>
</item>
</layout>
</item>
</layout>
</widget>
</item>

View File

@ -32,7 +32,7 @@ class LookAndFeelWidget(Widget, Ui_Form):
Widget.__init__(self, parent,
['change_justification', 'extra_css', 'base_font_size',
'font_size_mapping', 'line_height', 'minimum_line_height',
'embed_font_family',
'embed_font_family', 'subset_embedded_fonts',
'smarten_punctuation', 'unsmarten_punctuation',
'disable_font_rescaling', 'insert_blank_line',
'remove_paragraph_spacing',

View File

@ -6,7 +6,7 @@
<rect>
<x>0</x>
<y>0</y>
<width>655</width>
<width>699</width>
<height>619</height>
</rect>
</property>
@ -406,7 +406,14 @@
</widget>
</item>
<item row="6" column="1" colspan="2">
<widget class="FontFamilyChooser" name="opt_embed_font_family"/>
<widget class="FontFamilyChooser" name="opt_embed_font_family" native="true"/>
</item>
<item row="6" column="3" colspan="2">
<widget class="QCheckBox" name="opt_subset_embedded_fonts">
<property name="text">
<string>&amp;Subset all embedded fonts (Experimental)</string>
</property>
</widget>
</item>
</layout>
</widget>

View File

@ -225,7 +225,7 @@ class TemplateDialog(QDialog, Ui_TemplateDialog):
self.mi.series_index = 3
self.mi.rating = 4.0
self.mi.tags = [_('Tag 1'), _('Tag 2')]
self.mi.language = ['eng']
self.mi.languages = ['eng']
# Remove help icon on title bar
icon = self.windowIcon()

View File

@ -7,11 +7,16 @@ __license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid at kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import os, shutil
from PyQt4.Qt import (QFontInfo, QFontMetrics, Qt, QFont, QFontDatabase, QPen,
QStyledItemDelegate, QSize, QStyle, QStringListModel, pyqtSignal,
QDialog, QVBoxLayout, QApplication, QFontComboBox, QPushButton,
QToolButton, QGridLayout, QListView, QWidget, QDialogButtonBox, QIcon,
QHBoxLayout, QLabel, QModelIndex)
QHBoxLayout, QLabel, QModelIndex, QLineEdit)
from calibre.constants import config_dir
from calibre.gui2 import choose_files, error_dialog, info_dialog
def writing_system_for_font(font):
has_latin = True
@ -167,19 +172,12 @@ class FontFamilyDialog(QDialog):
self.setWindowIcon(QIcon(I('font.png')))
from calibre.utils.fonts.scanner import font_scanner
self.font_scanner = font_scanner
try:
self.families = list(font_scanner.find_font_families())
except:
self.families = []
print ('WARNING: Could not load fonts')
import traceback
traceback.print_exc()
self.families.insert(0, _('None'))
self.m = QStringListModel(self)
self.build_font_list()
self.l = l = QGridLayout()
self.setLayout(l)
self.view = FontsView(self)
self.m = QStringListModel(self.families)
self.view.setModel(self.m)
self.view.setCurrentIndex(self.m.index(0))
if current_family:
@ -194,17 +192,109 @@ class FontFamilyDialog(QDialog):
self.bb = QDialogButtonBox(QDialogButtonBox.Ok|QDialogButtonBox.Cancel)
self.bb.accepted.connect(self.accept)
self.bb.rejected.connect(self.reject)
self.add_fonts_button = afb = self.bb.addButton(_('Add &fonts'),
self.bb.ActionRole)
afb.setIcon(QIcon(I('plus.png')))
afb.clicked.connect(self.add_fonts)
self.ml = QLabel(_('Choose a font family from the list below:'))
self.search = QLineEdit(self)
self.search.setPlaceholderText(_('Search'))
self.search.returnPressed.connect(self.find)
self.nb = QToolButton(self)
self.nb.setIcon(QIcon(I('arrow-down.png')))
self.nb.setToolTip(_('Find Next'))
self.pb = QToolButton(self)
self.pb.setIcon(QIcon(I('arrow-up.png')))
self.pb.setToolTip(_('Find Previous'))
self.nb.clicked.connect(self.find_next)
self.pb.clicked.connect(self.find_previous)
l.addWidget(self.ml, 0, 0, 1, 2)
l.addWidget(self.view, 1, 0, 1, 1)
l.addWidget(self.faces, 1, 1, 1, 1)
l.addWidget(self.bb, 2, 0, 1, 2)
l.addWidget(self.ml, 0, 0, 1, 4)
l.addWidget(self.search, 1, 0, 1, 1)
l.addWidget(self.nb, 1, 1, 1, 1)
l.addWidget(self.pb, 1, 2, 1, 1)
l.addWidget(self.view, 2, 0, 1, 3)
l.addWidget(self.faces, 1, 3, 2, 1)
l.addWidget(self.bb, 3, 0, 1, 4)
l.setAlignment(self.faces, Qt.AlignTop)
self.resize(800, 600)
def set_current(self, i):
self.view.setCurrentIndex(self.m.index(i))
def keyPressEvent(self, e):
if e.key() == Qt.Key_Return:
return
return QDialog.keyPressEvent(self, e)
def find(self, backwards=False):
i = self.view.currentIndex().row()
if i < 0: i = 0
q = icu_lower(unicode(self.search.text())).strip()
if not q: return
r = (xrange(i-1, -1, -1) if backwards else xrange(i+1,
len(self.families)))
for j in r:
f = self.families[j]
if q in icu_lower(f):
self.set_current(j)
return
def find_next(self):
self.find()
def find_previous(self):
self.find(backwards=True)
def build_font_list(self):
try:
self.families = list(self.font_scanner.find_font_families())
except:
self.families = []
print ('WARNING: Could not load fonts')
import traceback
traceback.print_exc()
self.families.insert(0, _('None'))
self.m.setStringList(self.families)
def add_fonts(self):
from calibre.utils.fonts.metadata import FontMetadata
files = choose_files(self, 'add fonts to calibre',
_('Select font files'), filters=[(_('TrueType/OpenType Fonts'),
['ttf', 'otf'])], all_files=False)
if not files: return
families = set()
for f in files:
try:
with open(f, 'rb') as stream:
fm = FontMetadata(stream)
except:
import traceback
error_dialog(self, _('Corrupt font'),
_('Failed to read metadata from the font file: %s')%
f, det_msg=traceback.format_exc(), show=True)
return
families.add(fm.font_family)
families = sorted(families)
dest = os.path.join(config_dir, 'fonts')
for f in files:
shutil.copyfile(f, os.path.join(dest, os.path.basename(f)))
self.font_scanner.do_scan()
self.build_font_list()
self.m.reset()
self.view.setCurrentIndex(self.m.index(0))
if families:
for i, val in enumerate(self.families):
if icu_lower(val) == icu_lower(families[0]):
self.view.setCurrentIndex(self.m.index(i))
break
info_dialog(self, _('Added fonts'),
_('Added font families: %s')%(
', '.join(families)), show=True)
@property
def font_family(self):
idx = self.view.currentIndex().row()

View File

@ -7,8 +7,8 @@ __docformat__ = 'restructuredtext en'
import functools
from PyQt4.Qt import Qt, QStackedWidget, QMenu, \
QSize, QSizePolicy, QStatusBar, QLabel, QFont
from PyQt4.Qt import (Qt, QStackedWidget, QMenu, QTimer,
QSize, QSizePolicy, QStatusBar, QLabel, QFont)
from calibre.utils.config import prefs
from calibre.constants import (isosx, __appname__, preferred_encoding,
@ -274,7 +274,7 @@ class LayoutMixin(object): # {{{
m = self.library_view.model()
if m.rowCount(None) > 0:
self.library_view.set_current_row(0)
QTimer.singleShot(0, self.library_view.set_current_row)
m.current_changed(self.library_view.currentIndex(),
self.library_view.currentIndex())
self.library_view.setFocus(Qt.OtherFocusReason)

View File

@ -777,7 +777,7 @@ class BooksView(QTableView): # {{{
self.scrollTo(self.model().index(row, i), self.PositionAtCenter)
break
def set_current_row(self, row, select=True):
def set_current_row(self, row=0, select=True):
if row > -1 and row < self.model().rowCount(QModelIndex()):
h = self.horizontalHeader()
logical_indices = list(range(h.count()))

View File

@ -188,6 +188,10 @@ class MetadataSingleDialogBase(ResizableDialog):
self.tags_editor_button.setToolTip(_('Open Tag Editor'))
self.tags_editor_button.setIcon(QIcon(I('chapters.png')))
self.tags_editor_button.clicked.connect(self.tags_editor)
self.clear_tags_button = QToolButton(self)
self.clear_tags_button.setToolTip(_('Clear all tags'))
self.clear_tags_button.setIcon(QIcon(I('trash.png')))
self.clear_tags_button.clicked.connect(self.tags.clear)
self.basic_metadata_widgets.append(self.tags)
self.identifiers = IdentifiersEdit(self)
@ -656,9 +660,10 @@ class MetadataSingleDialog(MetadataSingleDialogBase): # {{{
l.addItem(self.tabs[0].spc_one, 1, 0, 1, 3)
sto(self.cover.buttons[-1], self.rating)
create_row2(1, self.rating)
sto(self.rating, self.tags)
create_row2(2, self.tags, self.tags_editor_button)
sto(self.tags_editor_button, self.paste_isbn_button)
sto(self.rating, self.tags_editor_button)
sto(self.tags_editor_button, self.tags)
create_row2(2, self.tags, self.clear_tags_button, front_button=self.tags_editor_button)
sto(self.clear_tags_button, self.paste_isbn_button)
sto(self.paste_isbn_button, self.identifiers)
create_row2(3, self.identifiers, self.clear_identifiers_button,
front_button=self.paste_isbn_button)
@ -761,6 +766,7 @@ class MetadataSingleDialogAlt1(MetadataSingleDialogBase): # {{{
tl.addWidget(self.swap_title_author_button, 0, 0, 2, 1)
tl.addWidget(self.manage_authors_button, 2, 0, 1, 1)
tl.addWidget(self.paste_isbn_button, 12, 0, 1, 1)
tl.addWidget(self.tags_editor_button, 6, 0, 1, 1)
create_row(0, self.title, self.title_sort,
button=self.deduce_title_sort_button, span=2,
@ -773,7 +779,7 @@ class MetadataSingleDialogAlt1(MetadataSingleDialogBase): # {{{
create_row(4, self.series, self.series_index,
button=self.clear_series_button, icon='trash.png')
create_row(5, self.series_index, self.tags)
create_row(6, self.tags, self.rating, button=self.tags_editor_button)
create_row(6, self.tags, self.rating, button=self.clear_tags_button)
create_row(7, self.rating, self.pubdate)
create_row(8, self.pubdate, self.publisher,
button=self.pubdate.clear_button, icon='trash.png')
@ -785,7 +791,8 @@ class MetadataSingleDialogAlt1(MetadataSingleDialogBase): # {{{
button=self.clear_identifiers_button, icon='trash.png')
sto(self.clear_identifiers_button, self.swap_title_author_button)
sto(self.swap_title_author_button, self.manage_authors_button)
sto(self.manage_authors_button, self.paste_isbn_button)
sto(self.manage_authors_button, self.tags_editor_button)
sto(self.tags_editor_button, self.paste_isbn_button)
tl.addItem(QSpacerItem(1, 1, QSizePolicy.Fixed, QSizePolicy.Expanding),
13, 1, 1 ,1)
@ -896,6 +903,7 @@ class MetadataSingleDialogAlt2(MetadataSingleDialogBase): # {{{
tl.addWidget(self.swap_title_author_button, 0, 0, 2, 1)
tl.addWidget(self.manage_authors_button, 2, 0, 2, 1)
tl.addWidget(self.paste_isbn_button, 12, 0, 1, 1)
tl.addWidget(self.tags_editor_button, 6, 0, 1, 1)
create_row(0, self.title, self.title_sort,
button=self.deduce_title_sort_button, span=2,
@ -908,7 +916,7 @@ class MetadataSingleDialogAlt2(MetadataSingleDialogBase): # {{{
create_row(4, self.series, self.series_index,
button=self.clear_series_button, icon='trash.png')
create_row(5, self.series_index, self.tags)
create_row(6, self.tags, self.rating, button=self.tags_editor_button)
create_row(6, self.tags, self.rating, button=self.clear_tags_button)
create_row(7, self.rating, self.pubdate)
create_row(8, self.pubdate, self.publisher,
button=self.pubdate.clear_button, icon='trash.png')
@ -920,7 +928,8 @@ class MetadataSingleDialogAlt2(MetadataSingleDialogBase): # {{{
button=self.clear_identifiers_button, icon='trash.png')
sto(self.clear_identifiers_button, self.swap_title_author_button)
sto(self.swap_title_author_button, self.manage_authors_button)
sto(self.manage_authors_button, self.paste_isbn_button)
sto(self.manage_authors_button, self.tags_editor_button)
sto(self.tags_editor_button, self.paste_isbn_button)
tl.addItem(QSpacerItem(1, 1, QSizePolicy.Fixed, QSizePolicy.Expanding),
13, 1, 1 ,1)

View File

@ -19,12 +19,14 @@ from calibre.utils.smtp import config as smtp_prefs
class EmailAccounts(QAbstractTableModel): # {{{
def __init__(self, accounts, subjects):
def __init__(self, accounts, subjects, aliases={}):
QAbstractTableModel.__init__(self)
self.accounts = accounts
self.subjects = subjects
self.aliases = aliases
self.account_order = sorted(self.accounts.keys())
self.headers = map(QVariant, [_('Email'), _('Formats'), _('Subject'), _('Auto send')])
self.headers = map(QVariant, [_('Email'), _('Formats'), _('Subject'),
_('Auto send'), _('Alias')])
self.default_font = QFont()
self.default_font.setBold(True)
self.default_font = QVariant(self.default_font)
@ -36,7 +38,9 @@ class EmailAccounts(QAbstractTableModel): # {{{
'{author_sort} can be used here.'),
'<p>'+_('If checked, downloaded news will be automatically '
'mailed <br>to this email address '
'(provided it is in one of the listed formats).')])))
'(provided it is in one of the listed formats).'),
_('Friendly name to use for this email address')
])))
def rowCount(self, *args):
return len(self.account_order)
@ -67,6 +71,8 @@ class EmailAccounts(QAbstractTableModel): # {{{
return QVariant(self.accounts[account][0])
if col == 2:
return QVariant(self.subjects.get(account, ''))
if col == 4:
return QVariant(self.aliases.get(account, ''))
if role == Qt.FontRole and self.accounts[account][2]:
return self.default_font
if role == Qt.CheckStateRole and col == 3:
@ -88,6 +94,11 @@ class EmailAccounts(QAbstractTableModel): # {{{
self.accounts[account][1] ^= True
elif col == 2:
self.subjects[account] = unicode(value.toString())
elif col == 4:
self.aliases.pop(account, None)
aval = unicode(value.toString()).strip()
if aval:
self.aliases[account] = aval
elif col == 1:
self.accounts[account][0] = unicode(value.toString()).upper()
elif col == 0:
@ -156,7 +167,8 @@ class ConfigWidget(ConfigWidgetBase, Ui_Form):
self.send_email_widget.initialize(self.preferred_to_address)
self.send_email_widget.changed_signal.connect(self.changed_signal.emit)
opts = self.send_email_widget.smtp_opts
self._email_accounts = EmailAccounts(opts.accounts, opts.subjects)
self._email_accounts = EmailAccounts(opts.accounts, opts.subjects,
opts.aliases)
self._email_accounts.dataChanged.connect(lambda x,y:
self.changed_signal.emit())
self.email_view.setModel(self._email_accounts)
@ -184,6 +196,7 @@ class ConfigWidget(ConfigWidgetBase, Ui_Form):
raise AbortCommit('abort')
self.proxy['accounts'] = self._email_accounts.accounts
self.proxy['subjects'] = self._email_accounts.subjects
self.proxy['aliases'] = self._email_accounts.aliases
return ConfigWidgetBase.commit(self)
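
Aliases are stored next to accounts and subjects in the smtp config, keyed by email address, and the Share/Connect menu change earlier in this commit falls back to the bare address when no alias is set. A sketch of the round trip, assuming 'aliases' is registered as an option in the smtp config as this commit implies:

from calibre.utils.config import ConfigProxy
from calibre.utils.smtp import config as smtp_prefs

proxy = ConfigProxy(smtp_prefs())
aliases = proxy['aliases'] or {}
aliases['kindle@example.com'] = "John's Kindle"  # hypothetical account
proxy['aliases'] = aliases

# Menu label logic: the alias if present, otherwise the address itself.
account = 'kindle@example.com'
label = aliases.get(account, '') or account
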

View File

@ -18,6 +18,7 @@ from calibre.customize.ui import (initialized_plugins, is_disabled, enable_plugi
remove_plugin, NameConflict)
from calibre.gui2 import (NONE, error_dialog, info_dialog, choose_files,
question_dialog, gprefs)
from calibre.gui2.dialogs.confirm_delete import confirm
from calibre.utils.search_query_parser import SearchQueryParser
from calibre.utils.icu import lower
from calibre.constants import iswindows
@ -363,6 +364,12 @@ class ConfigWidget(ConfigWidgetBase, Ui_Form):
if plugin.do_user_config(self.gui):
self._plugin_model.refresh_plugin(plugin)
elif op == 'remove':
if not confirm('<p>' +
_('Are you sure you want to remove the plugin: %s?')%
'<b>{0}</b>'.format(plugin.name),
'confirm_plugin_removal_msg', parent=self):
return
msg = _('Plugin <b>{0}</b> successfully removed').format(plugin.name)
if remove_plugin(plugin):
self._plugin_model.populate()
@ -403,7 +410,12 @@ class ConfigWidget(ConfigWidgetBase, Ui_Form):
return
all_locations = OrderedDict(ConfigWidget.LOCATIONS)
plugin_action = plugin.load_actual_plugin(self.gui)
try:
plugin_action = plugin.load_actual_plugin(self.gui)
except:
# Broken plugin, fails to initialize. Given that, it's probably
# already configured, so we can just quit.
return
installed_actions = OrderedDict([
(key, list(gprefs.get('action-layout-'+key, [])))
for key in all_locations])

View File

@ -6,102 +6,19 @@ __license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
from contextlib import closing
from calibre.gui2.store.stores.amazon_uk_plugin import AmazonUKKindleStore
from lxml import html
from PyQt4.Qt import QUrl
from calibre import browser
from calibre.gui2 import open_url
from calibre.gui2.store import StorePlugin
from calibre.gui2.store.search_result import SearchResult
class AmazonDEKindleStore(StorePlugin):
class AmazonDEKindleStore(AmazonUKKindleStore):
'''
For comments on the implementation, please see amazon_plugin.py
'''
def open(self, parent=None, detail_item=None, external=False):
aff_id = {'tag': 'charhale0a-21'}
store_link = ('http://www.amazon.de/gp/redirect.html?ie=UTF8&site-redirect=de'
'&tag=%(tag)s&linkCode=ur2&camp=1638&creative=19454'
'&location=http://www.amazon.de/ebooks-kindle/b?node=530886031') % aff_id
if detail_item:
aff_id['asin'] = detail_item
store_link = ('http://www.amazon.de/gp/redirect.html?ie=UTF8'
aff_id = {'tag': 'charhale0a-21'}
store_link = ('http://www.amazon.de/gp/redirect.html?ie=UTF8&site-redirect=de'
'&tag=%(tag)s&linkCode=ur2&camp=1638&creative=19454'
'&location=http://www.amazon.de/ebooks-kindle/b?node=530886031')
store_link_details = ('http://www.amazon.de/gp/redirect.html?ie=UTF8'
'&location=http://www.amazon.de/dp/%(asin)s&site-redirect=de'
'&tag=%(tag)s&linkCode=ur2&camp=1638&creative=6742') % aff_id
open_url(QUrl(store_link))
'&tag=%(tag)s&linkCode=ur2&camp=1638&creative=6742')
search_url = 'http://www.amazon.de/s/?url=search-alias%3Ddigital-text&field-keywords='
def search(self, query, max_results=10, timeout=60):
search_url = 'http://www.amazon.de/s/?url=search-alias%3Ddigital-text&field-keywords='
url = search_url + query.encode('ascii', 'backslashreplace').replace('%', '%25').replace('\\x', '%').replace(' ', '+')
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
cover_xpath = './/img[@class="productImage"]/@src'
for data in doc.xpath(data_xpath):
if counter <= 0:
break
# Even though we are searching digital-text only Amazon will still
# put in results for non Kindle books (author pages). So we need
# to explicitly check if the item is a Kindle book and ignore it
# if it isn't.
format = ''.join(data.xpath(format_xpath))
if 'kindle' not in format.lower():
continue
# We must have an asin otherwise we can't easily reference the
# book later.
asin = ''.join(data.xpath("@name"))
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span[contains(@class, "price")]/text()'))
author = ''.join(data.xpath('.//h3[@class="title"]/span[@class="ptBrand"]/text()'))
if author.startswith('von '):
author = author[4:]
counter -= 1
s = SearchResult()
s.cover_url = cover_url.strip()
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = asin.strip()
s.formats = 'Kindle'
yield s
def get_details(self, search_result, timeout):
drm_search_text = u'Gleichzeitige Verwendung von Geräten'
drm_free_text = u'Keine Einschränkung'
url = 'http://amazon.de/dp/'
br = browser()
with closing(br.open(url + search_result.detail_item, timeout=timeout)) as nf:
idata = html.fromstring(nf.read())
if idata.xpath('boolean(//div[@class="content"]//li/b[contains(text(), "' +
drm_search_text + '")])'):
if idata.xpath('boolean(//div[@class="content"]//li[contains(., "' +
drm_free_text + '") and contains(b, "' +
drm_search_text + '")])'):
search_result.drm = SearchResult.DRM_UNLOCKED
else:
search_result.drm = SearchResult.DRM_UNKNOWN
else:
search_result.drm = SearchResult.DRM_LOCKED
return True
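
After this refactor a localized Kindle store is just a handful of class attributes on top of AmazonUKKindleStore; open(), search() and the result parsing all live in the base class. What a hypothetical new locale would look like (URLs and affiliate tag invented for illustration):

from calibre.gui2.store.stores.amazon_uk_plugin import AmazonUKKindleStore

class AmazonXXKindleStore(AmazonUKKindleStore):
    aff_id = {'tag': 'someaffiliate-21'}
    store_link = ('http://www.amazon.xx/gp/redirect.html?ie=UTF8'
                  '&tag=%(tag)s&location=http://www.amazon.xx/kindle-ebooks')
    store_link_details = ('http://www.amazon.xx/gp/redirect.html?ie=UTF8'
                          '&location=http://www.amazon.xx/dp/%(asin)s&tag=%(tag)s')
    search_url = ('http://www.amazon.xx/s/?url=search-alias%3D'
                  'digital-text&field-keywords=')
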

View File

@ -6,78 +6,17 @@ __license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
from contextlib import closing
from calibre.gui2.store.stores.amazon_uk_plugin import AmazonUKKindleStore
from lxml import html
from PyQt4.Qt import QUrl
from calibre import browser
from calibre.gui2 import open_url
from calibre.gui2.store import StorePlugin
from calibre.gui2.store.search_result import SearchResult
class AmazonESKindleStore(StorePlugin):
class AmazonESKindleStore(AmazonUKKindleStore):
'''
For comments on the implementation, please see amazon_plugin.py
'''
def open(self, parent=None, detail_item=None, external=False):
aff_id = {'tag': 'charhale09-21'}
store_link = 'http://www.amazon.es/ebooks-kindle/b?_encoding=UTF8&node=827231031&tag=%(tag)s&ie=UTF8&linkCode=ur2&camp=3626&creative=24790' % aff_id
if detail_item:
aff_id['asin'] = detail_item
store_link = 'http://www.amazon.es/gp/redirect.html?ie=UTF8&location=http://www.amazon.es/dp/%(asin)s&tag=%(tag)s&linkCode=ur2&camp=3626&creative=24790' % aff_id
open_url(QUrl(store_link))
def search(self, query, max_results=10, timeout=60):
search_url = 'http://www.amazon.es/s/?url=search-alias%3Ddigital-text&field-keywords='
url = search_url + query.encode('ascii', 'backslashreplace').replace('%', '%25').replace('\\x', '%').replace(' ', '+')
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
cover_xpath = './/img[@class="productImage"]/@src'
for data in doc.xpath(data_xpath):
if counter <= 0:
break
# Even though we are searching digital-text only Amazon will still
# put in results for non Kindle books (author pages). So we need
# to explicitly check if the item is a Kindle book and ignore it
# if it isn't.
format = ''.join(data.xpath(format_xpath))
if 'kindle' not in format.lower():
continue
# We must have an asin otherwise we can't easily reference the
# book later.
asin = ''.join(data.xpath("@name"))
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span[contains(@class, "price")]/text()'))
author = unicode(''.join(data.xpath('.//h3[@class="title"]/span[@class="ptBrand"]/text()')))
if author.startswith('de '):
author = author[3:]
counter -= 1
s = SearchResult()
s.cover_url = cover_url.strip()
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = asin.strip()
s.formats = 'Kindle'
s.drm = SearchResult.DRM_UNKNOWN
yield s
aff_id = {'tag': 'charhale09-21'}
store_link = ('http://www.amazon.es/ebooks-kindle/b?_encoding=UTF8&'
'node=827231031&tag=%(tag)s&ie=UTF8&linkCode=ur2&camp=3626&creative=24790')
store_link_details = ('http://www.amazon.es/gp/redirect.html?ie=UTF8&'
'location=http://www.amazon.es/dp/%(asin)s&tag=%(tag)s'
'&linkCode=ur2&camp=3626&creative=24790')
search_url = 'http://www.amazon.es/s/?url=search-alias%3Ddigital-text&field-keywords='

View File

@ -6,79 +6,16 @@ __license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
from contextlib import closing
from lxml import html
from calibre.gui2.store.stores.amazon_uk_plugin import AmazonUKKindleStore
from PyQt4.Qt import QUrl
from calibre import browser
from calibre.gui2 import open_url
from calibre.gui2.store import StorePlugin
from calibre.gui2.store.search_result import SearchResult
class AmazonFRKindleStore(StorePlugin):
class AmazonFRKindleStore(AmazonUKKindleStore):
'''
For comments on the implementation, please see amazon_plugin.py
'''
def open(self, parent=None, detail_item=None, external=False):
aff_id = {'tag': 'charhale-21'}
store_link = 'http://www.amazon.fr/livres-kindle/b?ie=UTF8&node=695398031&ref_=sa_menu_kbo1&_encoding=UTF8&tag=%(tag)s&linkCode=ur2&camp=1642&creative=19458' % aff_id
aff_id = {'tag': 'charhale-21'}
store_link = 'http://www.amazon.fr/livres-kindle/b?ie=UTF8&node=695398031&ref_=sa_menu_kbo1&_encoding=UTF8&tag=%(tag)s&linkCode=ur2&camp=1642&creative=19458' % aff_id
store_link_details = 'http://www.amazon.fr/gp/redirect.html?ie=UTF8&location=http://www.amazon.fr/dp/%(asin)s&tag=%(tag)s&linkCode=ur2&camp=1634&creative=6738'
search_url = 'http://www.amazon.fr/s/?url=search-alias%3Ddigital-text&field-keywords='
if detail_item:
aff_id['asin'] = detail_item
store_link = 'http://www.amazon.fr/gp/redirect.html?ie=UTF8&location=http://www.amazon.fr/dp/%(asin)s&tag=%(tag)s&linkCode=ur2&camp=1634&creative=6738' % aff_id
open_url(QUrl(store_link))
def search(self, query, max_results=10, timeout=60):
search_url = 'http://www.amazon.fr/s/?url=search-alias%3Ddigital-text&field-keywords='
url = search_url + query.encode('ascii', 'backslashreplace').replace('%', '%25').replace('\\x', '%').replace(' ', '+')
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
cover_xpath = './/img[@class="productImage"]/@src'
for data in doc.xpath(data_xpath):
if counter <= 0:
break
# Even though we are searching digital-text only Amazon will still
# put in results for non Kindle books (author pages). So we need
# to explicitly check if the item is a Kindle book and ignore it
# if it isn't.
format = ''.join(data.xpath(format_xpath))
if 'kindle' not in format.lower():
continue
# We must have an asin otherwise we can't easily reference the
# book later.
asin = ''.join(data.xpath("@name"))
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span[contains(@class, "price")]/text()'))
author = unicode(''.join(data.xpath('.//h3[@class="title"]/span[@class="ptBrand"]/text()')))
if author.startswith('de '):
author = author[3:]
counter -= 1
s = SearchResult()
s.cover_url = cover_url.strip()
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = asin.strip()
s.formats = 'Kindle'
s.drm = SearchResult.DRM_UNKNOWN
yield s

View File

@ -6,78 +6,17 @@ __license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
from contextlib import closing
from calibre.gui2.store.stores.amazon_uk_plugin import AmazonUKKindleStore
from lxml import html
from PyQt4.Qt import QUrl
from calibre import browser
from calibre.gui2 import open_url
from calibre.gui2.store import StorePlugin
from calibre.gui2.store.search_result import SearchResult
class AmazonITKindleStore(StorePlugin):
class AmazonITKindleStore(AmazonUKKindleStore):
'''
For comments on the implementation, please see amazon_plugin.py
'''
def open(self, parent=None, detail_item=None, external=False):
aff_id = {'tag': 'httpcharles07-21'}
store_link = 'http://www.amazon.it/ebooks-kindle/b?_encoding=UTF8&node=827182031&tag=%(tag)s&ie=UTF8&linkCode=ur2&camp=3370&creative=23322' % aff_id
if detail_item:
aff_id['asin'] = detail_item
store_link = 'http://www.amazon.it/gp/redirect.html?ie=UTF8&location=http://www.amazon.it/dp/%(asin)s&tag=%(tag)s&linkCode=ur2&camp=3370&creative=23322' % aff_id
open_url(QUrl(store_link))
def search(self, query, max_results=10, timeout=60):
search_url = 'http://www.amazon.it/s/?url=search-alias%3Ddigital-text&field-keywords='
url = search_url + query.encode('ascii', 'backslashreplace').replace('%', '%25').replace('\\x', '%').replace(' ', '+')
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
# doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
cover_xpath = './/img[@class="productImage"]/@src'
for data in doc.xpath(data_xpath):
if counter <= 0:
break
# Even though we are searching digital-text only Amazon will still
# put in results for non Kindle books (author pages). So we need
# to explicitly check if the item is a Kindle book and ignore it
# if it isn't.
format = ''.join(data.xpath(format_xpath))
if 'kindle' not in format.lower():
continue
# We must have an asin otherwise we can't easily reference the
# book later.
asin = ''.join(data.xpath("@name"))
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span[contains(@class, "price")]/text()'))
author = unicode(''.join(data.xpath('.//h3[@class="title"]/span[@class="ptBrand"]/text()')))
if author.startswith('di '):
author = author[3:]
counter -= 1
s = SearchResult()
s.cover_url = cover_url.strip()
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = asin.strip()
s.formats = 'Kindle'
s.drm = SearchResult.DRM_UNKNOWN
yield s
aff_id = {'tag': 'httpcharles07-21'}
store_link = ('http://www.amazon.it/ebooks-kindle/b?_encoding=UTF8&'
'node=827182031&tag=%(tag)s&ie=UTF8&linkCode=ur2&camp=3370&creative=23322')
store_link_details = ('http://www.amazon.it/gp/redirect.html?ie=UTF8&'
'location=http://www.amazon.it/dp/%(asin)s&tag=%(tag)s&'
'linkCode=ur2&camp=3370&creative=23322')
search_url = 'http://www.amazon.it/s/?url=search-alias%3Ddigital-text&field-keywords='

View File

@ -127,35 +127,14 @@ class AmazonKindleStore(StorePlugin):
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read().decode('latin-1', 'replace'))
# Amazon has two results pages.
is_shot = doc.xpath('boolean(//div[@id="shotgunMainResults"])')
# Horizontal grid of books. Search "Paolo Bacigalupi"
if is_shot:
data_xpath = '//div[contains(@class, "result")]'
format_xpath = './/div[@class="productTitle"]//text()'
asin_xpath = './/div[@class="productTitle"]//a'
cover_xpath = './/div[@class="productTitle"]//img/@src'
title_xpath = './/div[@class="productTitle"]/a//text()'
price_xpath = './/div[@class="newPrice"]/span/text()'
# Vertical list of books.
else:
# New style list. Search "Paolo Bacigalupi"
if doc.xpath('boolean(//div[@class="image"])'):
data_xpath = '//div[contains(@class, "results")]//div[contains(@class, "result")]'
format_xpath = './/span[@class="binding"]//text()'
asin_xpath = './/div[@class="image"]/a[1]'
cover_xpath = './/img[@class="productImage"]/@src'
title_xpath = './/a[@class="title"]/text()'
price_xpath = './/span[contains(@class, "price")]/text()'
# Old style list. Search "martin"
else:
data_xpath = '//div[contains(@class, "result")]'
format_xpath = './/span[@class="format"]//text()'
asin_xpath = './/div[@class="productImage"]/a[1]'
cover_xpath = './/div[@class="productImage"]//img/@src'
title_xpath = './/div[@class="productTitle"]/a/text()'
price_xpath = './/div[@class="newPrice"]//span//text()'
data_xpath = '//div[contains(@class, "prod")]'
format_xpath = './/ul[contains(@class, "rsltL")]//span[contains(@class, "lrg") and not(contains(@class, "bld"))]/text()'
asin_xpath = './/div[@class="image"]/a[1]'
cover_xpath = './/img[@class="productImage"]/@src'
title_xpath = './/h3[@class="newaps"]/a//text()'
author_xpath = './/h3[@class="newaps"]//span[contains(@class, "reg")]/text()'
price_xpath = './/ul[contains(@class, "rsltL")]//span[contains(@class, "lrg") and contains(@class, "bld")]/text()'
for data in doc.xpath(data_xpath):
if counter <= 0:
@ -186,14 +165,14 @@ class AmazonKindleStore(StorePlugin):
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath(title_xpath))
author = ''.join(data.xpath(author_xpath))
try:
author = author.split('by ', 1)[1].split(" (")[0]
except:
pass
price = ''.join(data.xpath(price_xpath))
if is_shot:
author = format.split(' by ')[-1]
else:
author = ''.join(data.xpath('.//span[@class="ptBrand"]/text()'))
author = author.split('by ')[-1]
counter -= 1
s = SearchResult()

View File

@ -6,8 +6,9 @@ __license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
from contextlib import closing
import re
from contextlib import closing
from lxml import html
from PyQt4.Qt import QUrl
@ -18,57 +19,80 @@ from calibre.gui2.store import StorePlugin
from calibre.gui2.store.search_result import SearchResult
class AmazonUKKindleStore(StorePlugin):
aff_id = {'tag': 'calcharles-21'}
store_link = ('http://www.amazon.co.uk/gp/redirect.html?ie=UTF8&'
'location=http://www.amazon.co.uk/Kindle-eBooks/b?'
'ie=UTF8&node=341689031&ref_=sa_menu_kbo2&tag=%(tag)s&'
'linkCode=ur2&camp=1634&creative=19450')
store_link_details = ('http://www.amazon.co.uk/gp/redirect.html?ie=UTF8&'
'location=http://www.amazon.co.uk/dp/%(asin)s&tag=%(tag)s&'
'linkCode=ur2&camp=1634&creative=6738')
search_url = 'http://www.amazon.co.uk/s/?url=search-alias%3Ddigital-text&field-keywords='
'''
For comments on the implementation, please see amazon_plugin.py
'''
def open(self, parent=None, detail_item=None, external=False):
aff_id = {'tag': 'calcharles-21'}
store_link = 'http://www.amazon.co.uk/gp/redirect.html?ie=UTF8&location=http://www.amazon.co.uk/Kindle-eBooks/b?ie=UTF8&node=341689031&ref_=sa_menu_kbo2&tag=%(tag)s&linkCode=ur2&camp=1634&creative=19450' % aff_id
store_link = self.store_link % self.aff_id
if detail_item:
aff_id['asin'] = detail_item
store_link = 'http://www.amazon.co.uk/gp/redirect.html?ie=UTF8&location=http://www.amazon.co.uk/dp/%(asin)s&tag=%(tag)s&linkCode=ur2&camp=1634&creative=6738' % aff_id
self.aff_id['asin'] = detail_item
store_link = self.store_link_details % self.aff_id
open_url(QUrl(store_link))
def search(self, query, max_results=10, timeout=60):
search_url = 'http://www.amazon.co.uk/s/?url=search-alias%3Ddigital-text&field-keywords='
url = search_url + query.encode('ascii', 'backslashreplace').replace('%', '%25').replace('\\x', '%').replace(' ', '+')
url = self.search_url + query.encode('ascii', 'backslashreplace').replace('%', '%25').replace('\\x', '%').replace(' ', '+')
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
# Apparently amazon Europe is responding in UTF-8 now
doc = html.fromstring(f.read())
doc = html.fromstring(f.read())  # was: .decode('latin-1', 'replace')
data_xpath = '//div[contains(@class, "result") and contains(@class, "product")]'
format_xpath = './/span[@class="format"]/text()'
data_xpath = '//div[contains(@class, "prod")]'
format_xpath = './/ul[contains(@class, "rsltL")]//span[contains(@class, "lrg") and not(contains(@class, "bld"))]/text()'
asin_xpath = './/div[@class="image"]/a[1]'
cover_xpath = './/img[@class="productImage"]/@src'
title_xpath = './/h3[@class="newaps"]/a//text()'
author_xpath = './/h3[@class="newaps"]//span[contains(@class, "reg")]/text()'
price_xpath = './/ul[contains(@class, "rsltL")]//span[contains(@class, "lrg") and contains(@class, "bld")]/text()'
for data in doc.xpath(data_xpath):
if counter <= 0:
break
# Even though we are searching digital-text only Amazon will still
# put in results for non Kindle books (author pages). So we need
# to explicitly check if the item is a Kindle book and ignore it
# if it isn't.
format = ''.join(data.xpath(format_xpath))
if 'kindle' not in format.lower():
format_ = ''.join(data.xpath(format_xpath))
if 'kindle' not in format_.lower():
continue
# We must have an asin otherwise we can't easily reference the
# book later.
asin = ''.join(data.xpath("@name"))
asin_href = None
asin_a = data.xpath(asin_xpath)
if asin_a:
asin_href = asin_a[0].get('href', '')
m = re.search(r'/dp/(?P<asin>.+?)(/|$)', asin_href)
if m:
asin = m.group('asin')
else:
continue
else:
continue
cover_url = ''.join(data.xpath(cover_xpath))
title = ''.join(data.xpath('.//a[@class="title"]/text()'))
price = ''.join(data.xpath('.//div[@class="newPrice"]/span[contains(@class, "price")]/text()'))
title = ''.join(data.xpath(title_xpath))
author = ''.join(data.xpath(author_xpath))
try:
author = author.split('by ', 1)[1].split(" (")[0]
except:
pass
author = ''.join(data.xpath('.//h3[@class="title"]/span[@class="ptBrand"]/text()'))
if author.startswith('by '):
author = author[3:]
price = ''.join(data.xpath(price_xpath))
counter -= 1
@ -78,37 +102,10 @@ class AmazonUKKindleStore(StorePlugin):
s.author = author.strip()
s.price = price.strip()
s.detail_item = asin.strip()
s.drm = SearchResult.DRM_UNKNOWN
s.formats = 'Kindle'
yield s
def get_details(self, search_result, timeout):
# We might already have been called.
if search_result.drm:
return
url = 'http://amazon.co.uk/dp/'
drm_search_text = u'Simultaneous Device Usage'
drm_free_text = u'Unlimited'
br = browser()
with closing(br.open(url + search_result.detail_item, timeout=timeout)) as nf:
idata = html.fromstring(nf.read())
if not search_result.author:
search_result.author = ''.join(idata.xpath('//div[@class="buying" and contains(., "Author")]/a/text()'))
is_kindle = idata.xpath('boolean(//div[@class="buying"]/h1/span/span[contains(text(), "Kindle Edition")])')
if is_kindle:
search_result.formats = 'Kindle'
if idata.xpath('boolean(//div[@class="content"]//li/b[contains(text(), "' +
drm_search_text + '")])'):
if idata.xpath('boolean(//div[@class="content"]//li[contains(., "' +
drm_free_text + '") and contains(b, "' +
drm_search_text + '")])'):
search_result.drm = SearchResult.DRM_UNLOCKED
else:
search_result.drm = SearchResult.DRM_UNKNOWN
else:
search_result.drm = SearchResult.DRM_LOCKED
return True
pass
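
The new result markup no longer exposes the ASIN as an @name attribute, hence the regex over the product link's href. In isolation (example hrefs invented):

import re

asin_pat = re.compile(r'/dp/(?P<asin>.+?)(/|$)')
for href in ('http://www.amazon.co.uk/dp/B004TLJ626/ref=sr_1_1',
             'http://www.amazon.co.uk/dp/B004TLJ626'):
    m = asin_pat.search(href)
    print(m.group('asin'))  # B004TLJ626 both times
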

View File

@ -6,7 +6,6 @@ __license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
import random
import re
import urllib2
from contextlib import closing
@ -25,23 +24,12 @@ from calibre.gui2.store.web_store_dialog import WebStoreDialog
class EHarlequinStore(BasicStoreConfig, StorePlugin):
def open(self, parent=None, detail_item=None, external=False):
url = 'http://www.harlequin.com/'
if external or self.config.get('open_external', False):
open_url(QUrl(url_slash_cleaner(detail_item if detail_item else url)))
else:
d = WebStoreDialog(self.gui, url, parent, detail_item)
d.setWindowTitle(self.name)
d.set_tags(self.config.get('tags', ''))
d.exec_()
@@ -74,7 +62,7 @@ class EHarlequinStore(BasicStoreConfig, StorePlugin):
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = 'http://ebooks.eharlequin.com/' + id.strip()
s.formats = 'EPUB'
yield s
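The eharlequin change mirrors the other stores in this commit: detail_item now carries the bare product URL instead of an affiliate-wrapped '?url=' fragment, so open() can pass it straight through. The surrounding search() loop is the common lxml idiom these plugins share; a self-contained sketch of that idiom, with a plain namedtuple standing in for calibre's SearchResult and the URL and xpaths purely illustrative:

import urllib2
from collections import namedtuple
from contextlib import closing
from lxml import html

Result = namedtuple('Result', 'title author price detail_item')

def scrape(url, max_results=10, timeout=60):
    with closing(urllib2.urlopen(url, timeout=timeout)) as f:
        doc = html.fromstring(f.read())
    for data in doc.xpath('//div[contains(@class, "product")]')[:max_results]:
        # ''.join() collapses an empty xpath hit to '' instead of raising
        title = ''.join(data.xpath('.//h3/a/text()')).strip()
        author = ''.join(data.xpath('.//span[@class="author"]/text()')).strip()
        price = ''.join(data.xpath('.//span[@class="price"]/text()')).strip()
        href = ''.join(data.xpath('.//h3/a/@href'))
        yield Result(title, author, price, href)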

View File

@@ -68,10 +68,10 @@ class GoogleBooksStore(BasicStoreConfig, StorePlugin):
continue
title = ''.join(data.xpath('.//h3/a//text()'))
authors = data.xpath('.//div[@class="f"]//a//text()')
while authors and authors[-1].strip().lower() in ('preview', 'read', 'more editions'):
authors = authors[:-1]
if not authors:
continue
author = ', '.join(authors)
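The move from if/else to a while loop is the substantive fix here: Google may append several non-author tokens ('Preview', 'Read', 'More editions') to the same link list, and the old code discarded the whole result whenever the last token was unexpected. A quick standalone illustration of the corrected behaviour, with the token list taken from the code above:

NOISE = ('preview', 'read', 'more editions')

def clean_authors(authors):
    # Strip every trailing navigation token, not just the last one.
    while authors and authors[-1].strip().lower() in NOISE:
        authors = authors[:-1]
    return authors

print(clean_authors(['John Doe', 'Preview', 'More editions']))  # ['John Doe']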

View File

@@ -7,6 +7,7 @@ __copyright__ = '2011, John Schember <john@nachtimwald.com>'
__docformat__ = 'restructuredtext en'
import random
import urllib
import urllib2
from contextlib import closing
@ -24,23 +25,24 @@ from calibre.gui2.store.web_store_dialog import WebStoreDialog
class KoboStore(BasicStoreConfig, StorePlugin):
def open(self, parent=None, detail_item=None, external=False):
pub_id = 'sHa5EXvYOwA'
# Use Kovid's affiliate id 30% of the time.
if random.randint(1, 10) in (1, 2, 3):
pub_id = '0dsO3kDu/AU'
murl = 'http://click.linksynergy.com/fs-bin/click?id=%s&offerid=268429.4&type=3&subid=0' % pub_id
if detail_item:
purl = 'http://click.linksynergy.com/link?id=%s&offerid=268429&type=2&murl=%s' % (pub_id, urllib.quote_plus(detail_item))
url = purl
else:
purl = None
url = murl
if external or self.config.get('open_external', False):
open_url(QUrl(url_slash_cleaner(url)))
else:
d = WebStoreDialog(self.gui, murl, parent, purl)
d.setWindowTitle(self.name)
d.set_tags(self.config.get('tags', ''))
d.exec_()
@@ -60,15 +62,19 @@ class KoboStore(BasicStoreConfig, StorePlugin):
id = ''.join(data.xpath('.//div[@class="SearchImageContainer"]/a[1]/@href'))
if not id:
continue
try:
id = id.split('?', 1)[0]
except:
continue
price = ''.join(data.xpath('.//span[@class="OurPrice"]/strong/text()'))
price = ''.join(data.xpath('.//span[@class="KV2OurPrice"]/strong/text()'))
if not price:
price = '$0.00'
cover_url = ''.join(data.xpath('.//div[@class="SearchImageContainer"]//img[1]/@src'))
title = ''.join(data.xpath('.//div[@class="SCItemHeader"]/h1/a[1]/text()'))
author = ', '.join(data.xpath('.//div[@class="SCItemSummary"]//span//a/text()'))
title = ''.join(data.xpath('.//div[@class="SCItemHeader"]//a[1]/text()'))
author = ', '.join(data.xpath('.//div[@class="SCItemSummary"]//span[contains(@class, "Author")]//a/text()'))
drm = data.xpath('boolean(.//span[@class="SCAvailibilityFormatsText" and not(contains(text(), "DRM-Free"))])')
counter -= 1
@@ -78,7 +84,7 @@ class KoboStore(BasicStoreConfig, StorePlugin):
s.title = title.strip()
s.author = author.strip()
s.price = price.strip()
s.detail_item = 'http://www.kobobooks.com/' + id.strip()
s.drm = SearchResult.DRM_LOCKED if drm else SearchResult.DRM_UNLOCKED
s.formats = 'EPUB'
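The Kobo plugin swaps its dpbolvw.net click-ids for LinkSynergy redirects while keeping the same affiliate split: Kovid's publisher id is used roughly 30% of the time (randint(1, 10) landing in (1, 2, 3)), and the product URL is percent-encoded into the murl parameter of the deep link. A sketch of just the URL construction, using the ids from the diff above:

import random
import urllib

def kobo_urls(detail_item=None):
    pub_id = 'sHa5EXvYOwA'
    if random.randint(1, 10) in (1, 2, 3):  # ~30% of clicks
        pub_id = '0dsO3kDu/AU'
    murl = ('http://click.linksynergy.com/fs-bin/click'
            '?id=%s&offerid=268429.4&type=3&subid=0' % pub_id)
    purl = None
    if detail_item:
        purl = ('http://click.linksynergy.com/link?id=%s&offerid=268429'
                '&type=2&murl=%s' % (pub_id, urllib.quote_plus(detail_item)))
    return murl, purl

murl, purl = kobo_urls('http://www.kobobooks.com/ebook/example')  # example item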

View File

@@ -25,7 +25,7 @@ class LibreDEStore(BasicStoreConfig, StorePlugin):
def open(self, parent=None, detail_item=None, external=False):
url = 'http://ad.zanox.com/ppc/?18817073C15644254T'
url_details = ('http://ad.zanox.com/ppc/?18817073C15644254T&ULP=[['
'http://www.ebook.de/shop/action/productDetails?artiId={0}]]')
if external or self.config.get('open_external', False):
if detail_item:
@@ -41,33 +41,38 @@ class LibreDEStore(BasicStoreConfig, StorePlugin):
d.exec_()
def search(self, query, max_results=10, timeout=60):
url = ('http://www.ebook.de/de/pathSearch?nav=52122&searchString='
+ urllib2.quote(query))
br = browser()
counter = max_results
with closing(br.open(url, timeout=timeout)) as f:
doc = html.fromstring(f.read())
for data in doc.xpath('//div[contains(@class, "item")]'):
for data in doc.xpath('//div[contains(@class, "articlecontainer")]'):
if counter <= 0:
break
details = data.xpath('./div[@class="beschreibungContainer"]')
details = data.xpath('./div[@class="articleinfobox"]')
if not details:
continue
details = details[0]
id_ = ''.join(details.xpath('./a/@name')).strip()
if not id_:
continue
title = ''.join(details.xpath('.//a[@class="su1_c_l_titel"]/text()')).strip()
author = ''.join(details.xpath('.//div[@class="author"]/text()')).strip()
if author.startswith('von'):
author = author[4:]
pdf = details.xpath(
'boolean(.//span[@class="format" and contains(text(), "pdf")]/text())')
'boolean(.//span[@class="bindername" and contains(text(), "pdf")]/text())')
epub = details.xpath(
'boolean(.//span[@class="format" and contains(text(), "epub")]/text())')
'boolean(.//span[@class="bindername" and contains(text(), "epub")]/text())')
mobi = details.xpath(
'boolean(.//span[@class="format" and contains(text(), "mobipocket")]/text())')
'boolean(.//span[@class="bindername" and contains(text(), "mobipocket")]/text())')
cover_url = ''.join(data.xpath('.//div[@class="coverImg"]/a/img/@src'))
price = ''.join(data.xpath('.//span[@class="preis"]/text()')).replace('*', '').strip()
counter -= 1
@@ -78,7 +83,7 @@ class LibreDEStore(BasicStoreConfig, StorePlugin):
s.author = author.strip()
s.price = price
s.drm = SearchResult.DRM_UNKNOWN
s.detail_item = id_
formats = []
if epub:
formats.append('ePub')
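The format probes above lean on XPath's boolean() function, which makes each query return True/False rather than a node list; the flags are then folded into a comma-separated formats string. A condensed, self-contained sketch of the same technique against the new 'bindername' markup (the HTML snippet is a made-up example):

from lxml import html

def detect_formats(details):
    # boolean(...) makes lxml return a Python bool instead of a node list.
    probe = ('boolean(.//span[@class="bindername" and '
             'contains(text(), "%s")]/text())')
    found = [(details.xpath(probe % key), label)
             for key, label in (('pdf', 'PDF'), ('epub', 'ePub'),
                                ('mobipocket', 'MOBI'))]
    return ', '.join(label for hit, label in found if hit)

details = html.fromstring('<div><span class="bindername">epub</span>'
                          '<span class="bindername">pdf</span></div>')
print(detect_formats(details))  # -> PDF, ePub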

Some files were not shown because too many files have changed in this diff.