GwR integration of epub/mobi

GRiker 2010-01-22 11:03:04 -07:00
commit 64611c1b0d
26 changed files with 10099 additions and 7324 deletions

@ -4,6 +4,131 @@
# for important features/bug fixes.
# Also, each release can have new and improved recipes.
- version: 0.6.35
date: 2010-01-22
new features:
- title: Catalog generation
type: major
description: >
"You can now easily generate a catalog of all books in your calibre library by clicking the arrow next to the convert button. The catalog can be in one of several formats: XML, CSV, EPUB and MOBI, with scope for future formats via plugins. If you generate the catalog in an e-book format, it will be automatically sent to your e-book reader the next time you connect it, allowing you to easily browse your collection on the reader itself."
- title: "RTF Input: Support for unicode characters. Needs testing."
type: major
tickets: [4501]
- title: "Add Quick Start Guide by John Schember to calibre library on first run of calibre"
type: major
- title: "Improve handling of justification"
description: >
"Now calibre will explicitly change the justification of all left aligned paragraphs to justified or vice versa depending on the justification setting. This should make it possible to robustly convert all content to either justified or not. calibre will not touch centered or right aligned content."
- title: "E-book viewer: Fit images to viewer window (can be turned off via Preferences)"
- title: "Add section on E-book viewer to User Manual"
- title: "Development environment: First look for resources in the location pointed to by CALIBRE_DEVELOP_FROM. If not found, use the normal resource location"
- title: "When reading metadata from filenames, with the Swap author names option checked, improve the logic used to detect author last name."
tickets: [4620]
- title: "News downloads: When getting an article URL from a RSS feed, look first for an original article link. This speeds up the download of news services that use a syndication service like feedburner or pheedo to publish their RSS feeds."
bug fixes:
- "Windows device detection: Don't do expensive polling while waiting for device disconnect. This should fix the problems people have with their floppy drive being activated while an e-book reader is connected"
- title: "PML Input: Fix creation of metadata Table of Contents"
tickets: [5633]
- title: "Fix Tag browser not updating after using delete specific format actions"
tickets: [4632]
- title: "MOBI Output: Don't die when converting EPUB files with SVG covers"
- title: "Nook driver: Remove the # character from filenames when sending to device"
tickets: [4629]
- title: "Workaround for bug in QtWebKit on windows that could cause crashes when using the next page button in the e-book viewer for certain files"
tickets: [4606]
- title: "MOBI Input: Rescale img width and height attributes that were specified in em units"
tickets: [4608]
- title: "ebook-meta: Fix setting of series metadata"
- title: "RTF metadata: Fix reading metadata from very small files"
- title: "Conversion pipeline: Don't error out if the user sets an invalid chapter detection XPath"
- title: "Fix main mem and card being swapped in pocketbook detection on OS X"
- title: "Welcome wizard: Set the language to english if the user doesn't explicitly change the language. This ensures that the language will be english on windows by default"
- title: "Fix bug in OEBWriter that could cause writing out of resources in subdirectories with URL unsafe names to fail"
new recipes:
- title: Frankfurter Rundschau
author: Justus Bisser
- title: The Columbia Journalism Review
author: XanthanGum
- title: Various CanWest Canadian news sources
author: Nick Redding
- title: digitaljournal.com
author: Darko Miletic
- title: Pajamas Media
author: Krittika Goyal
- title: Algemeen Dagblad
author: kwetal
- title: "The Reader's Digest"
author: BrianG
- title: The Yemen Times
author: kwetal
- title: The Kitsap Sun
author: Darko Miletic
- title: drivelry.com
author: Krittika Goyal
- title: New recipe for Google Reader that downloads unread articles instead of just starred ones
author: rollercoaster
- title: Le Devoir
author: Lorenzo Vigentini
- title: Joop
author: kwetal
- title: Various computer magazines
author: Lorenzo Vigentini
- title: The Wall Street Journal (free parts)
author: Nick Redding
- title: Journal of Nephrology
author: Krittika Goyal
- title: stuff.co.nz
author: Krittika Goyal
improved recipes:
- Physics Today
- Wall Street Journal
- American Spectator
- FTD
- The National Post
- Blic
- version: 0.6.34
date: 2010-01-15

@ -0,0 +1,67 @@
__license__ = 'GPL v3'
__copyright__ = '2009, Justus Bisser <justus.bisser at gmail.com>'
'''
fr-online.de
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
class Spiegel_ger(BasicNewsRecipe):
title = 'Frankfurter Rundschau'
__author__ = 'Justus Bisser'
description = "Dies ist die Online-Ausgabe der Frankfurter Rundschau. Um die abgerufenen individuell einzustellen bearbeiten sie die Liste im erweiterten Modus. Die Feeds findet man auf http://www.fr-online.de/verlagsservice/fr_newsreader/?em_cnt=574255"
publisher = 'Druck- und Verlagshaus Frankfurt am Main GmbH'
category = 'FR Online, Frankfurter Rundschau, Nachrichten, News,Dienste, RSS, RSS, Feedreader, Newsfeed, iGoogle, Netvibes, Widget'
oldest_article = 7
max_articles_per_feed = 100
language = 'de'
lang = 'de-DE'
no_stylesheets = True
use_embedded_content = False
#encoding = 'cp1252'
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : lang
}
recursions = 0
max_articles_per_feed = 100
#keep_only_tags = [dict(name='div', attrs={'class':'text'})]
#tags_remove = [dict(name='div', attrs={'style':'text-align: left; margin: 4px 0px 0px 4px; width: 200px; float: right;'})]
remove_attributes = ['style']
feeds = []
#remove_tags_before = [dict(name='div', attrs={'style':'padding-left: 0px;'})]
#remove_tags_after = [dict(name='div', attrs={'class':'box_head_text'})]
# enable for all news
allNews = 0
if allNews:
feeds = [(u'Frankfurter Rundschau', u'http://www.fr-online.de/rss/sport/index.xml')]
else:
#select the feeds you like
feeds = [(u'Nachrichten', u'http://www.fr-online.de/rss/politik/index.xml')]
feeds.append((u'Kommentare und Analysen', u'http://www.fr-online.de/rss/meinung/index.xml'))
feeds.append((u'Dokumentationen', u'http://www.fr-online.de/rss/dokumentation/index.xml'))
feeds.append((u'Deutschlandtrend', u'http://www.fr-online.de/rss/deutschlandtrend/index.xml'))
feeds.append((u'Wirtschaft', u'http://www.fr-online.de/rss/wirtschaft/index.xml'))
feeds.append((u'Sport', u'http://www.fr-online.de/rss/sport/index.xml'))
feeds.append((u'Feuilleton', u'http://www.fr-online.de/rss/feuilleton/index.xml'))
feeds.append((u'Panorama', u'http://www.fr-online.de/rss/panorama/index.xml'))
feeds.append((u'Rhein Main und Hessen', u'http://www.fr-online.de/rss/hessen/index.xml'))
feeds.append((u'Fitness und Gesundheit', u'http://www.fr-online.de/rss/fit/index.xml'))
feeds.append((u'Multimedia', u'http://www.fr-online.de/rss/multimedia/index.xml'))
feeds.append((u'Wissen und Bildung', u'http://www.fr-online.de/rss/wissen/index.xml'))
def get_article_url(self, article):
url = article.link
regex = re.compile("0C[0-9]{6,8}0A?")
liste = regex.findall(url)
string = liste.pop(0)
string = string[2:len(string)-1]
return "http://www.fr-online.de/_em_cms/_globals/print.php?em_cnt=" + string
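
A standalone sketch of the URL rewrite that `get_article_url` performs, written for Python 3 and without calibre. Only the regular expression, the slice and the print URL prefix are taken from the recipe above. The function name and the `None` fallback are assumptions made for illustration; the recipe itself would raise an `IndexError` from `liste.pop(0)` when a link contains no article ID.

```python
import re

# Prefix of the print view, as used by the recipe.
PRINT_URL = "http://www.fr-online.de/_em_cms/_globals/print.php?em_cnt="

# Same pattern as the recipe: "0C", six to eight digits, "0", optional "A".
ID_PATTERN = re.compile(r"0C[0-9]{6,8}0A?")


def print_url_from_link(link):
    """Return the print-view URL for a feed link, or None if no ID is found."""
    match = ID_PATTERN.search(link)
    if match is None:
        # Assumption: skip the rewrite instead of raising like liste.pop(0) would.
        return None
    token = match.group(0)
    # Same slice as the recipe: drop the leading "0C" and the final character.
    return PRINT_URL + token[2:len(token) - 1]
```

Note that the slice always drops exactly one trailing character, whether or not the optional `A` was matched, so the two cases may yield IDs of different shapes. Whether that matters depends on how the feed encodes its links, which the diff does not show.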

@ -9,16 +9,16 @@ from calibre.web.feeds.news import BasicNewsRecipe
  class FTDe(BasicNewsRecipe):
  title = 'FTD'
  description = 'Financial Times Deutschland'
  __author__ = 'Oliver Niesner'
  use_embedded_content = False
  timefmt = ' [%d %b %Y]'
- language = _('German')
+ language = 'de'
  max_articles_per_feed = 40
  no_stylesheets = True
  remove_tags = [dict(id='navi_top'),
  dict(id='topbanner'),
  dict(id='seitenkopf'),
@ -83,8 +83,8 @@ class FTDe(BasicNewsRecipe):
  dict(name='div', attrs={'class':'articleOptionFootFrame'}),
  dict(name='div', attrs={'class':'artikelsplitfaq'})]
  #remove_tags_after = [dict(name='a', attrs={'class':'more'})]
  feeds = [ ('Finanzen', 'http://www.ftd.de/rss2/finanzen/maerkte'),
  ('Meinungshungrige', 'http://www.ftd.de/rss2/meinungshungrige'),
  ('Unternehmen', 'http://www.ftd.de/rss2/unternehmen'),
  ('Politik', 'http://www.ftd.de/rss2/politik'),
@ -95,8 +95,8 @@ class FTDe(BasicNewsRecipe):
  ('Auto', 'http://www.ftd.de/rss2/auto'),
  ('Lifestyle', 'http://www.ftd.de/rss2/lifestyle')
  ]
  def print_version(self, url):
  return url.replace('.html', '.html?mode=print')

@ -8,6 +8,7 @@ __license__ = 'GPL v3'
  __copyright__ = '2009, John Schember <john@nachtimwald.com>'
  __docformat__ = 'restructuredtext en'
+ import os
  import re
  import StringIO
@ -198,14 +199,26 @@ class PML_HTMLizer(object):
  def start_line(self):
  start = u''
+ div = []
+ span = []
+ other = []
  for key, val in self.state.items():
  if val[0]:
- if key in self.STATES_VALUE_REQ:
- start += self.STATES_TAGS[key][0] % val[1]
- elif key in self.STATES_VALUE_REQ_2:
- start += self.STATES_TAGS[key][0] % (val[1], val[1])
+ if key in self.DIV_STATES:
+ div.append((key, val[1]))
+ elif key in self.SPAN_STATES:
+ span.append((key, val[1]))
  else:
- start += self.STATES_TAGS[key][0]
+ other.append((key, val[1]))
+ for key, val in other+div+span:
+ if key in self.STATES_VALUE_REQ:
+ start += self.STATES_TAGS[key][0] % val
+ elif key in self.STATES_VALUE_REQ_2:
+ start += self.STATES_TAGS[key][0] % (val, val)
+ else:
+ start += self.STATES_TAGS[key][0]
  return u'<p>%s' % start
@ -518,7 +531,7 @@ class PML_HTMLizer(object):
  elif c == 'C':
  line.read(1)
  id = 'pml_toc-%s' % len(self.toc)
- self.toc.add_item(self.file_name, id, self.code_value(line))
+ self.toc.add_item(os.path.basename(self.file_name), id, self.code_value(line))
  text = '<span id="%s"></span>' % id
  elif c == 'n':
  pass
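
The reordering introduced in `start_line` can be sketched outside calibre roughly as follows (Python 3). The state keys, tag templates and set contents below are hypothetical stand-ins, not the real `PML_HTMLizer` tables. Only the grouping order (other, then div, then span) and the three formatting cases come from the hunk above.

```python
# Hypothetical stand-ins for the class attributes of PML_HTMLizer.
STATES_TAGS = {
    "a": ('<a href="%s">', "</a>"),
    "Q": ('<a id="%s" name="%s">', "</a>"),
    "c": ('<div class="center">', "</div>"),
    "i": ('<span class="italic">', "</span>"),
}
STATES_VALUE_REQ = {"a"}      # opening tag takes the value once
STATES_VALUE_REQ_2 = {"Q"}    # opening tag takes the value twice
DIV_STATES = {"c"}
SPAN_STATES = {"i"}


def start_line(state):
    """Build the opening tags for a line from a {key: (active, value)} mapping."""
    div, span, other = [], [], []
    for key, (active, value) in state.items():
        if not active:
            continue
        if key in DIV_STATES:
            div.append((key, value))
        elif key in SPAN_STATES:
            span.append((key, value))
        else:
            other.append((key, value))

    start = ""
    # Emit the remaining states first, then div states, then span states.
    for key, value in other + div + span:
        opening = STATES_TAGS[key][0]
        if key in STATES_VALUE_REQ:
            start += opening % value
        elif key in STATES_VALUE_REQ_2:
            start += opening % (value, value)
        else:
            start += opening
    return "<p>%s" % start
```

With this grouping, span-level tags are always opened after block-level ones, regardless of the order in which the states appear in the mapping.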

@ -9,14 +9,22 @@ __docformat__ = 'restructuredtext en'
  from calibre.gui2 import gprefs
  from catalog_epub_mobi_ui import Ui_Form
+ from calibre.ebooks.conversion.config import load_defaults
  from PyQt4.Qt import QWidget
  class PluginWidget(QWidget,Ui_Form):
  TITLE = _('EPUB/MOBI Options')
  HELP = _('Options specific to')+' EPUB/MOBI '+_('output')
- # Indicates whether this plugin wants its output synced to the connected device
+ OPTION_FIELDS = [('exclude_genre','\[[\w ]*\]'),
+ ('exclude_tags','~'),
+ ('read_tag','+'),
+ ('note_tag','*')]
+ # Output synced to the connected device?
  sync_enabled = True
+ # Formats supported by this plugin
  formats = set(['epub','mobi'])
  def __init__(self, parent=None):
@ -26,20 +34,25 @@ class PluginWidget(QWidget,Ui_Form):
  def initialize(self, name):
  self.name = name
  # Restore options from last use here
- print "gui2.catalog.catalog_epub_mobi:initialize(): need to restore options"
- def options(self):
- OPTION_FIELDS = ['exclude_genre','exclude_tags','read_tag','note_tag','output_profile']
- # Save the current options
- print "gui2.catalog.catalog_epub_mobi:options(): need to save options"
- # Return a dictionary with current options
- print "gui2.catalog.catalog_epub_mobi:options(): need to return options"
- print "gui2.catalog.catalog_epub_mobi:options(): using hard-coded options"
+ print "gui2.catalog.catalog_epub_mobi:initialize(): Retrieving options"
+ for opt in self.OPTION_FIELDS:
+ opt_value = gprefs[self.name + '_' + opt[0]]
+ print "Restoring %s: %s" % (self.name + '_' + opt[0], opt_value)
+ setattr(self,opt[0], unicode(opt_value))
+ def options(self):
+ # Save/return the current options
+ # getattr() returns text value of QLineEdit control
+ print "gui2.catalog.catalog_epub_mobi:options(): Saving options"
  opts_dict = {}
- for opt in OPTION_FIELDS:
- opts_dict[opt] = str(getattr(self,opt).text()).split(',')
+ for opt in self.OPTION_FIELDS:
+ opt_value = unicode(getattr(self,opt[0]))
+ print "writing %s to gprefs" % opt_value
+ gprefs.set(self.name + '_' + opt[0], opt_value)
+ opts_dict[opt[0]] = opt_value.split(',')
+ opts_dict['output_profile'] = [load_defaults('page_setup')['output_profile']]
  return opts_dict
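
The save/restore round trip that `initialize()` and `options()` aim for can be sketched without Qt or calibre (Python 3). A plain `dict` stands in for `gprefs` and plain strings stand in for the `QLineEdit` values. The helper names and the `"epub_mobi"` plugin name in the usage note are hypothetical. Treating the second element of each `OPTION_FIELDS` tuple as the fallback default is an assumption: the diff defines those values but does not use them when reading from `gprefs`.

```python
# Field names and second tuple elements are copied from the diff.
OPTION_FIELDS = [
    ("exclude_genre", r"\[[\w ]*\]"),
    ("exclude_tags", "~"),
    ("read_tag", "+"),
    ("note_tag", "*"),
]


def restore_options(prefs, name):
    """Read each option from prefs, falling back to the assumed default."""
    return {
        field: prefs.get(name + "_" + field, default)
        for field, default in OPTION_FIELDS
    }


def save_options(prefs, name, values):
    """Store each option under '<name>_<field>' and return comma-split lists."""
    opts = {}
    for field, _default in OPTION_FIELDS:
        value = values[field]
        prefs[name + "_" + field] = value
        opts[field] = value.split(",")
    return opts
```

For example, calling `restore_options({}, "epub_mobi")` on an empty store yields the defaults, and a later `save_options` call writes the edited values back under keys such as `epub_mobi_exclude_tags`.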

@ -58,19 +58,6 @@
  <string>Additional note tag prefix:</string>
  </property>
  </widget>
- <widget class="QLabel" name="label_5">
- <property name="geometry">
- <rect>
- <x>20</x>
- <y>140</y>
- <width>181</width>
- <height>17</height>
- </rect>
- </property>
- <property name="text">
- <string>Output profile:</string>
- </property>
- </widget>
  <widget class="QLineEdit" name="exclude_genre">
  <property name="geometry">
  <rect>
@ -148,22 +135,6 @@
  <string>*</string>
  </property>
  </widget>
- <widget class="QLineEdit" name="output_profile">
- <property name="geometry">
- <rect>
- <x>300</x>
- <y>140</y>
- <width>231</width>
- <height>22</height>
- </rect>
- </property>
- <property name="toolTip">
- <string extracomment="Tooltip comment here"/>
- </property>
- <property name="text">
- <string>kindle2</string>
- </property>
- </widget>
  </widget>
  <resources/>
  <connections/>

@ -123,14 +123,14 @@ class Catalog(QDialog, Ui_Dialog):
  if self.sync.isEnabled():
  self.sync.setChecked(dynamic.get('catalog_sync_to_device', True))
- self.format.currentIndexChanged.connect(self.format_changed)
+ self.format.currentIndexChanged.connect(self.show_plugin_tab)
  self.show_plugin_tab(None)
  def show_plugin_tab(self, idx):
  cf = unicode(self.format.currentText()).lower()
  while self.tabs.count() > 1:
- self.tabs.remove(1)
+ self.tabs.removeTab(1)
  for pw in self.widgets:
  if cf in pw.formats:
  self.tabs.addTab(pw, pw.TITLE)

@ -267,7 +267,6 @@ class EPUB_MOBI(CatalogPlugin):
  "Applies to: ePub, MOBI output formats"))
  ]
  class NumberToText(object):
  '''
  Converts numbers to text

@ -124,7 +124,7 @@ Convert e-books
  ~~~~~~~~~~~~~~~~~~~~~~
  .. |cei| image:: images/convert_ebooks.png
- |cei| Ebooks can be converted from a number of formats into the LRF format (for the SONY Reader). Note that ebooks you purchase will typically have `Digital Rights Management <http://en.wikipedia.org/wiki/Digital_rights_management>`_ *(DRM)*. |app| will not convert these ebooks. For many DRM formats, it is easy to remove the DRM, but as this is illegal, you have to find tools to liberate your books yourself and then use |app| to convert them.
+ |cei| Ebooks can be converted from a number of formats into the LRF format (for the SONY Reader). Note that ebooks you purchase will typically have `Digital Rights Management <http://bugs.calibre-ebook.com/wiki/DRM>`_ *(DRM)*. |app| will not convert these ebooks. For many DRM formats, it is easy to remove the DRM, but as this is illegal, you have to find tools to liberate your books yourself and then use |app| to convert them.
  For most people, conversion should be a simple 1-click affair. But if you want to learn more about the conversion process, see :ref:`conversion`.
@ -134,7 +134,7 @@ The :guilabel:`Convert E-books` action has three variations, accessed by the arr
  2. **Bulk convert**: This allows you to specify options only once to convert a number of ebooks in bulk.
- 3. **Set conversion defaults**: Allows you to set the default settings for future conversions.
+ 3. **Create catalog**: This action allows you to generate a complete listing, with all metadata, of the books in your library in several formats, such as XML, CSV, EPUB and MOBI. The catalog will contain all the books currently showing in the library view, so you can use the search features to limit the books to be catalogued. In addition, if you select multiple books using the mouse, only those books will be added to the catalog. If you generate the catalog in an e-book format such as EPUB or MOBI, the next time you connect your e-book reader, the catalog will be automatically sent to the device.
  .. _view:

File diff suppressed because it is too large Load Diff
