sync with Kovid's branch

Tomasz Długosz 2013-05-23 23:44:42 +02:00
commit 5df48e1b4d
201 changed files with 94855 additions and 76144 deletions

@ -20,6 +20,96 @@
# new recipes:
# - title:
- version: 0.9.31
date: 2013-05-17
new features:
- title: "Book list: Highlight the current cell in the book list, particularly convenient for usage with the keyboard."
- title: "Allow creation of advanced rules for column icons."
- title: "Driver for the limited edition SONY PRS-T2N"
- title: "MOBI Input: Add support for MOBI/KF8 files generated with the to-be-released kindlegen 2.9."
tickets: [1179144]
bug fixes:
- title: "ToC Editor: Fix incorrect playOrders in the generated toc.ncx when editing the toc in an epub file. This apparently affects FBReader."
- title: "PDF Input: Fix crashes on some malformed files, by updating the PDF library calibre uses (poppler 0.22.4)"
- title: "PDF Output: Ignore invalid links instead of erroring out on them."
tickets: [1179314]
- title: "MOBI Output: Fix space erroneously being removed when the input document contains a tag with leading space and sub-tags."
tickets: [1179216]
- title: "Search and replace wizard: Fix generated html being slightly different from the actual html in the conversion pipeline for some input formats (mainly HTML, CHM, LIT)."
improved recipes:
- Weblogs SL
- .net magazine
new recipes:
- title: nrc-next
author: Niels Giesen
- version: 0.9.30
date: 2013-05-10
new features:
- title: "Kobo driver: Add support for showing 'Archived' books on the device. Also up the supported firmware version to 2.5.3."
tickets: [1177677]
- title: "Driver for Blackberry 9790"
tickets: [1176607]
- title: "Add a tweak to turn off the highlighting of the book count when using a virtual library (Preferences->Tweaks)"
- title: "Add a button to clear the viewer search history in the viewer Preferences, under Miscellaneous"
- title: "Add keyboard shortcuts to clear the virtual Library and the additional restriction (Ctrl+Esc and Alt+Esc). Also use Shift+Esc to bring keyboard focus back to the book list. Can be changed under Preferences->Keyboard"
- title: "Docx metadata: Read the language of the file, if present"
bug fixes:
- title: "Kobo driver: Fix unable to read SD card on OS X/Linux"
tickets: [1174815]
- title: "Content server: Fix unable to download ORIGINAL_* formats"
tickets: [1177158]
- title: "Fix regression that broke searching for terms containing a quote mark"
tickets: [1177114]
- title: "Fix regression that broke conversion of txt files when no input encoding is specified"
tickets: [1176622]
- title: "When changing to a virtual library, refresh the Book Details panel."
tickets: [1176296]
- title: "Fix regression that caused searching for user categories to break."
tickets: [1176187]
- title: "Fix error when downloading only covers and reviewing downloaded metadata."
tickets: [1176253]
- title: "MOBI metadata: Strip XML unsafe unicode codepoints when reading metadata from MOBI files."
tickets: [1175965]
- title: "Txt Input: Use the gbk encoding for txt files with detected encoding of gb2312."
tickets: [1175974]
- title: "When pressing Ctrl+Home/End preserve the horizontal scroll position in the book list"
improved recipes:
- NSFW
- Go Comics
- Various Polish news sources
- The Sun
- version: 0.9.29
date: 2013-05-03

@ -504,6 +504,31 @@ There is a search bar at the top of the Tag Browser that allows you to easily fi
You can control how items are sorted in the Tag browser via the box at the bottom of the Tag Browser. You can choose to sort by name, average rating or popularity (popularity is the number of books with an item in your library; for example, the popularity of Isaac Asimov is the number of books in your library by Isaac Asimov).
Quickview
----------
Sometimes you want to select a book and quickly get a list of books with the same value in some category (authors, tags, publisher, series, etc.) as the currently selected book, but without changing the current view of the library. You can do this with Quickview. Quickview opens a second window showing the list of books matching the value of interest.
For example, assume you want to see a list of all the books with the same author as the currently selected book. Click in the author cell you are interested in and press the 'Q' key. A window will open with all the authors for that book on the left, and all the books by the selected author on the right.
Some example Quickview usages: quickly seeing which other books:
- have some tag that is applied to the currently selected book,
- are in the same series as the current book,
- have the same values in a custom column as the current book,
- are written by one of the same authors as the current book,
without changing the contents of the library view.
The Quickview window opens on top of the |app| window and will stay open until you explicitly close it. You can use Quickview and the |app| library view at the same time. For example, if in the |app| library view you click on a category column (tags, series, publisher, authors, etc.) for a book, the Quickview window contents will change to show you in the left-hand pane the items in that category for the selected book (e.g., the tags for that book). The first item in that list will be selected, and Quickview will show you in the right-hand pane all the books in your library that reference that item. Click on a different item in the left-hand pane to see the books with that different item.
Double-click on a book in the Quickview window to select that book in the library view. This will also change the items displayed in the Quickview window (the left-hand pane) to show the items in the newly selected book.
Shift- (or Ctrl-) double-click on a book in the Quickview window to open the edit metadata dialog on that book in the |app| window.
You can see if a column can be Quickview'ed by hovering your mouse over the column heading and looking at the tooltip for that heading. You can also tell by right-clicking on the column heading to see if the "Quickview" option is shown in the menu, in which case choosing that Quickview option is equivalent to pressing 'Q' in the current cell.
Quickview respects the virtual library setting, showing only books in the current virtual library.
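As a rough illustration of what the two Quickview panes show, the following Python sketch collects the items of one category for the current book and then lists every book that contains the selected item. The ``Book`` class, the ``quickview`` function and the sample library are hypothetical and are not part of |app|'s code; the sketch only mirrors the behavior described above.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    # Hypothetical stand-in for a library entry; not calibre's data model.
    title: str
    authors: list = field(default_factory=list)
    tags: list = field(default_factory=list)


def quickview(library, current, category, item=None):
    """Return (items, matches) for the two Quickview panes.

    items   -- values of `category` on the current book (left-hand pane)
    matches -- titles of books containing the selected item (right-hand pane)
    """
    items = list(getattr(current, category))
    if not items:
        return [], []
    # The first item is selected by default, as described in the text.
    selected = item if item is not None else items[0]
    matches = [b.title for b in library if selected in getattr(b, category)]
    return items, matches


library = [
    Book("Foundation", ["Isaac Asimov"], ["scifi"]),
    Book("I, Robot", ["Isaac Asimov"], ["scifi", "robots"]),
    Book("Dune", ["Frank Herbert"], ["scifi"]),
]

# Equivalent of pressing 'Q' in the author cell of the first book.
items, matches = quickview(library, library[0], "authors")
print(items, matches)
```

Passing a different ``item`` corresponds to clicking another entry in the left-hand pane; the library list itself is never modified, just as Quickview leaves the library view unchanged.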
Jobs
-----
.. image:: images/jobs.png

@ -57,6 +57,26 @@ library. The virtual library will then be created based on the search
you just typed in. Searches are very powerful, for examples of the kinds
of things you can do with them, see :ref:`search_interface`.
Examples of useful Virtual Libraries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* Books added to |app| in the last day::
date:>1daysago
* Books added to |app| in the last month::
date:>30daysago
* Books with a rating of 5 stars::
rating:5
* Books with a rating of at least 4 stars::
rating:>=4
* Books with no rating::
rating:false
* Periodicals downloaded by the Fetch News function in |app|::
tags:=News and author:=calibre
* Books with no tags::
tags:false
* Books with no covers::
cover:false
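To make the meaning of these expressions concrete, here is a minimal Python sketch that treats a few of them as filters over a small list of books. The record layout (``rating``, ``tags``, ``has_cover``) and the ``select`` helper are assumptions made for illustration; they do not reflect how |app| stores books or evaluates searches internally.

```python
# Hypothetical book records; the field names are illustrative only.
books = [
    {"title": "A", "rating": 5, "tags": ["News"], "has_cover": True},
    {"title": "B", "rating": 4, "tags": [], "has_cover": False},
    {"title": "C", "rating": None, "tags": ["fiction"], "has_cover": True},
]

# Each predicate mirrors one of the search expressions listed above.
virtual_libraries = {
    "rating:5": lambda b: b["rating"] == 5,
    "rating:>=4": lambda b: b["rating"] is not None and b["rating"] >= 4,
    "rating:false": lambda b: b["rating"] is None,
    "tags:false": lambda b: not b["tags"],
    "cover:false": lambda b: not b["has_cover"],
}


def select(expression):
    """Return the titles of the books the given expression would keep."""
    predicate = virtual_libraries[expression]
    return [b["title"] for b in books if predicate(b)]


for expression in virtual_libraries:
    print(expression, "->", select(expression))
```

In this sketch a virtual library is just a named filter: the underlying list of books stays the same and only the visible subset changes.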
Working with Virtual Libraries
-------------------------------------

@ -1,224 +0,0 @@
from calibre.web.feeds.news import BasicNewsRecipe
class Comics(BasicNewsRecipe):
title = 'Comics.com'
__author__ = 'Starson17'
description = 'Comics from comics.com. You should customize this recipe to fetch only the comics you are interested in'
language = 'en'
use_embedded_content= False
no_stylesheets = True
oldest_article = 24
remove_javascript = True
cover_url = 'http://www.bsb.lib.tx.us/images/comics.com.gif'
recursions = 0
max_articles_per_feed = 10
num_comics_to_get = 7
simultaneous_downloads = 1
# delay = 3
keep_only_tags = [dict(name='a', attrs={'class':'STR_StripImage'}),
dict(name='div', attrs={'class':'STR_Date'})
]
def parse_index(self):
feeds = []
for title, url in [
("9 Chickweed Lane", "http://comics.com/9_chickweed_lane"),
("Agnes", "http://comics.com/agnes"),
("Alley Oop", "http://comics.com/alley_oop"),
("Andy Capp", "http://comics.com/andy_capp"),
("Arlo & Janis", "http://comics.com/arlo&janis"),
("B.C.", "http://comics.com/bc"),
("Ballard Street", "http://comics.com/ballard_street"),
# ("Ben", "http://comics.com/ben"),
# ("Betty", "http://comics.com/betty"),
# ("Big Nate", "http://comics.com/big_nate"),
# ("Brevity", "http://comics.com/brevity"),
# ("Candorville", "http://comics.com/candorville"),
# ("Cheap Thrills", "http://comics.com/cheap_thrills"),
# ("Committed", "http://comics.com/committed"),
# ("Cow & Boy", "http://comics.com/cow&boy"),
# ("Daddy's Home", "http://comics.com/daddys_home"),
# ("Dog eat Doug", "http://comics.com/dog_eat_doug"),
# ("Drabble", "http://comics.com/drabble"),
# ("F Minus", "http://comics.com/f_minus"),
# ("Family Tree", "http://comics.com/family_tree"),
# ("Farcus", "http://comics.com/farcus"),
# ("Fat Cats Classics", "http://comics.com/fat_cats_classics"),
# ("Ferd'nand", "http://comics.com/ferdnand"),
# ("Flight Deck", "http://comics.com/flight_deck"),
# ("Flo & Friends", "http://comics.com/flo&friends"),
# ("Fort Knox", "http://comics.com/fort_knox"),
# ("Frank & Ernest", "http://comics.com/frank&ernest"),
# ("Frazz", "http://comics.com/frazz"),
# ("Free Range", "http://comics.com/free_range"),
# ("Geech Classics", "http://comics.com/geech_classics"),
# ("Get Fuzzy", "http://comics.com/get_fuzzy"),
# ("Girls & Sports", "http://comics.com/girls&sports"),
# ("Graffiti", "http://comics.com/graffiti"),
# ("Grand Avenue", "http://comics.com/grand_avenue"),
# ("Heathcliff", "http://comics.com/heathcliff"),
# "Heathcliff, a street-smart and mischievous cat with many adventures."
# ("Herb and Jamaal", "http://comics.com/herb_and_jamaal"),
# ("Herman", "http://comics.com/herman"),
# ("Home and Away", "http://comics.com/home_and_away"),
# ("It's All About You", "http://comics.com/its_all_about_you"),
# ("Jane's World", "http://comics.com/janes_world"),
# ("Jump Start", "http://comics.com/jump_start"),
# ("Kit 'N' Carlyle", "http://comics.com/kit_n_carlyle"),
# ("Li'l Abner Classics", "http://comics.com/lil_abner_classics"),
# ("Liberty Meadows", "http://comics.com/liberty_meadows"),
# ("Little Dog Lost", "http://comics.com/little_dog_lost"),
# ("Lola", "http://comics.com/lola"),
# ("Luann", "http://comics.com/luann"),
# ("Marmaduke", "http://comics.com/marmaduke"),
# ("Meg! Classics", "http://comics.com/meg_classics"),
# ("Minimum Security", "http://comics.com/minimum_security"),
# ("Moderately Confused", "http://comics.com/moderately_confused"),
# ("Momma", "http://comics.com/momma"),
# ("Monty", "http://comics.com/monty"),
# ("Motley Classics", "http://comics.com/motley_classics"),
# ("Nancy", "http://comics.com/nancy"),
# ("Natural Selection", "http://comics.com/natural_selection"),
# ("Nest Heads", "http://comics.com/nest_heads"),
# ("Off The Mark", "http://comics.com/off_the_mark"),
# ("On a Claire Day", "http://comics.com/on_a_claire_day"),
# ("One Big Happy Classics", "http://comics.com/one_big_happy_classics"),
# ("Over the Hedge", "http://comics.com/over_the_hedge"),
# ("PC and Pixel", "http://comics.com/pc_and_pixel"),
# ("Peanuts", "http://comics.com/peanuts"),
# ("Pearls Before Swine", "http://comics.com/pearls_before_swine"),
# ("Pickles", "http://comics.com/pickles"),
# ("Prickly City", "http://comics.com/prickly_city"),
# ("Raising Duncan Classics", "http://comics.com/raising_duncan_classics"),
# ("Reality Check", "http://comics.com/reality_check"),
# ("Red & Rover", "http://comics.com/red&rover"),
# ("Rip Haywire", "http://comics.com/rip_haywire"),
# ("Ripley's Believe It or Not!", "http://comics.com/ripleys_believe_it_or_not"),
# ("Rose Is Rose", "http://comics.com/rose_is_rose"),
# ("Rubes", "http://comics.com/rubes"),
# ("Rudy Park", "http://comics.com/rudy_park"),
# ("Scary Gary", "http://comics.com/scary_gary"),
# ("Shirley and Son Classics", "http://comics.com/shirley_and_son_classics"),
# ("Soup To Nutz", "http://comics.com/soup_to_nutz"),
# ("Speed Bump", "http://comics.com/speed_bump"),
# ("Spot The Frog", "http://comics.com/spot_the_frog"),
# ("State of the Union", "http://comics.com/state_of_the_union"),
# ("Strange Brew", "http://comics.com/strange_brew"),
# ("Tarzan Classics", "http://comics.com/tarzan_classics"),
# ("That's Life", "http://comics.com/thats_life"),
# ("The Barn", "http://comics.com/the_barn"),
# ("The Born Loser", "http://comics.com/the_born_loser"),
# ("The Buckets", "http://comics.com/the_buckets"),
# ("The Dinette Set", "http://comics.com/the_dinette_set"),
# ("The Grizzwells", "http://comics.com/the_grizzwells"),
# ("The Humble Stumble", "http://comics.com/the_humble_stumble"),
# ("The Knight Life", "http://comics.com/the_knight_life"),
# ("The Meaning of Lila", "http://comics.com/the_meaning_of_lila"),
# ("The Other Coast", "http://comics.com/the_other_coast"),
# ("The Sunshine Club", "http://comics.com/the_sunshine_club"),
# ("Unstrange Phenomena", "http://comics.com/unstrange_phenomena"),
# ("Watch Your Head", "http://comics.com/watch_your_head"),
# ("Wizard of Id", "http://comics.com/wizard_of_id"),
# ("Working Daze", "http://comics.com/working_daze"),
# ("Working It Out", "http://comics.com/working_it_out"),
# ("Zack Hill", "http://comics.com/zack_hill"),
# ("(Th)ink", "http://comics.com/think"),
# "Tackling the political and social issues impacting communities of color."
# ("Adam Zyglis", "http://comics.com/adam_zyglis"),
# "Known for his excellent caricatures, as well as independent and incisive imagery. "
# ("Andy Singer", "http://comics.com/andy_singer"),
# ("Bill Day", "http://comics.com/bill_day"),
# "Powerful images on sensitive issues."
# ("Bill Schorr", "http://comics.com/bill_schorr"),
# ("Bob Englehart", "http://comics.com/bob_englehart"),
# ("Brian Fairrington", "http://comics.com/brian_fairrington"),
# ("Bruce Beattie", "http://comics.com/bruce_beattie"),
# ("Cam Cardow", "http://comics.com/cam_cardow"),
# ("Chip Bok", "http://comics.com/chip_bok"),
# ("Chris Britt", "http://comics.com/chris_britt"),
# ("Chuck Asay", "http://comics.com/chuck_asay"),
# ("Clay Bennett", "http://comics.com/clay_bennett"),
# ("Daryl Cagle", "http://comics.com/daryl_cagle"),
# ("David Fitzsimmons", "http://comics.com/david_fitzsimmons"),
# "David Fitzsimmons is a new editorial cartoons on comics.com. He is also a staff writer and editorial cartoonist for the Arizona Daily Star. "
# ("Drew Litton", "http://comics.com/drew_litton"),
# "Drew Litton is an artist who is probably best known for his sports cartoons. He received the National Cartoonist Society Sports Cartoon Award for 1993. "
# ("Ed Stein", "http://comics.com/ed_stein"),
# "Winner of the Fischetti Award in 2006 and the Scripps Howard National Journalism Award, 1999, Ed Stein has been the editorial cartoonist for the Rocky Mountain News since 1978. "
# ("Eric Allie", "http://comics.com/eric_allie"),
# "Eric Allie is an editorial cartoonist with the Pioneer Press and CNS News. "
# ("Gary Markstein", "http://comics.com/gary_markstein"),
# ("Gary McCoy", "http://comics.com/gary_mccoy"),
# "Gary McCoy is known for his editorial cartoons, humor and inane ramblings. He is a 2 time nominee for Best Magazine Cartoonist of the Year by the National Cartoonists Society. He resides in Belleville, IL. "
# ("Gary Varvel", "http://comics.com/gary_varvel"),
# ("Henry Payne", "http://comics.com/henry_payne"),
# ("JD Crowe", "http://comics.com/jd_crowe"),
# ("Jeff Parker", "http://comics.com/jeff_parker"),
# ("Jeff Stahler", "http://comics.com/jeff_stahler"),
# ("Jerry Holbert", "http://comics.com/jerry_holbert"),
# ("John Cole", "http://comics.com/john_cole"),
# ("John Darkow", "http://comics.com/john_darkow"),
# "John Darkow is a contributing editorial cartoonist for the Humor Times as well as editoiral cartoonist for the Columbia Daily Tribune, Missouri"
# ("John Sherffius", "http://comics.com/john_sherffius"),
# ("Larry Wright", "http://comics.com/larry_wright"),
# ("Lisa Benson", "http://comics.com/lisa_benson"),
# ("Marshall Ramsey", "http://comics.com/marshall_ramsey"),
# ("Matt Bors", "http://comics.com/matt_bors"),
# ("Michael Ramirez", "http://comics.com/michael_ramirez"),
# ("Mike Keefe", "http://comics.com/mike_keefe"),
# ("Mike Luckovich", "http://comics.com/mike_luckovich"),
# ("MIke Thompson", "http://comics.com/mike_thompson"),
# ("Monte Wolverton", "http://comics.com/monte_wolverton"),
# "Unique mix of perspectives"
# ("Mr. Fish", "http://comics.com/mr_fish"),
# "Side effects may include swelling"
# ("Nate Beeler", "http://comics.com/nate_beeler"),
# "Middle America meets the Beltway."
# ("Nick Anderson", "http://comics.com/nick_anderson"),
# ("Pat Bagley", "http://comics.com/pat_bagley"),
# "Unfair and Totally Unbalanced."
# ("Paul Szep", "http://comics.com/paul_szep"),
# ("RJ Matson", "http://comics.com/rj_matson"),
# "Power cartoons from NYC and Capitol Hill"
# ("Rob Rogers", "http://comics.com/rob_rogers"),
# "Humorous slant on current events"
# ("Robert Ariail", "http://comics.com/robert_ariail"),
# "Clever and unpredictable"
# ("Scott Stantis", "http://comics.com/scott_stantis"),
# ("Signe Wilkinson", "http://comics.com/signe_wilkinson"),
# ("Steve Benson", "http://comics.com/steve_benson"),
# ("Steve Breen", "http://comics.com/steve_breen"),
# ("Steve Kelley", "http://comics.com/steve_kelley"),
# ("Steve Sack", "http://comics.com/steve_sack"),
]:
articles = self.make_links(url)
if articles:
feeds.append((title, articles))
return feeds
def make_links(self, url):
soup = self.index_to_soup(url)
# print 'soup: ', soup
title = ''
current_articles = []
pages = range(1, self.num_comics_to_get+1)
for page in pages:
page_url = url + '/?Page=' + str(page)
soup = self.index_to_soup(page_url)
if soup:
strip_tag = soup.find('a', attrs={'class': 'STR_StripImage'})
if strip_tag:
print 'strip_tag: ', strip_tag
title = strip_tag['title']
print 'title: ', title
current_articles.append({'title': title, 'url': page_url, 'description':'', 'date':''})
current_articles.reverse()
return current_articles
extra_css = '''
h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
'''

@ -1,4 +1,5 @@
#!/usr/bin/env python
# vim:fileencoding=UTF-8
__license__ = 'GPL v3'
__author__ = 'Mori'

@ -1,32 +1,37 @@
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from calibre.web.feeds.news import BasicNewsRecipe from calibre.web.feeds.news import BasicNewsRecipe
import re import re
class NetMagazineRecipe (BasicNewsRecipe): class dotnetMagazine (BasicNewsRecipe):
__author__ = u'Marc Busqué <marc@lamarciana.com>' __author__ = u'Bonni Salles'
__url__ = 'http://www.lamarciana.com' __version__ = '1.0'
__version__ = '1.0' __license__ = 'GPL v3'
__license__ = 'GPL v3' __copyright__ = u'2013, Bonni Salles'
__copyright__ = u'2012, Marc Busqué <marc@lamarciana.com>' title = '.net magazine'
title = u'.net magazine' oldest_article = 7
description = u'net is the worlds best-selling magazine for web designers and developers, featuring tutorials from leading agencies, interviews with the webs biggest names, and agenda-setting features on the hottest issues affecting the internet today.' no_stylesheets = True
language = 'en' encoding = 'utf8'
tags = 'web development, software' use_embedded_content = False
oldest_article = 7 language = 'en'
remove_empty_feeds = True remove_empty_feeds = True
no_stylesheets = True extra_css = ' body{font-family: Arial,Helvetica,sans-serif } img{margin-bottom: 0.4em} '
cover_url = u'http://media.netmagazine.futurecdn.net/sites/all/themes/netmag/logo.png' cover_url = u'http://media.netmagazine.futurecdn.net/sites/all/themes/netmag/logo.png'
keep_only_tags = [
dict(name='article', attrs={'class': re.compile('^node.*$', re.IGNORECASE)}) remove_tags_after = dict(name='footer', id=lambda x:not x)
] remove_tags_before = dict(name='header', id=lambda x:not x)
remove_tags = [
dict(name='span', attrs={'class': 'comment-count'}), remove_tags = [
dict(name='div', attrs={'class': 'item-list share-links'}), dict(name='div', attrs={'class': 'item-list'}),
dict(name='footer'), dict(name='h4', attrs={'class': 'std-hdr'}),
] dict(name='div', attrs={'class': 'item-list share-links'}), #removes share links
remove_attributes = ['border', 'cellspacing', 'align', 'cellpadding', 'colspan', 'valign', 'vspace', 'hspace', 'alt', 'width', 'height', 'style'] dict(name=['script', 'noscript']),
extra_css = 'img {max-width: 100%; display: block; margin: auto;} .captioned-image div {text-align: center; font-style: italic;}' dict(name='div', attrs={'id': 'comments-form'}), #comment these out if you want the comments to show
dict(name='div', attrs={'id': re.compile('advertorial_block_($|| )')}),
dict(name='div', attrs={'id': 'right-col'}),
dict(name='div', attrs={'id': 'comments'}), #comment these out if you want the comments to show
dict(name='div', attrs={'class': 'item-list related-content'}),
feeds = [
(u'.net', u'http://feeds.feedburner.com/net/topstories'),
] ]
feeds = [
(u'net', u'http://feeds.feedburner.com/net/topstories')
]

@ -1,229 +1,443 @@
__license__ = 'GPL v3'
__copyright__ = 'Copyright 2010 Starson17'
'''
www.gocomics.com
'''
from calibre.web.feeds.news import BasicNewsRecipe from calibre.web.feeds.news import BasicNewsRecipe
import re
class GoComics(BasicNewsRecipe):
class Comics(BasicNewsRecipe): title = 'Go Comics'
title = 'Comics.com'
__author__ = 'Starson17' __author__ = 'Starson17'
description = 'Comics from comics.com. You should customize this recipe to fetch only the comics you are interested in' __version__ = '1.06'
__date__ = '07 June 2011'
description = u'200+ Comics - Customize for more days/comics: Defaults to 7 days, 25 comics - 20 general, 5 editorial.'
category = 'news, comics'
language = 'en' language = 'en'
use_embedded_content= False use_embedded_content= False
no_stylesheets = True no_stylesheets = True
oldest_article = 24
remove_javascript = True remove_javascript = True
cover_url = 'http://www.bsb.lib.tx.us/images/comics.com.gif' remove_attributes = ['style']
recursions = 0
max_articles_per_feed = 10
num_comics_to_get = 7
simultaneous_downloads = 1
# delay = 3
keep_only_tags = [dict(name='h1'), ####### USER PREFERENCES - COMICS, IMAGE SIZE AND NUMBER OF COMICS TO RETRIEVE ########
dict(name='p', attrs={'class':'feature_item'}) # num_comics_to_get - I've tried up to 99 on Calvin&Hobbes
num_comics_to_get = 1
# comic_size 300 is small, 600 is medium, 900 is large, 1500 is extra-large
comic_size = 900
# CHOOSE COMIC STRIPS BELOW - REMOVE COMMENT '# ' FROM IN FRONT OF DESIRED STRIPS
# Please do not overload their servers by selecting all comics and 1000 strips from each!
conversion_options = {'linearize_tables' : True
, 'comment' : description
, 'tags' : category
, 'language' : language
}
keep_only_tags = [dict(name='div', attrs={'class':['feature','banner']}),
] ]
remove_tags = [dict(name='a', attrs={'class':['beginning','prev','cal','next','newest']}),
dict(name='div', attrs={'class':['tag-wrapper']}),
dict(name='a', attrs={'href':re.compile(r'.*mutable_[0-9]+', re.IGNORECASE)}),
dict(name='img', attrs={'src':re.compile(r'.*mutable_[0-9]+', re.IGNORECASE)}),
dict(name='ul', attrs={'class':['share-nav','feature-nav']}),
]
def get_browser(self):
br = BasicNewsRecipe.get_browser(self)
br.addheaders = [('Referer','http://www.gocomics.com/')]
return br
def parse_index(self): def parse_index(self):
feeds = [] feeds = []
for title, url in [ for title, url in [
("9 Chickweed Lane", "http://gocomics.com/9_chickweed_lane"), #(u"2 Cows and a Chicken", u"http://www.gocomics.com/2cowsandachicken"),
("Agnes", "http://gocomics.com/agnes"), #(u"9 Chickweed Lane", u"http://www.gocomics.com/9chickweedlane"),
("Alley Oop", "http://gocomics.com/alley_oop"), #(u"Adam At Home", u"http://www.gocomics.com/adamathome"),
("Andy Capp", "http://gocomics.com/andy_capp"), #(u"Agnes", u"http://www.gocomics.com/agnes"),
("Arlo & Janis", "http://gocomics.com/arlo&janis"), #(u"Alley Oop", u"http://www.gocomics.com/alleyoop"),
("B.C.", "http://gocomics.com/bc"), #(u"Andy Capp", u"http://www.gocomics.com/andycapp"),
("Ballard Street", "http://gocomics.com/ballard_street"), (u"Animal Crackers", u"http://www.gocomics.com/animalcrackers"),
# ("Ben", "http://comics.com/ben"), #(u"Annie", u"http://www.gocomics.com/annie"),
# ("Betty", "http://comics.com/betty"), #(u"Arlo & Janis", u"http://www.gocomics.com/arloandjanis"),
# ("Big Nate", "http://comics.com/big_nate"), #(u"Ask Shagg", u"http://www.gocomics.com/askshagg"),
# ("Brevity", "http://comics.com/brevity"), (u"B.C.", u"http://www.gocomics.com/bc"),
# ("Candorville", "http://comics.com/candorville"), #(u"Back in the Day", u"http://www.gocomics.com/backintheday"),
# ("Cheap Thrills", "http://comics.com/cheap_thrills"), #(u"Bad Reporter", u"http://www.gocomics.com/badreporter"),
# ("Committed", "http://comics.com/committed"), (u"Baldo", u"http://www.gocomics.com/baldo"),
# ("Cow & Boy", "http://comics.com/cow&boy"), #(u"Ballard Street", u"http://www.gocomics.com/ballardstreet"),
# ("Daddy's Home", "http://comics.com/daddys_home"), #(u"Barkeater Lake", u"http://www.gocomics.com/barkeaterlake"),
# ("Dog eat Doug", "http://comics.com/dog_eat_doug"), #(u"Basic Instructions", u"http://www.gocomics.com/basicinstructions"),
# ("Drabble", "http://comics.com/drabble"), #(u"Ben", u"http://www.gocomics.com/ben"),
# ("F Minus", "http://comics.com/f_minus"), #(u"Betty", u"http://www.gocomics.com/betty"),
# ("Family Tree", "http://comics.com/family_tree"), #(u"Bewley", u"http://www.gocomics.com/bewley"),
# ("Farcus", "http://comics.com/farcus"), #(u"Big Nate", u"http://www.gocomics.com/bignate"),
# ("Fat Cats Classics", "http://comics.com/fat_cats_classics"), #(u"Big Top", u"http://www.gocomics.com/bigtop"),
# ("Ferd'nand", "http://comics.com/ferdnand"), #(u"Biographic", u"http://www.gocomics.com/biographic"),
# ("Flight Deck", "http://comics.com/flight_deck"), #(u"Birdbrains", u"http://www.gocomics.com/birdbrains"),
# ("Flo & Friends", "http://comics.com/flo&friends"), #(u"Bleeker: The Rechargeable Dog", u"http://www.gocomics.com/bleeker"),
# ("Fort Knox", "http://comics.com/fort_knox"), #(u"Bliss", u"http://www.gocomics.com/bliss"),
# ("Frank & Ernest", "http://comics.com/frank&ernest"), #(u"Bloom County", u"http://www.gocomics.com/bloomcounty"),
# ("Frazz", "http://comics.com/frazz"), #(u"Bo Nanas", u"http://www.gocomics.com/bonanas"),
# ("Free Range", "http://comics.com/free_range"), #(u"Bob the Squirrel", u"http://www.gocomics.com/bobthesquirrel"),
# ("Geech Classics", "http://comics.com/geech_classics"), #(u"Boomerangs", u"http://www.gocomics.com/boomerangs"),
# ("Get Fuzzy", "http://comics.com/get_fuzzy"), #(u"Bottomliners", u"http://www.gocomics.com/bottomliners"),
# ("Girls & Sports", "http://comics.com/girls&sports"), (u"Bound and Gagged", u"http://www.gocomics.com/boundandgagged"),
# ("Graffiti", "http://comics.com/graffiti"), #(u"Brainwaves", u"http://www.gocomics.com/brainwaves"),
# ("Grand Avenue", "http://comics.com/grand_avenue"), #(u"Brenda Starr", u"http://www.gocomics.com/brendastarr"),
# ("Heathcliff", "http://comics.com/heathcliff"), #(u"Brevity", u"http://www.gocomics.com/brevity"),
# "Heathcliff, a street-smart and mischievous cat with many adventures." #(u"Brewster Rockit", u"http://www.gocomics.com/brewsterrockit"),
# ("Herb and Jamaal", "http://comics.com/herb_and_jamaal"), (u"Broom Hilda", u"http://www.gocomics.com/broomhilda"),
# ("Herman", "http://comics.com/herman"), (u"Calvin and Hobbes", u"http://www.gocomics.com/calvinandhobbes"),
# ("Home and Away", "http://comics.com/home_and_away"), #(u"Candorville", u"http://www.gocomics.com/candorville"),
# ("It's All About You", "http://comics.com/its_all_about_you"), #(u"Cathy", u"http://www.gocomics.com/cathy"),
# ("Jane's World", "http://comics.com/janes_world"), #(u"C'est la Vie", u"http://www.gocomics.com/cestlavie"),
# ("Jump Start", "http://comics.com/jump_start"), #(u"Cheap Thrills", u"http://www.gocomics.com/cheapthrills"),
# ("Kit 'N' Carlyle", "http://comics.com/kit_n_carlyle"), #(u"Chuckle Bros", u"http://www.gocomics.com/chucklebros"),
# ("Li'l Abner Classics", "http://comics.com/lil_abner_classics"), #(u"Citizen Dog", u"http://www.gocomics.com/citizendog"),
# ("Liberty Meadows", "http://comics.com/liberty_meadows"), #(u"Cleats", u"http://www.gocomics.com/cleats"),
# ("Little Dog Lost", "http://comics.com/little_dog_lost"), #(u"Close to Home", u"http://www.gocomics.com/closetohome"),
# ("Lola", "http://comics.com/lola"), #(u"Committed", u"http://www.gocomics.com/committed"),
# ("Luann", "http://comics.com/luann"), #(u"Compu-toon", u"http://www.gocomics.com/compu-toon"),
# ("Marmaduke", "http://comics.com/marmaduke"), #(u"Cornered", u"http://www.gocomics.com/cornered"),
# ("Meg! Classics", "http://comics.com/meg_classics"), #(u"Cow & Boy", u"http://www.gocomics.com/cow&boy"),
# ("Minimum Security", "http://comics.com/minimum_security"), #(u"Cul de Sac", u"http://www.gocomics.com/culdesac"),
# ("Moderately Confused", "http://comics.com/moderately_confused"), #(u"Daddy's Home", u"http://www.gocomics.com/daddyshome"),
# ("Momma", "http://comics.com/momma"), #(u"Deep Cover", u"http://www.gocomics.com/deepcover"),
# ("Monty", "http://comics.com/monty"), #(u"Dick Tracy", u"http://www.gocomics.com/dicktracy"),
# ("Motley Classics", "http://comics.com/motley_classics"), #(u"Dog Eat Doug", u"http://www.gocomics.com/dogeatdoug"),
# ("Nancy", "http://comics.com/nancy"), #(u"Domestic Abuse", u"http://www.gocomics.com/domesticabuse"),
# ("Natural Selection", "http://comics.com/natural_selection"), #(u"Doodles", u"http://www.gocomics.com/doodles"),
# ("Nest Heads", "http://comics.com/nest_heads"), #(u"Doonesbury", u"http://www.gocomics.com/doonesbury"),
# ("Off The Mark", "http://comics.com/off_the_mark"), #(u"Drabble", u"http://www.gocomics.com/drabble"),
# ("On a Claire Day", "http://comics.com/on_a_claire_day"), #(u"Eek!", u"http://www.gocomics.com/eek"),
# ("One Big Happy Classics", "http://comics.com/one_big_happy_classics"), #(u"F Minus", u"http://www.gocomics.com/fminus"),
# ("Over the Hedge", "http://comics.com/over_the_hedge"), #(u"Family Tree", u"http://www.gocomics.com/familytree"),
# ("PC and Pixel", "http://comics.com/pc_and_pixel"), #(u"Farcus", u"http://www.gocomics.com/farcus"),
# ("Peanuts", "http://comics.com/peanuts"), #(u"Fat Cats Classics", u"http://www.gocomics.com/fatcatsclassics"),
# ("Pearls Before Swine", "http://comics.com/pearls_before_swine"), #(u"Ferd'nand", u"http://www.gocomics.com/ferdnand"),
# ("Pickles", "http://comics.com/pickles"), #(u"Flight Deck", u"http://www.gocomics.com/flightdeck"),
# ("Prickly City", "http://comics.com/prickly_city"), #(u"Flo and Friends", u"http://www.gocomics.com/floandfriends"),
# ("Raising Duncan Classics", "http://comics.com/raising_duncan_classics"), (u"For Better or For Worse", u"http://www.gocomics.com/forbetterorforworse"),
# ("Reality Check", "http://comics.com/reality_check"), #(u"For Heaven's Sake", u"http://www.gocomics.com/forheavenssake"),
# ("Red & Rover", "http://comics.com/red&rover"), #(u"Fort Knox", u"http://www.gocomics.com/fortknox"),
# ("Rip Haywire", "http://comics.com/rip_haywire"), #(u"FoxTrot Classics", u"http://www.gocomics.com/foxtrotclassics"),
# ("Ripley's Believe It or Not!", "http://comics.com/ripleys_believe_it_or_not"), #(u"FoxTrot", u"http://www.gocomics.com/foxtrot"),
# ("Rose Is Rose", "http://comics.com/rose_is_rose"), (u"Frank & Ernest", u"http://www.gocomics.com/frankandernest"),
# ("Rubes", "http://comics.com/rubes"), #(u"Frazz", u"http://www.gocomics.com/frazz"),
# ("Rudy Park", "http://comics.com/rudy_park"), #(u"Fred Basset", u"http://www.gocomics.com/fredbasset"),
# ("Scary Gary", "http://comics.com/scary_gary"), #(u"Free Range", u"http://www.gocomics.com/freerange"),
# ("Shirley and Son Classics", "http://comics.com/shirley_and_son_classics"), #(u"Frog Applause", u"http://www.gocomics.com/frogapplause"),
# ("Soup To Nutz", "http://comics.com/soup_to_nutz"), #(u"Garfield Minus Garfield", u"http://www.gocomics.com/garfieldminusgarfield"),
# ("Speed Bump", "http://comics.com/speed_bump"), (u"Garfield", u"http://www.gocomics.com/garfield"),
# ("Spot The Frog", "http://comics.com/spot_the_frog"), #(u"Gasoline Alley", u"http://www.gocomics.com/gasolinealley"),
# ("State of the Union", "http://comics.com/state_of_the_union"), #(u"Geech Classics", u"http://www.gocomics.com/geechclassics"),
# ("Strange Brew", "http://comics.com/strange_brew"), (u"Get Fuzzy", u"http://www.gocomics.com/getfuzzy"),
# ("Tarzan Classics", "http://comics.com/tarzan_classics"), #(u"Gil Thorp", u"http://www.gocomics.com/gilthorp"),
# ("That's Life", "http://comics.com/thats_life"), #(u"Ginger Meggs", u"http://www.gocomics.com/gingermeggs"),
# ("The Barn", "http://comics.com/the_barn"), #(u"Girls & Sports", u"http://www.gocomics.com/girlsandsports"),
# ("The Born Loser", "http://comics.com/the_born_loser"), #(u"Graffiti", u"http://www.gocomics.com/graffiti"),
# ("The Buckets", "http://comics.com/the_buckets"), #(u"Grand Avenue", u"http://www.gocomics.com/grandavenue"),
# ("The Dinette Set", "http://comics.com/the_dinette_set"), #(u"Haiku Ewe", u"http://www.gocomics.com/haikuewe"),
# ("The Grizzwells", "http://comics.com/the_grizzwells"), #(u"Heart of the City", u"http://www.gocomics.com/heartofthecity"),
# ("The Humble Stumble", "http://comics.com/the_humble_stumble"), #(u"Herb and Jamaal", u"http://www.gocomics.com/herbandjamaal"),
# ("The Knight Life", "http://comics.com/the_knight_life"), #(u"Home and Away", u"http://www.gocomics.com/homeandaway"),
# ("The Meaning of Lila", "http://comics.com/the_meaning_of_lila"), #(u"Housebroken", u"http://www.gocomics.com/housebroken"),
# ("The Other Coast", "http://comics.com/the_other_coast"), #(u"Hubert and Abby", u"http://www.gocomics.com/hubertandabby"),
# ("The Sunshine Club", "http://comics.com/the_sunshine_club"), #(u"Imagine This", u"http://www.gocomics.com/imaginethis"),
# ("Unstrange Phenomena", "http://comics.com/unstrange_phenomena"), #(u"In the Bleachers", u"http://www.gocomics.com/inthebleachers"),
# ("Watch Your Head", "http://comics.com/watch_your_head"), #(u"In the Sticks", u"http://www.gocomics.com/inthesticks"),
# ("Wizard of Id", "http://comics.com/wizard_of_id"), #(u"Ink Pen", u"http://www.gocomics.com/inkpen"),
# ("Working Daze", "http://comics.com/working_daze"), #(u"It's All About You", u"http://www.gocomics.com/itsallaboutyou"),
# ("Working It Out", "http://comics.com/working_it_out"), #(u"Jane's World", u"http://www.gocomics.com/janesworld"),
# ("Zack Hill", "http://comics.com/zack_hill"), #(u"Joe Vanilla", u"http://www.gocomics.com/joevanilla"),
# ("(Th)ink", "http://comics.com/think"), #(u"Jump Start", u"http://www.gocomics.com/jumpstart"),
# "Tackling the political and social issues impacting communities of color." #(u"Kit 'N' Carlyle", u"http://www.gocomics.com/kitandcarlyle"),
# ("Adam Zyglis", "http://comics.com/adam_zyglis"), #(u"La Cucaracha", u"http://www.gocomics.com/lacucaracha"),
# "Known for his excellent caricatures, as well as independent and incisive imagery. " #(u"Last Kiss", u"http://www.gocomics.com/lastkiss"),
# ("Andy Singer", "http://comics.com/andy_singer"), #(u"Legend of Bill", u"http://www.gocomics.com/legendofbill"),
# ("Bill Day", "http://comics.com/bill_day"), #(u"Liberty Meadows", u"http://www.gocomics.com/libertymeadows"),
# "Powerful images on sensitive issues." #(u"Li'l Abner Classics", u"http://www.gocomics.com/lilabnerclassics"),
# ("Bill Schorr", "http://comics.com/bill_schorr"), #(u"Lio", u"http://www.gocomics.com/lio"),
# ("Bob Englehart", "http://comics.com/bob_englehart"), #(u"Little Dog Lost", u"http://www.gocomics.com/littledoglost"),
# ("Brian Fairrington", "http://comics.com/brian_fairrington"), #(u"Little Otto", u"http://www.gocomics.com/littleotto"),
# ("Bruce Beattie", "http://comics.com/bruce_beattie"), #(u"Lola", u"http://www.gocomics.com/lola"),
# ("Cam Cardow", "http://comics.com/cam_cardow"), #(u"Love Is...", u"http://www.gocomics.com/loveis"),
# ("Chip Bok", "http://comics.com/chip_bok"), (u"Luann", u"http://www.gocomics.com/luann"),
# ("Chris Britt", "http://comics.com/chris_britt"), #(u"Maintaining", u"http://www.gocomics.com/maintaining"),
# ("Chuck Asay", "http://comics.com/chuck_asay"), #(u"Meg! Classics", u"http://www.gocomics.com/megclassics"),
# ("Clay Bennett", "http://comics.com/clay_bennett"), #(u"Middle-Aged White Guy", u"http://www.gocomics.com/middleagedwhiteguy"),
# ("Daryl Cagle", "http://comics.com/daryl_cagle"), #(u"Minimum Security", u"http://www.gocomics.com/minimumsecurity"),
# ("David Fitzsimmons", "http://comics.com/david_fitzsimmons"), #(u"Moderately Confused", u"http://www.gocomics.com/moderatelyconfused"),
# "David Fitzsimmons is a new editorial cartoons on comics.com. He is also a staff writer and editorial cartoonist for the Arizona Daily Star. " (u"Momma", u"http://www.gocomics.com/momma"),
# ("Drew Litton", "http://comics.com/drew_litton"), #(u"Monty", u"http://www.gocomics.com/monty"),
# "Drew Litton is an artist who is probably best known for his sports cartoons. He received the National Cartoonist Society Sports Cartoon Award for 1993. " #(u"Motley Classics", u"http://www.gocomics.com/motleyclassics"),
# ("Ed Stein", "http://comics.com/ed_stein"), #(u"Mutt & Jeff", u"http://www.gocomics.com/muttandjeff"),
# "Winner of the Fischetti Award in 2006 and the Scripps Howard National Journalism Award, 1999, Ed Stein has been the editorial cartoonist for the Rocky Mountain News since 1978. " #(u"Mythtickle", u"http://www.gocomics.com/mythtickle"),
# ("Eric Allie", "http://comics.com/eric_allie"), #(u"Nancy", u"http://www.gocomics.com/nancy"),
# "Eric Allie is an editorial cartoonist with the Pioneer Press and CNS News. " #(u"Natural Selection", u"http://www.gocomics.com/naturalselection"),
# ("Gary Markstein", "http://comics.com/gary_markstein"), #(u"Nest Heads", u"http://www.gocomics.com/nestheads"),
# ("Gary McCoy", "http://comics.com/gary_mccoy"), #(u"NEUROTICA", u"http://www.gocomics.com/neurotica"),
# "Gary McCoy is known for his editorial cartoons, humor and inane ramblings. He is a 2 time nominee for Best Magazine Cartoonist of the Year by the National Cartoonists Society. He resides in Belleville, IL. " #(u"New Adventures of Queen Victoria", u"http://www.gocomics.com/thenewadventuresofqueenvictoria"),
# ("Gary Varvel", "http://comics.com/gary_varvel"), (u"Non Sequitur", u"http://www.gocomics.com/nonsequitur"),
# ("Henry Payne", "http://comics.com/henry_payne"), #(u"Off The Mark", u"http://www.gocomics.com/offthemark"),
# ("JD Crowe", "http://comics.com/jd_crowe"), #(u"On A Claire Day", u"http://www.gocomics.com/onaclaireday"),
# ("Jeff Parker", "http://comics.com/jeff_parker"), #(u"One Big Happy Classics", u"http://www.gocomics.com/onebighappyclassics"),
# ("Jeff Stahler", "http://comics.com/jeff_stahler"), #(u"One Big Happy", u"http://www.gocomics.com/onebighappy"),
# ("Jerry Holbert", "http://comics.com/jerry_holbert"), #(u"Out of the Gene Pool Re-Runs", u"http://www.gocomics.com/outofthegenepool"),
# ("John Cole", "http://comics.com/john_cole"), #(u"Over the Hedge", u"http://www.gocomics.com/overthehedge"),
# ("John Darkow", "http://comics.com/john_darkow"), #(u"Overboard", u"http://www.gocomics.com/overboard"),
# "John Darkow is a contributing editorial cartoonist for the Humor Times as well as editorial cartoonist for the Columbia Daily Tribune, Missouri" #(u"PC and Pixel", u"http://www.gocomics.com/pcandpixel"),
# ("John Sherffius", "http://comics.com/john_sherffius"), (u"Peanuts", u"http://www.gocomics.com/peanuts"),
# ("Larry Wright", "http://comics.com/larry_wright"), (u"Pearls Before Swine", u"http://www.gocomics.com/pearlsbeforeswine"),
# ("Lisa Benson", "http://comics.com/lisa_benson"), #(u"Pibgorn Sketches", u"http://www.gocomics.com/pibgornsketches"),
# ("Marshall Ramsey", "http://comics.com/marshall_ramsey"), #(u"Pibgorn", u"http://www.gocomics.com/pibgorn"),
# ("Matt Bors", "http://comics.com/matt_bors"), #(u"Pickles", u"http://www.gocomics.com/pickles"),
# ("Michael Ramirez", "http://comics.com/michael_ramirez"), #(u"Pinkerton", u"http://www.gocomics.com/pinkerton"),
# ("Mike Keefe", "http://comics.com/mike_keefe"), #(u"Pluggers", u"http://www.gocomics.com/pluggers"),
# ("Mike Luckovich", "http://comics.com/mike_luckovich"), (u"Pooch Cafe", u"http://www.gocomics.com/poochcafe"),
# ("Mike Thompson", "http://comics.com/mike_thompson"), #(u"PreTeena", u"http://www.gocomics.com/preteena"),
# ("Monte Wolverton", "http://comics.com/monte_wolverton"), #(u"Prickly City", u"http://www.gocomics.com/pricklycity"),
# "Unique mix of perspectives" #(u"Rabbits Against Magic", u"http://www.gocomics.com/rabbitsagainstmagic"),
# ("Mr. Fish", "http://comics.com/mr_fish"), #(u"Raising Duncan Classics", u"http://www.gocomics.com/raisingduncanclassics"),
# "Side effects may include swelling" #(u"Real Life Adventures", u"http://www.gocomics.com/reallifeadventures"),
# ("Nate Beeler", "http://comics.com/nate_beeler"), #(u"Reality Check", u"http://www.gocomics.com/realitycheck"),
# "Middle America meets the Beltway." #(u"Red and Rover", u"http://www.gocomics.com/redandrover"),
# ("Nick Anderson", "http://comics.com/nick_anderson"), #(u"Red Meat", u"http://www.gocomics.com/redmeat"),
# ("Pat Bagley", "http://comics.com/pat_bagley"), #(u"Reynolds Unwrapped", u"http://www.gocomics.com/reynoldsunwrapped"),
# "Unfair and Totally Unbalanced." #(u"Rip Haywire", u"http://www.gocomics.com/riphaywire"),
# ("Paul Szep", "http://comics.com/paul_szep"), #(u"Ronaldinho Gaucho", u"http://www.gocomics.com/ronaldinhogaucho"),
# ("RJ Matson", "http://comics.com/rj_matson"), (u"Rose Is Rose", u"http://www.gocomics.com/roseisrose"),
# "Power cartoons from NYC and Capitol Hill" #(u"Rudy Park", u"http://www.gocomics.com/rudypark"),
# ("Rob Rogers", "http://comics.com/rob_rogers"), #(u"Scary Gary", u"http://www.gocomics.com/scarygary"),
# "Humorous slant on current events" #(u"Shirley and Son Classics", u"http://www.gocomics.com/shirleyandsonclassics"),
# ("Robert Ariail", "http://comics.com/robert_ariail"), (u"Shoe", u"http://www.gocomics.com/shoe"),
# "Clever and unpredictable" #(u"Shoecabbage", u"http://www.gocomics.com/shoecabbage"),
# ("Scott Stantis", "http://comics.com/scott_stantis"), #(u"Skin Horse", u"http://www.gocomics.com/skinhorse"),
# ("Signe Wilkinson", "http://comics.com/signe_wilkinson"), #(u"Slowpoke", u"http://www.gocomics.com/slowpoke"),
# ("Steve Benson", "http://comics.com/steve_benson"), #(u"Soup To Nutz", u"http://www.gocomics.com/souptonutz"),
# ("Steve Breen", "http://comics.com/steve_breen"), #(u"Spot The Frog", u"http://www.gocomics.com/spotthefrog"),
# ("Steve Kelley", "http://comics.com/steve_kelley"), #(u"State of the Union", u"http://www.gocomics.com/stateoftheunion"),
# ("Steve Sack", "http://comics.com/steve_sack"), #(u"Stone Soup", u"http://www.gocomics.com/stonesoup"),
]: #(u"Sylvia", u"http://www.gocomics.com/sylvia"),
#(u"Tank McNamara", u"http://www.gocomics.com/tankmcnamara"),
#(u"Tarzan Classics", u"http://www.gocomics.com/tarzanclassics"),
#(u"That's Life", u"http://www.gocomics.com/thatslife"),
#(u"The Academia Waltz", u"http://www.gocomics.com/academiawaltz"),
#(u"The Barn", u"http://www.gocomics.com/thebarn"),
#(u"The Boiling Point", u"http://www.gocomics.com/theboilingpoint"),
#(u"The Boondocks", u"http://www.gocomics.com/boondocks"),
(u"The Born Loser", u"http://www.gocomics.com/thebornloser"),
#(u"The Buckets", u"http://www.gocomics.com/thebuckets"),
#(u"The City", u"http://www.gocomics.com/thecity"),
#(u"The Dinette Set", u"http://www.gocomics.com/dinetteset"),
#(u"The Doozies", u"http://www.gocomics.com/thedoozies"),
#(u"The Duplex", u"http://www.gocomics.com/duplex"),
#(u"The Elderberries", u"http://www.gocomics.com/theelderberries"),
#(u"The Flying McCoys", u"http://www.gocomics.com/theflyingmccoys"),
#(u"The Fusco Brothers", u"http://www.gocomics.com/thefuscobrothers"),
#(u"The Grizzwells", u"http://www.gocomics.com/thegrizzwells"),
#(u"The Humble Stumble", u"http://www.gocomics.com/thehumblestumble"),
#(u"The Knight Life", u"http://www.gocomics.com/theknightlife"),
#(u"The Meaning of Lila", u"http://www.gocomics.com/meaningoflila"),
(u"The Middletons", u"http://www.gocomics.com/themiddletons"),
#(u"The Norm", u"http://www.gocomics.com/thenorm"),
#(u"The Other Coast", u"http://www.gocomics.com/theothercoast"),
#(u"The Quigmans", u"http://www.gocomics.com/thequigmans"),
#(u"The Sunshine Club", u"http://www.gocomics.com/thesunshineclub"),
#(u"Tiny Sepuk", u"http://www.gocomics.com/tinysepuk"),
#(u"TOBY", u"http://www.gocomics.com/toby"),
#(u"Tom the Dancing Bug", u"http://www.gocomics.com/tomthedancingbug"),
#(u"Too Much Coffee Man", u"http://www.gocomics.com/toomuchcoffeeman"),
#(u"Unstrange Phenomena", u"http://www.gocomics.com/unstrangephenomena"),
#(u"W.T. Duck", u"http://www.gocomics.com/wtduck"),
#(u"Watch Your Head", u"http://www.gocomics.com/watchyourhead"),
#(u"Wee Pals", u"http://www.gocomics.com/weepals"),
#(u"Winnie the Pooh", u"http://www.gocomics.com/winniethepooh"),
(u"Wizard of Id", u"http://www.gocomics.com/wizardofid"),
#(u"Working Daze", u"http://www.gocomics.com/workingdaze"),
#(u"Working It Out", u"http://www.gocomics.com/workingitout"),
#(u"Yenny", u"http://www.gocomics.com/yenny"),
#(u"Zack Hill", u"http://www.gocomics.com/zackhill"),
#(u"Ziggy", u"http://www.gocomics.com/ziggy"),
(u"9 to 5", u"http://www.gocomics.com/9to5"),
(u"Heathcliff", u"http://www.gocomics.com/heathcliff"),
(u"Herman", u"http://www.gocomics.com/herman"),
(u"Loose Parts", u"http://www.gocomics.com/looseparts"),
(u"Marmaduke", u"http://www.gocomics.com/marmaduke"),
(u"Ripley's Believe It or Not!", u"http://www.gocomics.com/ripleysbelieveitornot"),
(u"Rubes", u"http://www.gocomics.com/rubes"),
(u"Speed Bump", u"http://www.gocomics.com/speedbump"),
(u"Strange Brew", u"http://www.gocomics.com/strangebrew"),
(u"The Argyle Sweater", u"http://www.gocomics.com/theargylesweater"),
#
######## EDITORIAL CARTOONS #####################
#(u"Adam Zyglis", u"http://www.gocomics.com/adamzyglis"),
#(u"Andy Singer", u"http://www.gocomics.com/andysinger"),
#(u"Ben Sargent",u"http://www.gocomics.com/bensargent"),
#(u"Bill Day", u"http://www.gocomics.com/billday"),
#(u"Bill Schorr", u"http://www.gocomics.com/billschorr"),
#(u"Bob Englehart", u"http://www.gocomics.com/bobenglehart"),
#(u"Bob Gorrell",u"http://www.gocomics.com/bobgorrell"),
#(u"Brian Fairrington", u"http://www.gocomics.com/brianfairrington"),
#(u"Bruce Beattie", u"http://www.gocomics.com/brucebeattie"),
#(u"Cam Cardow", u"http://www.gocomics.com/camcardow"),
#(u"Chan Lowe",u"http://www.gocomics.com/chanlowe"),
#(u"Chip Bok",u"http://www.gocomics.com/chipbok"),
#(u"Chris Britt",u"http://www.gocomics.com/chrisbritt"),
#(u"Chuck Asay",u"http://www.gocomics.com/chuckasay"),
#(u"Clay Bennett",u"http://www.gocomics.com/claybennett"),
#(u"Clay Jones",u"http://www.gocomics.com/clayjones"),
#(u"Dan Wasserman",u"http://www.gocomics.com/danwasserman"),
#(u"Dana Summers",u"http://www.gocomics.com/danasummers"),
#(u"Daryl Cagle", u"http://www.gocomics.com/darylcagle"),
#(u"David Fitzsimmons", u"http://www.gocomics.com/davidfitzsimmons"),
#(u"Dick Locher",u"http://www.gocomics.com/dicklocher"),
#(u"Don Wright",u"http://www.gocomics.com/donwright"),
#(u"Donna Barstow",u"http://www.gocomics.com/donnabarstow"),
#(u"Drew Litton", u"http://www.gocomics.com/drewlitton"),
#(u"Drew Sheneman",u"http://www.gocomics.com/drewsheneman"),
#(u"Ed Stein", u"http://www.gocomics.com/edstein"),
#(u"Eric Allie", u"http://www.gocomics.com/ericallie"),
#(u"Gary Markstein", u"http://www.gocomics.com/garymarkstein"),
#(u"Gary McCoy", u"http://www.gocomics.com/garymccoy"),
#(u"Gary Varvel", u"http://www.gocomics.com/garyvarvel"),
#(u"Glenn McCoy",u"http://www.gocomics.com/glennmccoy"),
#(u"Henry Payne", u"http://www.gocomics.com/henrypayne"),
#(u"Jack Ohman",u"http://www.gocomics.com/jackohman"),
#(u"JD Crowe", u"http://www.gocomics.com/jdcrowe"),
#(u"Jeff Danziger",u"http://www.gocomics.com/jeffdanziger"),
#(u"Jeff Parker", u"http://www.gocomics.com/jeffparker"),
#(u"Jeff Stahler", u"http://www.gocomics.com/jeffstahler"),
#(u"Jerry Holbert", u"http://www.gocomics.com/jerryholbert"),
#(u"Jim Morin",u"http://www.gocomics.com/jimmorin"),
#(u"Joel Pett",u"http://www.gocomics.com/joelpett"),
#(u"John Cole", u"http://www.gocomics.com/johncole"),
#(u"John Darkow", u"http://www.gocomics.com/johndarkow"),
#(u"John Deering",u"http://www.gocomics.com/johndeering"),
#(u"John Sherffius", u"http://www.gocomics.com/johnsherffius"),
#(u"Ken Catalino",u"http://www.gocomics.com/kencatalino"),
#(u"Kerry Waghorn",u"http://www.gocomics.com/facesinthenews"),
#(u"Kevin Kallaugher",u"http://www.gocomics.com/kevinkallaugher"),
#(u"Lalo Alcaraz",u"http://www.gocomics.com/laloalcaraz"),
#(u"Larry Wright", u"http://www.gocomics.com/larrywright"),
#(u"Lisa Benson", u"http://www.gocomics.com/lisabenson"),
#(u"Marshall Ramsey", u"http://www.gocomics.com/marshallramsey"),
#(u"Matt Bors", u"http://www.gocomics.com/mattbors"),
#(u"Matt Davies",u"http://www.gocomics.com/mattdavies"),
#(u"Michael Ramirez", u"http://www.gocomics.com/michaelramirez"),
#(u"Mike Keefe", u"http://www.gocomics.com/mikekeefe"),
#(u"Mike Luckovich", u"http://www.gocomics.com/mikeluckovich"),
#(u"Mike Thompson", u"http://www.gocomics.com/mikethompson"),
#(u"Monte Wolverton", u"http://www.gocomics.com/montewolverton"),
#(u"Mr. Fish", u"http://www.gocomics.com/mrfish"),
#(u"Nate Beeler", u"http://www.gocomics.com/natebeeler"),
#(u"Nick Anderson", u"http://www.gocomics.com/nickanderson"),
#(u"Pat Bagley", u"http://www.gocomics.com/patbagley"),
#(u"Pat Oliphant",u"http://www.gocomics.com/patoliphant"),
#(u"Paul Conrad",u"http://www.gocomics.com/paulconrad"),
#(u"Paul Szep", u"http://www.gocomics.com/paulszep"),
#(u"RJ Matson", u"http://www.gocomics.com/rjmatson"),
#(u"Rob Rogers", u"http://www.gocomics.com/robrogers"),
#(u"Robert Ariail", u"http://www.gocomics.com/robertariail"),
#(u"Scott Stantis", u"http://www.gocomics.com/scottstantis"),
#(u"Signe Wilkinson", u"http://www.gocomics.com/signewilkinson"),
#(u"Small World",u"http://www.gocomics.com/smallworld"),
#(u"Steve Benson", u"http://www.gocomics.com/stevebenson"),
#(u"Steve Breen", u"http://www.gocomics.com/stevebreen"),
#(u"Steve Kelley", u"http://www.gocomics.com/stevekelley"),
#(u"Steve Sack", u"http://www.gocomics.com/stevesack"),
#(u"Stuart Carlson",u"http://www.gocomics.com/stuartcarlson"),
#(u"Ted Rall",u"http://www.gocomics.com/tedrall"),
#(u"(Th)ink", u"http://www.gocomics.com/think"),
#(u"Tom Toles",u"http://www.gocomics.com/tomtoles"),
#(u"Tony Auth",u"http://www.gocomics.com/tonyauth"),
#(u"Views of the World",u"http://www.gocomics.com/viewsoftheworld"),
#(u"ViewsAfrica",u"http://www.gocomics.com/viewsafrica"),
#(u"ViewsAmerica",u"http://www.gocomics.com/viewsamerica"),
#(u"ViewsAsia",u"http://www.gocomics.com/viewsasia"),
#(u"ViewsBusiness",u"http://www.gocomics.com/viewsbusiness"),
#(u"ViewsEurope",u"http://www.gocomics.com/viewseurope"),
#(u"ViewsLatinAmerica",u"http://www.gocomics.com/viewslatinamerica"),
#(u"ViewsMidEast",u"http://www.gocomics.com/viewsmideast"),
#(u"Walt Handelsman",u"http://www.gocomics.com/walthandelsman"),
#(u"Wayne Stayskal",u"http://www.gocomics.com/waynestayskal"),
#(u"Wit of the World",u"http://www.gocomics.com/witoftheworld"),
]:
print 'Working on: ', title
articles = self.make_links(url) articles = self.make_links(url)
if articles: if articles:
feeds.append((title, articles)) feeds.append((title, articles))
return feeds return feeds
def make_links(self, url): def make_links(self, url):
soup = self.index_to_soup(url) title = 'Temp'
# print 'soup: ', soup
title = ''
current_articles = [] current_articles = []
from datetime import datetime, timedelta pages = range(1, self.num_comics_to_get+1)
now = datetime.now() for page in pages:
dates = [(now-timedelta(days=d)).strftime('%Y/%m/%d') for d in range(self.num_comics_to_get)] page_soup = self.index_to_soup(url)
if page_soup:
for page in dates: try:
page_url = url + '/' + str(page) strip_title = page_soup.find(name='div', attrs={'class':'top'}).h1.a.string
print(page_url) except:
soup = self.index_to_soup(page_url) strip_title = 'Error - no Title found'
if soup: try:
strip_tag = self.tag_to_string(soup.find('a')) date_title = page_soup.find('ul', attrs={'class': 'feature-nav'}).li.string
if strip_tag: if not date_title:
print 'strip_tag: ', strip_tag date_title = page_soup.find('ul', attrs={'class': 'feature-nav'}).li.string
title = strip_tag except:
print 'title: ', title date_title = 'Error - no Date found'
title = strip_title + ' - ' + date_title
for i in range(2):
try:
strip_url_date = page_soup.find(name='div', attrs={'class':'top'}).h1.a['href']
break # success - this is normal exit
except:
strip_url_date = None
continue # try to get strip_url_date again
for i in range(2):
try:
prev_strip_url_date = page_soup.find('a', attrs={'class': 'prev'})['href']
break # success - this is normal exit
except:
prev_strip_url_date = None
continue # try to get prev_strip_url_date again
if strip_url_date:
page_url = 'http://www.gocomics.com' + strip_url_date
else:
continue
if prev_strip_url_date:
prev_page_url = 'http://www.gocomics.com' + prev_strip_url_date
else:
continue
current_articles.append({'title': title, 'url': page_url, 'description':'', 'date':''}) current_articles.append({'title': title, 'url': page_url, 'description':'', 'date':''})
url = prev_page_url
current_articles.reverse() current_articles.reverse()
return current_articles return current_articles
def preprocess_html(self, soup):
if soup.title:
title_string = soup.title.string.strip()
_cd = title_string.split(',',1)[1]
comic_date = ' '.join(_cd.split(' ', 4)[0:-1])
if soup.h1.span:
artist = soup.h1.span.string
soup.h1.span.string.replaceWith(comic_date + artist)
feature_item = soup.find('p',attrs={'class':'feature_item'})
if feature_item.a:
a_tag = feature_item.a
a_href = a_tag["href"]
img_tag = a_tag.img
img_tag["src"] = a_href
img_tag["width"] = self.comic_size
img_tag["height"] = None
return self.adeify_images(soup)
extra_css = ''' extra_css = '''
h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;} h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;} h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
img {max-width:100%; min-width:100%;}
p{font-family:Arial,Helvetica,sans-serif;font-size:small;} p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
body{font-family:Helvetica,Arial,sans-serif;font-size:small;} body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
''' '''
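
One side of the `make_links` hunk above builds a dated page URL for each of the last `num_comics_to_get` days: it formats `now - timedelta(days=d)` as `%Y/%m/%d` and appends the result to the strip's base URL. A minimal standalone sketch of that step follows. The helper name `dated_page_urls` and its `now` parameter are assumptions introduced here for illustration; the recipe does this inline and always starts from `datetime.now()`.

```python
from datetime import datetime, timedelta


def dated_page_urls(base_url, count, now=None):
    # Hypothetical helper, not part of the recipe: returns one URL per day,
    # newest first, in the form <base_url>/YYYY/MM/DD.
    if now is None:
        now = datetime.now()
    dates = [(now - timedelta(days=d)).strftime('%Y/%m/%d') for d in range(count)]
    return [base_url + '/' + day for day in dates]


# A fixed date keeps the result reproducible.
urls = dated_page_urls('http://www.gocomics.com/garfield', 3, now=datetime(2013, 5, 23))
for u in urls:
    print(u)
```

The recipe reverses the collected articles before returning them, so the oldest strip ends up first in the feed.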


@ -1,16 +1,61 @@
import re
from calibre.web.feeds.news import BasicNewsRecipe from calibre.web.feeds.news import BasicNewsRecipe
class Handelsblatt(BasicNewsRecipe): class Handelsblatt(BasicNewsRecipe):
title = u'Handelsblatt' title = u'Handelsblatt'
__author__ = 'malfi' __author__ = 'malfi' # modified by Hegi, last change 2013-05-20
oldest_article = 7 description = u'Handelsblatt - basierend auf den RSS-Feeds von Handelsblatt.de'
tags = 'Nachrichten, Blog, Wirtschaft'
publisher = 'Verlagsgruppe Handelsblatt GmbH'
category = 'business, economy, news, Germany'
publication_type = 'daily newspaper'
language = 'de_DE'
oldest_article = 7
max_articles_per_feed = 100 max_articles_per_feed = 100
no_stylesheets = True simultaneous_downloads= 20
# cover_url = 'http://www.handelsblatt.com/images/logo/logo_handelsblatt.com.png'
language = 'de'
remove_tags_before = dict(attrs={'class':'hcf-overline'}) auto_cleanup = False
remove_tags_after = dict(attrs={'class':'hcf-footer'}) no_stylesheets = True
remove_javascript = True
remove_empty_feeds = True
# don't duplicate articles from "Schlagzeilen" / "Exklusiv" in the other sections
ignore_duplicate_articles = {'title', 'url'}
# if you want to reduce size for a b/w or E-ink device, uncomment this:
# compress_news_images = True
# compress_news_images_auto_size = 16
# scale_news_images = (400,300)
timefmt = ' [%a, %d %b %Y]'
conversion_options = {'smarten_punctuation' : True,
'authors' : publisher,
'publisher' : publisher}
language = 'de_DE'
encoding = 'UTF-8'
cover_source = 'http://www.handelsblatt-shop.com/epaper/482/'
# masthead_url = 'http://www.handelsblatt.com/images/hb_logo/6543086/1-format3.jpg'
masthead_url = 'http://www.handelsblatt-chemie.de/wp-content/uploads/2012/01/hb-logo.gif'
def get_cover_url(self):
cover_source_soup = self.index_to_soup(self.cover_source)
preview_image_div = cover_source_soup.find(attrs={'class':'vorschau'})
return 'http://www.handelsblatt-shop.com'+preview_image_div.a.img['src']
# remove_tags_before = dict(attrs={'class':'hcf-overline'})
# remove_tags_after = dict(attrs={'class':'hcf-footer'})
# Alternatively use this:
keep_only_tags = [
dict(name='div', attrs={'class':['hcf-column hcf-column1 hcf-teasercontainer hcf-maincol']}),
dict(name='div', attrs={'id':['contentMain']})
]
remove_tags = [
dict(name='div', attrs={'class':['hcf-link-block hcf-faq-open', 'hcf-article-related']})
]
feeds = [ feeds = [
(u'Handelsblatt Exklusiv',u'http://www.handelsblatt.com/rss/exklusiv'), (u'Handelsblatt Exklusiv',u'http://www.handelsblatt.com/rss/exklusiv'),
@ -25,15 +70,19 @@ class Handelsblatt(BasicNewsRecipe):
(u'Handelsblatt Weblogs',u'http://www.handelsblatt.com/rss/blogs') (u'Handelsblatt Weblogs',u'http://www.handelsblatt.com/rss/blogs')
] ]
extra_css = ''' # Insert ". " after "Place" in <span class="hcf-location-mark">Place</span>
h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;} # If you use .epub format you could also do this as extra_css '.hcf-location-mark:after {content: ". "}'
h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;} preprocess_regexps = [(re.compile(r'(<span class="hcf-location-mark">[^<]*)(</span>)',
p{font-family:Arial,Helvetica,sans-serif;font-size:small;} re.DOTALL|re.IGNORECASE), lambda match: match.group(1) + '. ' + match.group(2))]
body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
''' extra_css = 'h1 {font-size: 1.6em; text-align: left} \
h2 {font-size: 1em; font-style: italic; font-weight: normal} \
h3 {font-size: 1.3em;text-align: left} \
h4, h5, h6, a {font-size: 1em;text-align: left} \
.hcf-caption {font-size: 1em;text-align: left; font-style: italic} \
.hcf-location-mark {font-style: italic}'
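
The `preprocess_regexps` entry in this hunk appends `. ` to the text inside `<span class="hcf-location-mark">`, just before the closing tag. The sketch below applies the same pattern and replacement with plain `re.sub`, outside calibre. The function name `add_location_separator` and the sample HTML are made up for illustration.

```python
import re

# Same pattern and replacement as the recipe's preprocess_regexps entry.
LOCATION_MARK = re.compile(
    r'(<span class="hcf-location-mark">[^<]*)(</span>)',
    re.DOTALL | re.IGNORECASE)


def add_location_separator(html):
    # Insert ". " between the place name and the closing </span>.
    return LOCATION_MARK.sub(lambda m: m.group(1) + '. ' + m.group(2), html)


# Made-up sample input, not taken from handelsblatt.com.
sample = '<p><span class="hcf-location-mark">Berlin</span>Some text</p>'
result = add_location_separator(sample)
print(result)
```

Text without a location mark passes through unchanged, since the pattern only matches that one span.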
def print_version(self, url): def print_version(self, url):
url = url.split('/') main, sep, id = url.rpartition('/')
url[-1] = 'v_detail_tab_print,'+url[-1] return main + '/v_detail_tab_print/' + id
url = '/'.join(url)
return url
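
The two `print_version` variants in this hunk build different print URLs. One prefixes the last path segment with `v_detail_tab_print,`, the other inserts `v_detail_tab_print` as a path segment of its own via `rpartition`. A small sketch that runs both side by side is given below. The function names and the example article URL are hypothetical, and `id` is renamed to `article_id` to avoid shadowing the built-in.

```python
def print_version_split(url):
    # Variant that rewrites the last path segment with a comma separator.
    parts = url.split('/')
    parts[-1] = 'v_detail_tab_print,' + parts[-1]
    return '/'.join(parts)


def print_version_rpartition(url):
    # Variant that inserts a separate path segment before the last one.
    main, sep, article_id = url.rpartition('/')
    return main + '/v_detail_tab_print/' + article_id


# Hypothetical article URL, for illustration only.
example = 'http://www.handelsblatt.com/politik/1234.html'
print(print_version_split(example))
print(print_version_rpartition(example))
```

Which form the site accepts is a property of handelsblatt.com at the time of the commit and is not something this sketch can confirm.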


@ -9,21 +9,24 @@ class AdvancedUserRecipe1274742400(BasicNewsRecipe):
oldest_article = 7 oldest_article = 7
max_articles_per_feed = 100 max_articles_per_feed = 100
keep_only_tags = [dict(id='content-main')] #keep_only_tags = [dict(id='content-main')]
remove_tags = [dict(id=['right-col-content', 'trending-topics']), #remove_tags = [dict(id=['right-col-content', 'trending-topics']),
{'class':['ppy-outer']} #{'class':['ppy-outer']}
] #]
no_stylesheets = True no_stylesheets = True
use_embedded_content = False
auto_cleanup = True
feeds = [ feeds = [
(u'News', u'http://www.lvrj.com/news.rss'), (u'News', u'http://www.lvrj.com/news.rss'),
(u'Business', u'http://www.lvrj.com/business.rss'), (u'Business', u'http://www.lvrj.com/business.rss'),
(u'Living', u'http://www.lvrj.com/living.rss'), (u'Living', u'http://www.lvrj.com/living.rss'),
(u'Opinion', u'http://www.lvrj.com/opinion.rss'), (u'Opinion', u'http://www.lvrj.com/opinion.rss'),
(u'Neon', u'http://www.lvrj.com/neon.rss'), (u'Neon', u'http://www.lvrj.com/neon.rss'),
(u'Image', u'http://www.lvrj.com/image.rss'), #(u'Image', u'http://www.lvrj.com/image.rss'),
(u'Home & Garden', u'http://www.lvrj.com/home_and_garden.rss'), #(u'Home & Garden', u'http://www.lvrj.com/home_and_garden.rss'),
(u'Furniture & Design', u'http://www.lvrj.com/furniture_and_design.rss'), #(u'Furniture & Design', u'http://www.lvrj.com/furniture_and_design.rss'),
(u'Drive', u'http://www.lvrj.com/drive.rss'), #(u'Drive', u'http://www.lvrj.com/drive.rss'),
(u'Real Estate', u'http://www.lvrj.com/real_estate.rss'), #(u'Real Estate', u'http://www.lvrj.com/real_estate.rss'),
(u'Sports', u'http://www.lvrj.com/sports.rss')] (u'Sports', u'http://www.lvrj.com/sports.rss')]


@ -4,7 +4,7 @@ class AdvancedUserRecipe1306061239(BasicNewsRecipe):
title = u'New Musical Express Magazine' title = u'New Musical Express Magazine'
description = 'Author D.Asbury. UK Rock & Pop Mag. ' description = 'Author D.Asbury. UK Rock & Pop Mag. '
__author__ = 'Dave Asbury' __author__ = 'Dave Asbury'
# last updated 7/10/12 # last updated 17/5/13 News feed url altered
remove_empty_feeds = True remove_empty_feeds = True
remove_javascript = True remove_javascript = True
no_stylesheets = True no_stylesheets = True
@ -13,62 +13,57 @@ class AdvancedUserRecipe1306061239(BasicNewsRecipe):
#auto_cleanup = True #auto_cleanup = True
language = 'en_GB' language = 'en_GB'
compress_news_images = True compress_news_images = True
def get_cover_url(self): def get_cover_url(self):
soup = self.index_to_soup('http://www.nme.com/component/subscribe') soup = self.index_to_soup('http://www.nme.com/component/subscribe')
cov = soup.find(attrs={'id' : 'magazine_cover'}) cov = soup.find(attrs={'id' : 'magazine_cover'})
cov2 = str(cov['src']) cov2 = str(cov['src'])
# print '**** Cov url =*', cover_url,'***' # print '**** Cov url =*', cover_url,'***'
#print '**** Cov url =*','http://www.magazinesdirect.com/article_images/articledir_3138/1569221/1_largelisting.jpg','***' #print '**** Cov url =*','http://www.magazinesdirect.com/article_images/articledir_3138/1569221/1_largelisting.jpg','***'
br = browser() br = browser()
br.set_handle_redirect(False) br.set_handle_redirect(False)
try: try:
br.open_novisit(cov2) br.open_novisit(cov2)
cover_url = str(cov2) cover_url = str(cov2)
except: except:
cover_url = 'http://tawanda3000.files.wordpress.com/2011/02/nme-logo.jpg' cover_url = 'http://tawanda3000.files.wordpress.com/2011/02/nme-logo.jpg'
return cover_url return cover_url
masthead_url = 'http://tawanda3000.files.wordpress.com/2011/02/nme-logo.jpg' masthead_url = 'http://tawanda3000.files.wordpress.com/2011/02/nme-logo.jpg'
remove_tags = [ remove_tags = [
dict( attrs={'class':'clear_icons'}), dict(attrs={'class':'clear_icons'}),
dict( attrs={'class':'share_links'}), dict(attrs={'class':'share_links'}),
dict( attrs={'id':'right_panel'}), dict(attrs={'id':'right_panel'}),
dict( attrs={'class':'today box'}), dict(attrs={'class':'today box'}),
] ]
keep_only_tags = [ keep_only_tags = [
dict(name='h1'), dict(name='h1'),
#dict(name='h3'), #dict(name='h3'),
dict(attrs={'class' : 'BText'}), dict(attrs={'class' : 'BText'}),
dict(attrs={'class' : 'Bmore'}), dict(attrs={'class' : 'Bmore'}),
dict(attrs={'class' : 'bPosts'}), dict(attrs={'class' : 'bPosts'}),
dict(attrs={'class' : 'text'}), dict(attrs={'class' : 'text'}),
dict(attrs={'id' : 'article_gallery'}), dict(attrs={'id' : 'article_gallery'}),
#dict(attrs={'class' : 'image'}), #dict(attrs={'class' : 'image'}),
dict(attrs={'class' : 'article_text'}) dict(attrs={'class' : 'article_text'})
]
feeds = [
(u'NME News', u'http://feeds.feedburner.com/nmecom/rss/newsxml?format=xml'),
#(u'Reviews', u'http://feeds2.feedburner.com/nme/SdML'),
(u'Reviews',u'http://feed43.com/1817687144061333.xml'),
(u'Bloggs',u'http://feed43.com/3326754333186048.xml'),
] ]
feeds = [
(u'NME News', u'http://www.nme.com/news?alt=rss' ), #http://feeds.feedburner.com/nmecom/rss/newsxml?format=xml'),
#(u'Reviews', u'http://feeds2.feedburner.com/nme/SdML'),
(u'Reviews',u'http://feed43.com/1817687144061333.xml'),
(u'Bloggs',u'http://feed43.com/3326754333186048.xml'),
]
extra_css = ''' extra_css = '''
h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;} h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;} h2{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
p{font-family:Arial,Helvetica,sans-serif;font-size:small;} p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
body{font-family:Helvetica,Arial,sans-serif;font-size:small;} body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
''' '''

recipes/nrc_next.recipe Normal file
@ -0,0 +1,75 @@
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
# Based on veezh's original recipe, Kovid Goyal's New York Times recipe and Snaabs nrc Handelsblad recipe
__license__ = 'GPL v3'
__copyright__ = '2013, Niels Giesen'
'''
www.nrc.nl
'''
import os, zipfile
import time
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile

class NRCNext(BasicNewsRecipe):
    title = u'nrc•next'
    description = u'De ePaper-versie van nrc•next'  # "The ePaper version of nrc•next"
    language = 'nl'
    lang = 'nl-NL'
    needs_subscription = True
    __author__ = 'Niels Giesen'
    conversion_options = {
        'no_default_epub_cover' : True
    }

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('http://login.nrc.nl/login')
            br.select_form(nr=0)
            br['username'] = self.username
            br['password'] = self.password
            br.submit()
        return br

    def build_index(self):
        # The ePaper is published as one EPUB per day, named after the date.
        today = time.strftime("%Y%m%d")
        domain = "http://digitaleeditie.nrc.nl"
        url = domain + "/digitaleeditie/helekrant/epub/nn_" + today + ".epub"
        #print url
        try:
            br = self.get_browser()
            f = br.open(url)
        except Exception:
            # "Cannot log in to download the edition"
            self.report_progress(0,_('Kan niet inloggen om editie te downloaden'))
            # "Today's paper is not available yet"
            raise ValueError('Krant van vandaag nog niet beschikbaar')
        tmp = PersistentTemporaryFile(suffix='.epub')
        self.report_progress(0,_('downloading epub'))
        tmp.write(f.read())
        # Close the temp file before reopening it by name, so all data is on disk.
        tmp.close()
        f.close()
        br.close()
        if zipfile.is_zipfile(tmp.name):
            try:
                zfile = zipfile.ZipFile(tmp.name, 'r')
                zfile.extractall(self.output_dir)
                self.report_progress(0,_('extracting epub'))
            except zipfile.BadZipfile:
                self.report_progress(0,_('BadZip error, continuing'))
        index = os.path.join(self.output_dir, 'metadata.opf')
        self.report_progress(1,_('epub downloaded and extracted'))
        return index
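
The download-and-extract step of build_index can be tried outside calibre with only the standard library. This is a minimal sketch, assuming Python 3; extract_epub is a hypothetical helper name, not part of calibre or of this recipe. It closes the temporary file before reopening it by name, so the archive is fully written to disk before zipfile reads it.

```python
import os
import tempfile
import zipfile


def extract_epub(data, output_dir):
    """Write raw EPUB bytes to a temp file and unpack them into output_dir.

    Hypothetical helper for illustration. Returns the path where
    metadata.opf is expected, or None if the data is not a zip archive.
    """
    tmp = tempfile.NamedTemporaryFile(suffix='.epub', delete=False)
    try:
        tmp.write(data)
    finally:
        # Closing flushes the buffer, so reopening by name sees every byte.
        tmp.close()
    try:
        if not zipfile.is_zipfile(tmp.name):
            return None
        with zipfile.ZipFile(tmp.name, 'r') as zf:
            zf.extractall(output_dir)
    finally:
        os.remove(tmp.name)
    return os.path.join(output_dir, 'metadata.opf')
```

Returning None for non-zip data is a choice made for this sketch; the recipe itself carries on and returns the index path regardless.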

@ -1,11 +1,9 @@
__license__ = 'GPL v3' __license__ = 'GPL v3'
__copyright__ = '2012, Darko Miletic <darko.miletic at gmail.com>' __copyright__ = '2012-2013, Darko Miletic <darko.miletic at gmail.com>'
''' '''
www.nsfwcorp.com www.nsfwcorp.com
''' '''
import urllib
from calibre.web.feeds.news import BasicNewsRecipe from calibre.web.feeds.news import BasicNewsRecipe
class NotSafeForWork(BasicNewsRecipe): class NotSafeForWork(BasicNewsRecipe):
@ -20,8 +18,8 @@ class NotSafeForWork(BasicNewsRecipe):
needs_subscription = True needs_subscription = True
auto_cleanup = False auto_cleanup = False
INDEX = 'https://www.nsfwcorp.com' INDEX = 'https://www.nsfwcorp.com'
LOGIN = INDEX + '/login/target/' LOGIN = INDEX + '/account/login/?next=%2F'
SETTINGS = INDEX + '/settings/' SETTINGS = INDEX + '/account/settings/'
use_embedded_content = True use_embedded_content = True
language = 'en' language = 'en'
publication_type = 'magazine' publication_type = 'magazine'
@ -48,19 +46,20 @@ class NotSafeForWork(BasicNewsRecipe):
def get_browser(self): def get_browser(self):
br = BasicNewsRecipe.get_browser(self) br = BasicNewsRecipe.get_browser(self)
br.open(self.LOGIN) br.open(self.INDEX)
if self.username is not None and self.password is not None: if self.username is not None and self.password is not None:
data = urllib.urlencode({ 'email':self.username br.open(self.LOGIN)
,'password':self.password br.select_form(nr=0)
}) br['email' ] = self.username
br.open(self.LOGIN, data) br['password'] = self.password
br.submit()
return br return br
def get_feeds(self): def get_feeds(self):
self.feeds = [] self.feeds = []
soup = self.index_to_soup(self.SETTINGS) soup = self.index_to_soup(self.SETTINGS)
for item in soup.findAll('input', attrs={'type':'text'}): for item in soup.findAll('input', attrs={'type':'text'}):
if item.has_key('value') and item['value'].startswith('http://www.nsfwcorp.com/feed/'): if item.has_key('value') and item['value'].startswith('https://www.nsfwcorp.com/feed/'):
self.feeds.append(item['value']) self.feeds.append(item['value'])
return self.feeds return self.feeds
return self.feeds return self.feeds

@ -26,14 +26,14 @@ class DailyTelegraph(BasicNewsRecipe):
keep_only_tags = [dict(name='div', attrs={'id': 'story'})] keep_only_tags = [dict(name='div', attrs={'id': 'story'})]
#remove_tags = [dict(name=['object','link'])] # remove_tags = [dict(name=['object','link'])]
remove_tags = [dict(name ='div', attrs = {'class': 'story-info'}), remove_tags = [dict(name='div', attrs={'class': 'story-info'}),
dict(name ='div', attrs = {'class': 'story-header-tools'}), dict(name='div', attrs={'class': 'story-header-tools'}),
dict(name ='div', attrs = {'class': 'story-sidebar'}), dict(name='div', attrs={'class': 'story-sidebar'}),
dict(name ='div', attrs = {'class': 'story-footer'}), dict(name='div', attrs={'class': 'story-footer'}),
dict(name ='div', attrs = {'id': 'comments'}), dict(name='div', attrs={'id': 'comments'}),
dict(name ='div', attrs = {'class': 'story-extras story-extras-2'}), dict(name='div', attrs={'class': 'story-extras story-extras-2'}),
dict(name ='div', attrs = {'class': 'group item-count-1 story-related'}) dict(name='div', attrs={'class': 'group item-count-1 story-related'})
] ]
extra_css = ''' extra_css = '''
@ -45,30 +45,31 @@ class DailyTelegraph(BasicNewsRecipe):
.caption{font-family:Trebuchet MS,Trebuchet,Helvetica,sans-serif; font-size: xx-small;} .caption{font-family:Trebuchet MS,Trebuchet,Helvetica,sans-serif; font-size: xx-small;}
''' '''
feeds = [ (u'News', u'http://feeds.news.com.au/public/rss/2.0/aus_news_807.xml'), feeds = [
(u'Opinion', u'http://feeds.news.com.au/public/rss/2.0/aus_opinion_58.xml'), (u'News', u'http://feeds.news.com.au/public/rss/2.0/aus_news_807.xml'),
(u'The Nation', u'http://feeds.news.com.au/public/rss/2.0/aus_the_nation_62.xml'), (u'Opinion', u'http://feeds.news.com.au/public/rss/2.0/aus_opinion_58.xml'),
(u'World News', u'http://feeds.news.com.au/public/rss/2.0/aus_world_808.xml'), (u'The Nation', u'http://feeds.news.com.au/public/rss/2.0/aus_the_nation_62.xml'),
(u'US Election', u'http://feeds.news.com.au/public/rss/2.0/aus_uselection_687.xml'), (u'World News', u'http://feeds.news.com.au/public/rss/2.0/aus_world_808.xml'),
(u'Climate', u'http://feeds.news.com.au/public/rss/2.0/aus_climate_809.xml'), (u'US Election', u'http://feeds.news.com.au/public/rss/2.0/aus_uselection_687.xml'),
(u'Media', u'http://feeds.news.com.au/public/rss/2.0/aus_media_57.xml'), (u'Climate', u'http://feeds.news.com.au/public/rss/2.0/aus_climate_809.xml'),
(u'IT', u'http://feeds.news.com.au/public/rss/2.0/ausit_itnews_topstories_367.xml'), (u'Media', u'http://feeds.news.com.au/public/rss/2.0/aus_media_57.xml'),
(u'Exec Tech', u'http://feeds.news.com.au/public/rss/2.0/ausit_exec_topstories_385.xml'), (u'IT', u'http://feeds.news.com.au/public/rss/2.0/ausit_itnews_topstories_367.xml'),
(u'Higher Education', u'http://feeds.news.com.au/public/rss/2.0/aus_higher_education_56.xml'), (u'Exec Tech', u'http://feeds.news.com.au/public/rss/2.0/ausit_exec_topstories_385.xml'),
(u'Arts', u'http://feeds.news.com.au/public/rss/2.0/aus_arts_51.xml'), (u'Higher Education', u'http://feeds.news.com.au/public/rss/2.0/aus_higher_education_56.xml'),
(u'Travel', u'http://feeds.news.com.au/public/rss/2.0/aus_travel_and_indulgence_63.xml'), (u'Arts', u'http://feeds.news.com.au/public/rss/2.0/aus_arts_51.xml'),
(u'Property', u'http://feeds.news.com.au/public/rss/2.0/aus_property_59.xml'), (u'Travel', u'http://feeds.news.com.au/public/rss/2.0/aus_travel_and_indulgence_63.xml'),
(u'Sport', u'http://feeds.news.com.au/public/rss/2.0/aus_sport_61.xml'), (u'Property', u'http://feeds.news.com.au/public/rss/2.0/aus_property_59.xml'),
(u'Business', u'http://feeds.news.com.au/public/rss/2.0/aus_business_811.xml'), (u'Sport', u'http://feeds.news.com.au/public/rss/2.0/aus_sport_61.xml'),
(u'Aviation', u'http://feeds.news.com.au/public/rss/2.0/aus_business_aviation_706.xml'), (u'Business', u'http://feeds.news.com.au/public/rss/2.0/aus_business_811.xml'),
(u'Commercial Property', u'http://feeds.news.com.au/public/rss/2.0/aus_business_commercial_property_708.xml'), (u'Aviation', u'http://feeds.news.com.au/public/rss/2.0/aus_business_aviation_706.xml'),
(u'Mining', u'http://feeds.news.com.au/public/rss/2.0/aus_business_mining_704.xml')] (u'Commercial Property', u'http://feeds.news.com.au/public/rss/2.0/aus_business_commercial_property_708.xml'),
(u'Mining', u'http://feeds.news.com.au/public/rss/2.0/aus_business_mining_704.xml')]
def get_browser(self): def get_browser(self):
br = BasicNewsRecipe.get_browser(self) br = BasicNewsRecipe.get_browser(self)
if self.username and self.password: if self.username and self.password:
br.open('http://www.theaustralian.com.au') br.open('http://www.theaustralian.com.au')
br.select_form(nr=0) br.select_form(nr=1)
br['username'] = self.username br['username'] = self.username
br['password'] = self.password br['password'] = self.password
raw = br.submit().read() raw = br.submit().read()
@ -80,10 +81,11 @@ class DailyTelegraph(BasicNewsRecipe):
def get_article_url(self, article): def get_article_url(self, article):
return article.id return article.id
#br = self.get_browser() # br = self.get_browser()
#br.open(article.link).read() # br.open(article.link).read()
#print br.geturl() # print br.geturl()
# return br.geturl()
#return br.geturl()

@ -3,7 +3,7 @@ __license__ = 'GPL v3'
__copyright__ = '4 February 2011, desUBIKado' __copyright__ = '4 February 2011, desUBIKado'
__author__ = 'desUBIKado' __author__ = 'desUBIKado'
__version__ = 'v0.09' __version__ = 'v0.09'
__date__ = '02, December 2012' __date__ = '14, May 2013'
''' '''
http://www.weblogssl.com/ http://www.weblogssl.com/
''' '''
@ -56,15 +56,16 @@ class weblogssl(BasicNewsRecipe):
,(u'Zona FandoM', u'http://feeds.weblogssl.com/zonafandom') ,(u'Zona FandoM', u'http://feeds.weblogssl.com/zonafandom')
,(u'Fandemia', u'http://feeds.weblogssl.com/fandemia') ,(u'Fandemia', u'http://feeds.weblogssl.com/fandemia')
,(u'Tendencias', u'http://feeds.weblogssl.com/trendencias') ,(u'Tendencias', u'http://feeds.weblogssl.com/trendencias')
,(u'Beb\xe9s y m\xe1s', u'http://feeds.weblogssl.com/bebesymas') ,(u'Tendencias Belleza', u'http://feeds.weblogssl.com/trendenciasbelleza')
,(u'Tendencias Hombre', u'http://feeds.weblogssl.com/trendenciashombre')
,(u'Tendencias Shopping', u'http://feeds.weblogssl.com/trendenciasshopping')
,(u'Directo al paladar', u'http://feeds.weblogssl.com/directoalpaladar') ,(u'Directo al paladar', u'http://feeds.weblogssl.com/directoalpaladar')
,(u'Compradicci\xf3n', u'http://feeds.weblogssl.com/compradiccion') ,(u'Compradicci\xf3n', u'http://feeds.weblogssl.com/compradiccion')
,(u'Decoesfera', u'http://feeds.weblogssl.com/decoesfera') ,(u'Decoesfera', u'http://feeds.weblogssl.com/decoesfera')
,(u'Embelezzia', u'http://feeds.weblogssl.com/embelezzia') ,(u'Embelezzia', u'http://feeds.weblogssl.com/embelezzia')
,(u'Vit\xf3nica', u'http://feeds.weblogssl.com/vitonica') ,(u'Vit\xf3nica', u'http://feeds.weblogssl.com/vitonica')
,(u'Ambiente G', u'http://feeds.weblogssl.com/ambienteg') ,(u'Ambiente G', u'http://feeds.weblogssl.com/ambienteg')
,(u'Tendencias Belleza', u'http://feeds.weblogssl.com/trendenciasbelleza') ,(u'Beb\xe9s y m\xe1s', u'http://feeds.weblogssl.com/bebesymas')
,(u'Tendencias Hombre', u'http://feeds.weblogssl.com/trendenciashombre')
,(u'Peques y m\xe1s', u'http://feeds.weblogssl.com/pequesymas') ,(u'Peques y m\xe1s', u'http://feeds.weblogssl.com/pequesymas')
,(u'Motorpasi\xf3n', u'http://feeds.weblogssl.com/motorpasion') ,(u'Motorpasi\xf3n', u'http://feeds.weblogssl.com/motorpasion')
,(u'Motorpasi\xf3n F1', u'http://feeds.weblogssl.com/motorpasionf1') ,(u'Motorpasi\xf3n F1', u'http://feeds.weblogssl.com/motorpasionf1')
@ -90,7 +91,7 @@ class weblogssl(BasicNewsRecipe):
dict(name='section' , attrs={'class':'comments'}), #m.xataka.com dict(name='section' , attrs={'class':'comments'}), #m.xataka.com
dict(name='div' , attrs={'class':'article-comments'}), #m.xataka.com dict(name='div' , attrs={'class':'article-comments'}), #m.xataka.com
dict(name='nav' , attrs={'class':'article-taxonomy'}) #m.xataka.com dict(name='nav' , attrs={'class':'article-taxonomy'}) #m.xataka.com
] ]
remove_tags_after = dict(name='section' , attrs={'class':'comments'}) remove_tags_after = dict(name='section' , attrs={'class':'comments'})
@ -119,23 +120,6 @@ class weblogssl(BasicNewsRecipe):
return soup return soup
# Para obtener la url original del articulo a partir de la de "feedsportal"
# El siguiente código es gracias al usuario "bosplans" de www.mobileread.com
# http://www.mobileread.com/forums/showthread.php?t=130297
def get_article_url(self, article): def get_article_url(self, article):
link = article.get('link', None)
if link is None:
return article
# if link.split('/')[-4]=="xataka2":
# return article.get('feedburner_origlink', article.get('link', article.get('guid')))
if link.split('/')[-4]=="xataka2":
return article.get('guid', None) return article.get('guid', None)
if link.split('/')[-1]=="story01.htm":
link=link.split('/')[-2]
a=['0B','0C','0D','0E','0F','0G','0N' ,'0L0S','0A']
b=['.' ,'/' ,'?' ,'-' ,'=' ,'&' ,'.com','www.','0']
for i in range(0,len(a)):
link=link.replace(a[i],b[i])
link="http://"+link
return link

@ -0,0 +1,86 @@
__license__ = 'GPL v3'
__copyright__ = '2013, Armin Geller'
'''
Fetch WirtschaftsWoche Online
'''
import re
# import time
from calibre.web.feeds.news import BasicNewsRecipe

class WirtschaftsWocheOnline(BasicNewsRecipe):
    title = u'WirtschaftsWoche Online'
    __author__ = 'Hegi' # Update AGE 2013-01-05; Modified by Hegi 2013-04-28
    # "Wirtschaftswoche Online - based on the RSS feeds of Wiwo.de"
    description = u'Wirtschaftswoche Online - basierend auf den RSS-Feeds von Wiwo.de'
    tags = 'Nachrichten, Blog, Wirtschaft'
    publisher = 'Verlagsgruppe Handelsblatt GmbH / Redaktion WirtschaftsWoche Online'
    category = 'business, economy, news, Germany'
    publication_type = 'weekly magazine'
    language = 'de'
    oldest_article = 7
    max_articles_per_feed = 100
    simultaneous_downloads= 20
    auto_cleanup = False
    no_stylesheets = True
    remove_javascript = True
    remove_empty_feeds = True

    # don't duplicate articles from "Schlagzeilen" / "Exklusiv" to other rubrics
    ignore_duplicate_articles = {'title', 'url'}

    # if you want to reduce size for a b/w or E-ink device, uncomment this:
    # compress_news_images = True
    # compress_news_images_auto_size = 16
    # scale_news_images = (400,300)

    timefmt = ' [%a, %d %b %Y]'
    conversion_options = {'smarten_punctuation' : True,
                          'authors' : publisher,
                          'publisher' : publisher}
    language = 'de_DE'  # note: overrides the 'de' assigned above
    encoding = 'UTF-8'
    cover_source = 'http://www.wiwo-shop.de/wirtschaftswoche/wirtschaftswoche-emagazin-p1952.html'
    masthead_url = 'http://www.wiwo.de/images/wiwo_logo/5748610/1-formatOriginal.png'

    def get_cover_url(self):
        cover_source_soup = self.index_to_soup(self.cover_source)
        preview_image_div = cover_source_soup.find(attrs={'class':'container vorschau'})
        return 'http://www.wiwo-shop.de'+preview_image_div.a.img['src']

    # Insert ". " after "Place" in <span class="hcf-location-mark">Place</span>
    # If you use .epub format you could also do this as extra_css '.hcf-location-mark:after {content: ". "}'
    preprocess_regexps = [(re.compile(r'(<span class="hcf-location-mark">[^<]*)(</span>)',
        re.DOTALL|re.IGNORECASE), lambda match: match.group(1) + '. ' + match.group(2))]

    extra_css = 'h1 {font-size: 1.6em; text-align: left} \
                 h2 {font-size: 1em; font-style: italic; font-weight: normal} \
                 h3 {font-size: 1.3em;text-align: left} \
                 h4, h5, h6, a {font-size: 1em;text-align: left} \
                 .hcf-caption {font-size: 1em;text-align: left; font-style: italic} \
                 .hcf-location-mark {font-style: italic}'

    keep_only_tags = [
        dict(name='div', attrs={'class':['hcf-column hcf-column1 hcf-teasercontainer hcf-maincol']}),
        dict(name='div', attrs={'id':['contentMain']})
    ]

    remove_tags = [
        dict(name='div', attrs={'class':['hcf-link-block hcf-faq-open', 'hcf-article-related']})
    ]

    feeds = [
        (u'Schlagzeilen', u'http://www.wiwo.de/contentexport/feed/rss/schlagzeilen'),
        (u'Exklusiv', u'http://www.wiwo.de/contentexport/feed/rss/exklusiv'),
        # (u'Themen', u'http://www.wiwo.de/contentexport/feed/rss/themen'), # AGE no print version
        (u'Unternehmen', u'http://www.wiwo.de/contentexport/feed/rss/unternehmen'),
        (u'Finanzen', u'http://www.wiwo.de/contentexport/feed/rss/finanzen'),
        (u'Politik', u'http://www.wiwo.de/contentexport/feed/rss/politik'),
        (u'Erfolg', u'http://www.wiwo.de/contentexport/feed/rss/erfolg'),
        (u'Technologie', u'http://www.wiwo.de/contentexport/feed/rss/technologie'),
        # (u'Green-WiWo', u'http://green.wiwo.de/feed/rss/') # AGE no print version
    ]

    def print_version(self, url):
        # The print view lives one path level below the article: the last
        # URL segment (the article id) is moved under /v_detail_tab_print/.
        main, sep, article_id = url.rpartition('/')
        return main + '/v_detail_tab_print/' + article_id
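
The URL rewrite done by print_version can be checked in isolation. A minimal sketch as a plain function; the article URL below is made up for illustration and is not taken from the feeds.

```python
def print_version(url):
    # Split at the last '/' so the final path segment (the article id)
    # can be re-attached under the print-view path.
    main, _sep, article_id = url.rpartition('/')
    return main + '/v_detail_tab_print/' + article_id


# Hypothetical article URL, for illustration only.
example = 'http://www.wiwo.de/politik/12345.html'
print(print_version(example))
# prints http://www.wiwo.de/politik/v_detail_tab_print/12345.html
```

If a feed URL ended in a trailing slash, the last segment would be empty and the rewritten URL would end in '/v_detail_tab_print/', so this approach assumes feed links without one.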

@ -9,8 +9,9 @@ import copy
# http://online.wsj.com/page/us_in_todays_paper.html # http://online.wsj.com/page/us_in_todays_paper.html
def filter_classes(x): def filter_classes(x):
if not x: return False if not x:
bad_classes = {'sTools', 'printSummary', 'mostPopular', 'relatedCollection'} return False
bad_classes = {'articleInsetPoll', 'trendingNow', 'sTools', 'printSummary', 'mostPopular', 'relatedCollection'}
classes = frozenset(x.split()) classes = frozenset(x.split())
return len(bad_classes.intersection(classes)) > 0 return len(bad_classes.intersection(classes)) > 0
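
The new side of filter_classes in this hunk can be exercised on its own. A minimal sketch, with the function body taken from the diff and sample class strings made up for illustration. It shows that matching is done on whole class tokens, not on substrings.

```python
def filter_classes(x):
    # x is the value of an element's class attribute (or None).
    if not x:
        return False
    bad_classes = {'articleInsetPoll', 'trendingNow', 'sTools',
                   'printSummary', 'mostPopular', 'relatedCollection'}
    # Split on whitespace so each CSS class is compared as a whole token.
    classes = frozenset(x.split())
    return len(bad_classes.intersection(classes)) > 0


print(filter_classes('byline sTools'))   # True: 'sTools' is a listed class
print(filter_classes('sToolsExtra'))     # False: no exact token match
print(filter_classes(None))              # False: no class attribute
```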
@ -42,14 +43,15 @@ class WallStreetJournal(BasicNewsRecipe):
remove_tags_before = dict(name='h1') remove_tags_before = dict(name='h1')
remove_tags = [ remove_tags = [
dict(id=["articleTabs_tab_article", dict(id=["articleTabs_tab_article",
"articleTabs_tab_comments", "articleTabs_tab_comments", 'msnLinkback', 'yahooLinkback',
'articleTabs_panel_comments', 'footer', 'articleTabs_panel_comments', 'footer', 'emailThisScrim', 'emailConfScrim', 'emailErrorScrim',
"articleTabs_tab_interactive", "articleTabs_tab_video", "articleTabs_tab_interactive", "articleTabs_tab_video",
"articleTabs_tab_map", "articleTabs_tab_slideshow", "articleTabs_tab_map", "articleTabs_tab_slideshow",
"articleTabs_tab_quotes", "articleTabs_tab_document", "articleTabs_tab_quotes", "articleTabs_tab_document",
"printModeAd", "aFbLikeAuth", "videoModule", "printModeAd", "aFbLikeAuth", "videoModule",
"mostRecommendations", "topDiscussions"]), "mostRecommendations", "topDiscussions"]),
{'class':['footer_columns','network','insetCol3wide','interactive','video','slideshow','map','insettip','insetClose','more_in', "insetContent", 'articleTools_bottom', 'aTools', "tooltip", "adSummary", "nav-inline"]}, {'class':['footer_columns','hidden', 'network','insetCol3wide','interactive','video','slideshow','map','insettip',
'insetClose','more_in', "insetContent", 'articleTools_bottom', 'aTools', "tooltip", "adSummary", "nav-inline"]},
dict(rel='shortcut icon'), dict(rel='shortcut icon'),
{'class':filter_classes}, {'class':filter_classes},
] ]
@ -74,7 +76,10 @@ class WallStreetJournal(BasicNewsRecipe):
for tag in soup.findAll(name=['table', 'tr', 'td']): for tag in soup.findAll(name=['table', 'tr', 'td']):
tag.name = 'div' tag.name = 'div'
for tag in soup.findAll('div', dict(id=["articleThumbnail_1", "articleThumbnail_2", "articleThumbnail_3", "articleThumbnail_4", "articleThumbnail_5", "articleThumbnail_6", "articleThumbnail_7"])): for tag in soup.findAll('div', dict(id=[
"articleThumbnail_1", "articleThumbnail_2", "articleThumbnail_3",
"articleThumbnail_4", "articleThumbnail_5", "articleThumbnail_6",
"articleThumbnail_7"])):
tag.extract() tag.extract()
return soup return soup
@ -92,7 +97,7 @@ class WallStreetJournal(BasicNewsRecipe):
except: except:
articles = [] articles = []
if articles: if articles:
feeds.append((title, articles)) feeds.append((title, articles))
return feeds return feeds
def abs_wsj_url(self, href): def abs_wsj_url(self, href):
@ -119,16 +124,16 @@ class WallStreetJournal(BasicNewsRecipe):
for a in div.findAll('a', href=lambda x: x and '/itp/' in x): for a in div.findAll('a', href=lambda x: x and '/itp/' in x):
pageone = a['href'].endswith('pageone') pageone = a['href'].endswith('pageone')
if pageone: if pageone:
title = 'Front Section' title = 'Front Section'
url = self.abs_wsj_url(a['href']) url = self.abs_wsj_url(a['href'])
feeds = self.wsj_add_feed(feeds,title,url) feeds = self.wsj_add_feed(feeds,title,url)
title = "What's News" title = "What's News"
url = url.replace('pageone','whatsnews') url = url.replace('pageone','whatsnews')
feeds = self.wsj_add_feed(feeds,title,url) feeds = self.wsj_add_feed(feeds,title,url)
else: else:
title = self.tag_to_string(a) title = self.tag_to_string(a)
url = self.abs_wsj_url(a['href']) url = self.abs_wsj_url(a['href'])
feeds = self.wsj_add_feed(feeds,title,url) feeds = self.wsj_add_feed(feeds,title,url)
return feeds return feeds
def wsj_find_wn_articles(self, url): def wsj_find_wn_articles(self, url):
@ -137,22 +142,22 @@ class WallStreetJournal(BasicNewsRecipe):
whats_news = soup.find('div', attrs={'class':lambda x: x and 'whatsNews-simple' in x}) whats_news = soup.find('div', attrs={'class':lambda x: x and 'whatsNews-simple' in x})
if whats_news is not None: if whats_news is not None:
for a in whats_news.findAll('a', href=lambda x: x and '/article/' in x): for a in whats_news.findAll('a', href=lambda x: x and '/article/' in x):
container = a.findParent(['p']) container = a.findParent(['p'])
meta = a.find(attrs={'class':'meta_sectionName'}) meta = a.find(attrs={'class':'meta_sectionName'})
if meta is not None: if meta is not None:
meta.extract() meta.extract()
title = self.tag_to_string(a).strip() title = self.tag_to_string(a).strip()
url = a['href'] url = a['href']
desc = '' desc = ''
if container is not None: if container is not None:
desc = self.tag_to_string(container) desc = self.tag_to_string(container)
articles.append({'title':title, 'url':url, articles.append({'title':title, 'url':url,
'description':desc, 'date':''}) 'description':desc, 'date':''})
self.log('\tFound WN article:', title) self.log('\tFound WN article:', title)
self.log('\t\t', desc) self.log('\t\t', desc)
return articles return articles
@ -161,18 +166,18 @@ class WallStreetJournal(BasicNewsRecipe):
whats_news = soup.find('div', attrs={'class':lambda x: x and 'whatsNews-simple' in x}) whats_news = soup.find('div', attrs={'class':lambda x: x and 'whatsNews-simple' in x})
if whats_news is not None: if whats_news is not None:
whats_news.extract() whats_news.extract()
articles = [] articles = []
flavorarea = soup.find('div', attrs={'class':lambda x: x and 'ahed' in x}) flavorarea = soup.find('div', attrs={'class':lambda x: x and 'ahed' in x})
if flavorarea is not None: if flavorarea is not None:
flavorstory = flavorarea.find('a', href=lambda x: x and x.startswith('/article')) flavorstory = flavorarea.find('a', href=lambda x: x and x.startswith('/article'))
if flavorstory is not None: if flavorstory is not None:
flavorstory['class'] = 'mjLinkItem' flavorstory['class'] = 'mjLinkItem'
metapage = soup.find('span', attrs={'class':lambda x: x and 'meta_sectionName' in x}) metapage = soup.find('span', attrs={'class':lambda x: x and 'meta_sectionName' in x})
if metapage is not None: if metapage is not None:
flavorstory.append( copy.copy(metapage) ) #metapage should always be A1 because that should be first on the page flavorstory.append(copy.copy(metapage)) # metapage should always be A1 because that should be first on the page
for a in soup.findAll('a', attrs={'class':'mjLinkItem'}, href=True): for a in soup.findAll('a', attrs={'class':'mjLinkItem'}, href=True):
container = a.findParent(['li', 'div']) container = a.findParent(['li', 'div'])
@ -199,7 +204,6 @@ class WallStreetJournal(BasicNewsRecipe):
return articles return articles
def cleanup(self): def cleanup(self):
self.browser.open('http://online.wsj.com/logout?url=http://online.wsj.com') self.browser.open('http://online.wsj.com/logout?url=http://online.wsj.com')

@ -32,7 +32,7 @@ defaults.
# Set the use_series_auto_increment_tweak_when_importing tweak to True to # Set the use_series_auto_increment_tweak_when_importing tweak to True to
# use the above values when importing/adding books. If this tweak is set to # use the above values when importing/adding books. If this tweak is set to
# False (the default) then the series number will be set to 1 if it is not # False (the default) then the series number will be set to 1 if it is not
# explicitly set to during the import. If set to True, then the # explicitly set during the import. If set to True, then the
# series index will be set according to the series_index_auto_increment setting. # series index will be set according to the series_index_auto_increment setting.
# Note that the use_series_auto_increment_tweak_when_importing tweak is used # Note that the use_series_auto_increment_tweak_when_importing tweak is used
# only when a value is not provided during import. If the importing regular # only when a value is not provided during import. If the importing regular
@ -536,3 +536,4 @@ many_libraries = 10
# yellow when using a Virtual Library. By setting this to False, you can turn # yellow when using a Virtual Library. By setting this to False, you can turn
# that off. # that off.
highlight_virtual_library_book_count = True highlight_virtual_library_book_count = True

@ -38,7 +38,7 @@ binary_includes = [
'/lib/libz.so.1', '/lib/libz.so.1',
'/usr/lib/libtiff.so.5', '/usr/lib/libtiff.so.5',
'/lib/libbz2.so.1', '/lib/libbz2.so.1',
'/usr/lib/libpoppler.so.28', '/usr/lib/libpoppler.so.37',
'/usr/lib/libxml2.so.2', '/usr/lib/libxml2.so.2',
'/usr/lib/libopenjpeg.so.2', '/usr/lib/libopenjpeg.so.2',
'/usr/lib/libxslt.so.1', '/usr/lib/libxslt.so.1',

@ -378,7 +378,7 @@ class Py2App(object):
@flush @flush
def add_poppler(self): def add_poppler(self):
info('\nAdding poppler') info('\nAdding poppler')
for x in ('libpoppler.28.dylib',): for x in ('libpoppler.37.dylib',):
self.install_dylib(os.path.join(SW, 'lib', x)) self.install_dylib(os.path.join(SW, 'lib', x))
for x in ('pdftohtml', 'pdftoppm', 'pdfinfo'): for x in ('pdftohtml', 'pdftoppm', 'pdfinfo'):
self.install_dylib(os.path.join(SW, 'bin', x), False) self.install_dylib(os.path.join(SW, 'bin', x), False)

@ -116,7 +116,9 @@ tarball. Edit setup.py and set zip_safe=False. Then run::
Run the following command to install python dependencies:: Run the following command to install python dependencies::
easy_install --always-unzip -U mechanize pyreadline python-dateutil dnspython cssutils clientform pycrypto cssselect easy_install --always-unzip -U mechanize python-dateutil dnspython cssutils clientform pycrypto cssselect
Install pyreadline from https://pypi.python.org/pypi/pyreadline/2.0
Install pywin32 and edit win32com\__init__.py setting _frozen = True and Install pywin32 and edit win32com\__init__.py setting _frozen = True and
__gen_path__ to a temp dir (otherwise it tries to set it to a dir in the __gen_path__ to a temp dir (otherwise it tries to set it to a dir in the

@ -12,14 +12,14 @@ msgstr ""
"Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-" "Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-"
"devel@lists.alioth.debian.org>\n" "devel@lists.alioth.debian.org>\n"
"POT-Creation-Date: 2011-11-25 14:01+0000\n" "POT-Creation-Date: 2011-11-25 14:01+0000\n"
"PO-Revision-Date: 2013-04-21 08:00+0000\n" "PO-Revision-Date: 2013-05-06 09:36+0000\n"
"Last-Translator: Ferran Rius <frius64@hotmail.com>\n" "Last-Translator: Ferran Rius <frius64@hotmail.com>\n"
"Language-Team: Catalan <linux@softcatala.org>\n" "Language-Team: Catalan <linux@softcatala.org>\n"
"MIME-Version: 1.0\n" "MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n" "Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n" "Content-Transfer-Encoding: 8bit\n"
"X-Launchpad-Export-Date: 2013-04-22 05:23+0000\n" "X-Launchpad-Export-Date: 2013-05-07 05:28+0000\n"
"X-Generator: Launchpad (build 16567)\n" "X-Generator: Launchpad (build 16598)\n"
"Language: ca\n" "Language: ca\n"
#. name for aaa #. name for aaa
@ -2024,7 +2024,7 @@ msgstr "Àzeri meridional"
#. name for aze #. name for aze
msgid "Azerbaijani" msgid "Azerbaijani"
msgstr "Serbi" msgstr ""
#. name for azg #. name for azg
msgid "Amuzgo; San Pedro Amuzgos" msgid "Amuzgo; San Pedro Amuzgos"
@ -7288,7 +7288,7 @@ msgstr "Epie"
#. name for epo #. name for epo
msgid "Esperanto" msgid "Esperanto"
msgstr "Alemany" msgstr "Esperanto"
#. name for era #. name for era
msgid "Eravallan" msgid "Eravallan"
@ -21816,7 +21816,7 @@ msgstr "Ramoaaina"
#. name for raj #. name for raj
msgid "Rajasthani" msgid "Rajasthani"
msgstr "Marwari" msgstr ""
#. name for rak #. name for rak
msgid "Tulu-Bohuai" msgid "Tulu-Bohuai"

@ -13762,7 +13762,7 @@ msgstr ""
#. name for lav #. name for lav
msgid "Latvian" msgid "Latvian"
msgstr "litevština" msgstr ""
#. name for law #. name for law
msgid "Lauje" msgid "Lauje"

@ -1429,7 +1429,7 @@ msgstr ""
#. name for arg #. name for arg
msgid "Aragonese" msgid "Aragonese"
msgstr "Færøsk" msgstr ""
#. name for arh #. name for arh
msgid "Arhuaco" msgid "Arhuaco"

@ -18,14 +18,14 @@ msgstr ""
"Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-" "Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-"
"devel@lists.alioth.debian.org>\n" "devel@lists.alioth.debian.org>\n"
"POT-Creation-Date: 2011-11-25 14:01+0000\n" "POT-Creation-Date: 2011-11-25 14:01+0000\n"
"PO-Revision-Date: 2013-04-11 13:29+0000\n" "PO-Revision-Date: 2013-05-06 09:41+0000\n"
"Last-Translator: Simon Schütte <simonschuette@arcor.de>\n" "Last-Translator: Simon Schütte <simonschuette@arcor.de>\n"
"Language-Team: Ubuntu German Translators\n" "Language-Team: Ubuntu German Translators\n"
"MIME-Version: 1.0\n" "MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n" "Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n" "Content-Transfer-Encoding: 8bit\n"
"X-Launchpad-Export-Date: 2013-04-12 05:20+0000\n" "X-Launchpad-Export-Date: 2013-05-07 05:29+0000\n"
"X-Generator: Launchpad (build 16564)\n" "X-Generator: Launchpad (build 16598)\n"
"Language: de\n" "Language: de\n"
#. name for aaa #. name for aaa
@ -319,7 +319,7 @@ msgstr "Adangme"
#. name for adb #. name for adb
msgid "Adabe" msgid "Adabe"
msgstr "Adangme" msgstr "Adabe"
#. name for add #. name for add
msgid "Dzodinka" msgid "Dzodinka"
@ -367,7 +367,7 @@ msgstr "Adap"
#. name for adq #. name for adq
msgid "Adangbe" msgid "Adangbe"
msgstr "Adangme" msgstr "Adangbe"
#. name for adr #. name for adr
msgid "Adonara" msgid "Adonara"

@ -2022,7 +2022,7 @@ msgstr ""
#. name for aze #. name for aze
msgid "Azerbaijani" msgid "Azerbaijani"
msgstr "Turkiera" msgstr ""
#. name for azg #. name for azg
msgid "Amuzgo; San Pedro Amuzgos" msgid "Amuzgo; San Pedro Amuzgos"
@ -13126,7 +13126,7 @@ msgstr ""
#. name for kur #. name for kur
msgid "Kurdish" msgid "Kurdish"
msgstr "Turkiera" msgstr ""
#. name for kus #. name for kus
msgid "Kusaal" msgid "Kusaal"
@ -16190,7 +16190,7 @@ msgstr ""
#. name for mlt #. name for mlt
msgid "Maltese" msgid "Maltese"
msgstr "Koreera" msgstr ""
#. name for mlu #. name for mlu
msgid "To'abaita" msgid "To'abaita"

View File

@ -13764,7 +13764,7 @@ msgstr "Laba"
#. name for lav #. name for lav
msgid "Latvian" msgid "Latvian"
msgstr "Lituano" msgstr ""
#. name for law #. name for law
msgid "Lauje" msgid "Lauje"
@ -22212,7 +22212,7 @@ msgstr "Roglai do norte"
#. name for roh #. name for roh
msgid "Romansh" msgid "Romansh"
msgstr "Romanés" msgstr ""
#. name for rol #. name for rol
msgid "Romblomanon" msgid "Romblomanon"

View File

@ -20538,7 +20538,7 @@ msgstr ""
#. name for peo #. name for peo
msgid "Persian; Old (ca. 600-400 B.C.)" msgid "Persian; Old (ca. 600-400 B.C.)"
msgstr "perzsa" msgstr ""
#. name for pep #. name for pep
msgid "Kunja" msgid "Kunja"

View File

@ -15049,7 +15049,7 @@ msgstr "Magahi"
#. name for mah #. name for mah
msgid "Marshallese" msgid "Marshallese"
msgstr "Maltneska" msgstr ""
#. name for mai #. name for mai
msgid "Maithili" msgid "Maithili"

View File

@ -3742,7 +3742,7 @@ msgstr ""
#. name for bre #. name for bre
msgid "Breton" msgid "Breton"
msgstr "프랑스어" msgstr ""
#. name for brf #. name for brf
msgid "Bera" msgid "Bera"

View File

@ -6804,7 +6804,7 @@ msgstr "डोगोन; तेबुल उरे"
#. name for dua #. name for dua
msgid "Duala" msgid "Duala"
msgstr "ड्युला" msgstr ""
#. name for dub #. name for dub
msgid "Dubli" msgid "Dubli"

View File

@ -27790,7 +27790,7 @@ msgstr ""
#. name for wln #. name for wln
msgid "Walloon" msgid "Walloon"
msgstr "Vietnamesisk" msgstr ""
#. name for wlo #. name for wlo
msgid "Wolio" msgid "Wolio"

View File

@ -9862,7 +9862,7 @@ msgstr "Hya"
#. name for hye #. name for hye
msgid "Armenian" msgid "Armenian"
msgstr "Albanés" msgstr ""
#. name for iai #. name for iai
msgid "Iaai" msgid "Iaai"
@ -13762,7 +13762,7 @@ msgstr "Laba"
#. name for lav #. name for lav
msgid "Latvian" msgid "Latvian"
msgstr "Lituanian" msgstr ""
#. name for law #. name for law
msgid "Lauje" msgid "Lauje"

View File

@ -13,14 +13,14 @@ msgstr ""
"Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-" "Report-Msgid-Bugs-To: Debian iso-codes team <pkg-isocodes-"
"devel@lists.alioth.debian.org>\n" "devel@lists.alioth.debian.org>\n"
"POT-Creation-Date: 2011-11-25 14:01+0000\n" "POT-Creation-Date: 2011-11-25 14:01+0000\n"
"PO-Revision-Date: 2013-03-23 10:17+0000\n" "PO-Revision-Date: 2013-05-21 06:13+0000\n"
"Last-Translator: Глория Хрусталёва <gloriya@hushmail.com>\n" "Last-Translator: Глория Хрусталёва <gloriya@hushmail.com>\n"
"Language-Team: Russian <debian-l10n-russian@lists.debian.org>\n" "Language-Team: Russian <debian-l10n-russian@lists.debian.org>\n"
"MIME-Version: 1.0\n" "MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n" "Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n" "Content-Transfer-Encoding: 8bit\n"
"X-Launchpad-Export-Date: 2013-03-24 04:45+0000\n" "X-Launchpad-Export-Date: 2013-05-22 04:38+0000\n"
"X-Generator: Launchpad (build 16540)\n" "X-Generator: Launchpad (build 16626)\n"
"Language: ru\n" "Language: ru\n"
#. name for aaa #. name for aaa
@ -2089,7 +2089,7 @@ msgstr "Башкирский"
#. name for bal #. name for bal
msgid "Baluchi" msgid "Baluchi"
msgstr "Балийский" msgstr ""
#. name for bam #. name for bam
msgid "Bambara" msgid "Bambara"
@ -5361,7 +5361,7 @@ msgstr ""
#. name for coa #. name for coa
msgid "Malay; Cocos Islands" msgid "Malay; Cocos Islands"
msgstr "" msgstr "Малайский; Кокосовые острова"
#. name for cob #. name for cob
msgid "Chicomuceltec" msgid "Chicomuceltec"

View File

@ -13763,7 +13763,7 @@ msgstr ""
#. name for lav #. name for lav
msgid "Latvian" msgid "Latvian"
msgstr "Lotyšský" msgstr ""
#. name for law #. name for law
msgid "Lauje" msgid "Lauje"

File diff suppressed because it is too large

View File

@ -1016,7 +1016,7 @@ msgstr ""
#. name for amh #. name for amh
msgid "Amharic" msgid "Amharic"
msgstr "阿拉伯语" msgstr ""
#. name for ami #. name for ami
msgid "Amis" msgid "Amis"

View File

@ -18,7 +18,7 @@ def qt_sources():
'src/gui/widgets/qdialogbuttonbox.cpp', 'src/gui/widgets/qdialogbuttonbox.cpp',
])) ]))
class POT(Command): # {{{ class POT(Command): # {{{
description = 'Update the .pot translation template' description = 'Update the .pot translation template'
PATH = os.path.join(Command.SRC, __appname__, 'translations') PATH = os.path.join(Command.SRC, __appname__, 'translations')
@ -63,7 +63,6 @@ class POT(Command): # {{{
return '\n'.join(ans) return '\n'.join(ans)
def run(self, opts): def run(self, opts):
pot_header = textwrap.dedent('''\ pot_header = textwrap.dedent('''\
# Translation template file.. # Translation template file..
@ -117,11 +116,10 @@ class POT(Command): # {{{
f.write(src) f.write(src)
self.info('Translations template:', os.path.abspath(pot)) self.info('Translations template:', os.path.abspath(pot))
return pot return pot
# }}} # }}}
class Translations(POT): # {{{ class Translations(POT): # {{{
description='''Compile the translations''' description='''Compile the translations'''
DEST = os.path.join(os.path.dirname(POT.SRC), 'resources', 'localization', DEST = os.path.join(os.path.dirname(POT.SRC), 'resources', 'localization',
'locales') 'locales')
@ -134,6 +132,7 @@ class Translations(POT): # {{{
return locale, os.path.join(self.DEST, locale, 'messages.mo') return locale, os.path.join(self.DEST, locale, 'messages.mo')
def run(self, opts): def run(self, opts):
self.iso639_errors = []
for f in self.po_files(): for f in self.po_files():
locale, dest = self.mo_file(f) locale, dest = self.mo_file(f)
base = os.path.dirname(dest) base = os.path.dirname(dest)
@ -146,18 +145,46 @@ class Translations(POT): # {{{
'%s.po'%iscpo) '%s.po'%iscpo)
if os.path.exists(iso639): if os.path.exists(iso639):
self.check_iso639(iso639)
dest = self.j(self.d(dest), 'iso639.mo') dest = self.j(self.d(dest), 'iso639.mo')
if self.newer(dest, iso639): if self.newer(dest, iso639):
self.info('\tCopying ISO 639 translations') self.info('\tCopying ISO 639 translations for %s' % iscpo)
subprocess.check_call(['msgfmt', '-o', dest, iso639]) subprocess.check_call(['msgfmt', '-o', dest, iso639])
elif locale not in ('en_GB', 'en_CA', 'en_AU', 'si', 'ur', 'sc', elif locale not in ('en_GB', 'en_CA', 'en_AU', 'si', 'ur', 'sc',
'ltg', 'nds', 'te', 'yi', 'fo', 'sq', 'ast', 'ml', 'ku', 'ltg', 'nds', 'te', 'yi', 'fo', 'sq', 'ast', 'ml', 'ku',
'fr_CA', 'him', 'jv', 'ka', 'fur', 'ber'): 'fr_CA', 'him', 'jv', 'ka', 'fur', 'ber'):
self.warn('No ISO 639 translations for locale:', locale) self.warn('No ISO 639 translations for locale:', locale)
if self.iso639_errors:
for err in self.iso639_errors:
print (err)
raise SystemExit(1)
self.write_stats() self.write_stats()
self.freeze_locales() self.freeze_locales()
def check_iso639(self, path):
from calibre.utils.localization import langnames_to_langcodes
with open(path, 'rb') as f:
raw = f.read()
rmap = {}
msgid = None
for match in re.finditer(r'^(msgid|msgstr)\s+"(.*?)"', raw, re.M):
if match.group(1) == 'msgid':
msgid = match.group(2)
else:
msgstr = match.group(2)
if not msgstr:
continue
omsgid = rmap.get(msgstr, None)
if omsgid is not None:
cm = langnames_to_langcodes([omsgid, msgid])
if cm[msgid] and cm[omsgid] and cm[msgid] != cm[omsgid]:
self.iso639_errors.append('In file %s the name %s is used as translation for both %s and %s' % (
os.path.basename(path), msgstr, msgid, rmap[msgstr]))
# raise SystemExit(1)
rmap[msgstr] = msgid
def freeze_locales(self): def freeze_locales(self):
zf = self.DEST + '.zip' zf = self.DEST + '.zip'
from calibre import CurrentDir from calibre import CurrentDir
@ -191,7 +218,6 @@ class Translations(POT): # {{{
locale = self.mo_file(f)[0] locale = self.mo_file(f)[0]
stats[locale] = min(1.0, float(trans)/total) stats[locale] = min(1.0, float(trans)/total)
import cPickle import cPickle
cPickle.dump(stats, open(dest, 'wb'), -1) cPickle.dump(stats, open(dest, 'wb'), -1)
@ -211,7 +237,7 @@ class Translations(POT): # {{{
# }}} # }}}
class GetTranslations(Translations): # {{{ class GetTranslations(Translations): # {{{
description = 'Get updated translations from Launchpad' description = 'Get updated translations from Launchpad'
BRANCH = 'lp:~kovid/calibre/translations' BRANCH = 'lp:~kovid/calibre/translations'
@ -286,7 +312,7 @@ class GetTranslations(Translations): # {{{
# }}} # }}}
class ISO639(Command): # {{{ class ISO639(Command): # {{{
description = 'Compile translations for ISO 639 codes' description = 'Compile translations for ISO 639 codes'
DEST = os.path.join(os.path.dirname(POT.SRC), 'resources', 'localization', DEST = os.path.join(os.path.dirname(POT.SRC), 'resources', 'localization',
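Note on the `check_iso639` change above: the new check flags a translated name (`msgstr`) that is reused for two different source names (`msgid`) in one ISO 639 `.po` file, such as the "Adangme" entries corrected earlier in this commit. The following is a minimal standalone sketch of that idea, not the calibre implementation. It omits the `langnames_to_langcodes` lookup the commit uses to ignore names that map to the same language code, and the helper name `find_duplicate_translations` is hypothetical.

```python
import re

# Same line-oriented pattern as the commit: one msgid or msgstr per line.
ENTRY_RE = re.compile(r'^(msgid|msgstr)\s+"(.*?)"', re.M)


def find_duplicate_translations(po_text):
    """Return (msgstr, first_msgid, later_msgid) for every reused translation."""
    seen = {}          # msgstr -> msgid that first used it
    msgid = None
    duplicates = []
    for kind, value in ENTRY_RE.findall(po_text):
        if kind == 'msgid':
            msgid = value
            continue
        # Skip untranslated entries (empty msgstr) and a stray msgstr.
        if not value or msgid is None:
            continue
        first = seen.get(value)
        if first is not None and first != msgid:
            duplicates.append((value, first, msgid))
        seen.setdefault(value, msgid)
    return duplicates


sample = (
    'msgid "Adangme"\nmsgstr "Adangme"\n\n'
    'msgid "Adabe"\nmsgstr "Adangme"\n\n'
    'msgid "Maltese"\nmsgstr ""\n'
)
for msgstr, first, later in find_duplicate_translations(sample):
    print('%s is used as translation for both %s and %s' % (msgstr, first, later))
```

Unlike the commit, which overwrites the map entry on every hit, this sketch keeps the first `msgid` seen for each translation. Either choice reports the same conflicts for a pair of entries.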

View File

@ -4,7 +4,7 @@ __license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net' __copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'
__appname__ = u'calibre' __appname__ = u'calibre'
numeric_version = (0, 9, 29) numeric_version = (0, 9, 31)
__version__ = u'.'.join(map(unicode, numeric_version)) __version__ = u'.'.join(map(unicode, numeric_version))
__author__ = u"Kovid Goyal <kovid@kovidgoyal.net>" __author__ = u"Kovid Goyal <kovid@kovidgoyal.net>"
@ -66,10 +66,8 @@ else:
filesystem_encoding = 'utf-8' filesystem_encoding = 'utf-8'
# On linux, unicode arguments to os file functions are coerced to an ascii # On linux, unicode arguments to os file functions are coerced to an ascii
# bytestring if sys.getfilesystemencoding() == 'ascii', which is # bytestring if sys.getfilesystemencoding() == 'ascii', which is
# just plain dumb. So issue a warning. # just plain dumb. This is fixed by the icu.py module which, when
print ('WARNING: You do not have the LANG environment variable set correctly. ' # imported changes ascii to utf-8
'This will cause problems with non-ascii filenames. '
'Set it to something like en_US.UTF-8.\n')
except: except:
filesystem_encoding = 'utf-8' filesystem_encoding = 'utf-8'

View File

@ -1476,6 +1476,7 @@ class StoreKoobeStore(StoreBase):
drm_free_only = True drm_free_only = True
headquarters = 'PL' headquarters = 'PL'
formats = ['EPUB', 'MOBI', 'PDF'] formats = ['EPUB', 'MOBI', 'PDF']
affiliate = True
class StoreLegimiStore(StoreBase): class StoreLegimiStore(StoreBase):
name = 'Legimi' name = 'Legimi'
@ -1548,12 +1549,13 @@ class StoreNextoStore(StoreBase):
class StoreNookUKStore(StoreBase): class StoreNookUKStore(StoreBase):
name = 'Nook UK' name = 'Nook UK'
author = 'John Schember' author = 'Charles Haley'
description = u'Barnes & Noble S.à r.l, a subsidiary of Barnes & Noble, Inc., a leading retailer of content, digital media and educational products, is proud to bring the award-winning NOOK® reading experience and a leading digital bookstore to the UK.' # noqa description = u'Barnes & Noble S.A.R.L, a subsidiary of Barnes & Noble, Inc., a leading retailer of content, digital media and educational products, is proud to bring the award-winning NOOK reading experience and a leading digital bookstore to the UK.' # noqa
actual_plugin = 'calibre.gui2.store.stores.nook_uk_plugin:NookUKStore' actual_plugin = 'calibre.gui2.store.stores.nook_uk_plugin:NookUKStore'
headquarters = 'UK' headquarters = 'UK'
formats = ['NOOK'] formats = ['NOOK']
affiliate = True
class StoreOpenBooksStore(StoreBase): class StoreOpenBooksStore(StoreBase):
name = 'Open Books' name = 'Open Books'
@ -1659,6 +1661,7 @@ class StoreWoblinkStore(StoreBase):
headquarters = 'PL' headquarters = 'PL'
formats = ['EPUB', 'MOBI', 'PDF', 'WOBLINK'] formats = ['EPUB', 'MOBI', 'PDF', 'WOBLINK']
affiliate = True
class XinXiiStore(StoreBase): class XinXiiStore(StoreBase):
name = 'XinXii' name = 'XinXii'

View File

@ -25,7 +25,7 @@ class ANDROID(USBMS):
VENDOR_ID = { VENDOR_ID = {
# HTC # HTC
0x0bb4 : { 0xc02 : HTC_BCDS, 0x0bb4 : {0xc02 : HTC_BCDS,
0xc01 : HTC_BCDS, 0xc01 : HTC_BCDS,
0xff9 : HTC_BCDS, 0xff9 : HTC_BCDS,
0xc86 : HTC_BCDS, 0xc86 : HTC_BCDS,
@ -52,13 +52,13 @@ class ANDROID(USBMS):
}, },
# Eken # Eken
0x040d : { 0x8510 : [0x0001], 0x0851 : [0x1] }, 0x040d : {0x8510 : [0x0001], 0x0851 : [0x1]},
# Trekstor # Trekstor
0x1e68 : { 0x006a : [0x0231] }, 0x1e68 : {0x006a : [0x0231]},
# Motorola # Motorola
0x22b8 : { 0x41d9 : [0x216], 0x2d61 : [0x100], 0x2d67 : [0x100], 0x22b8 : {0x41d9 : [0x216], 0x2d61 : [0x100], 0x2d67 : [0x100],
0x2de8 : [0x229], 0x2de8 : [0x229],
0x41db : [0x216], 0x4285 : [0x216], 0x42a3 : [0x216], 0x41db : [0x216], 0x4285 : [0x216], 0x42a3 : [0x216],
0x4286 : [0x216], 0x42b3 : [0x216], 0x42b4 : [0x216], 0x4286 : [0x216], 0x42b3 : [0x216], 0x42b4 : [0x216],
@ -111,7 +111,7 @@ class ANDROID(USBMS):
}, },
# Samsung # Samsung
0x04e8 : { 0x681d : [0x0222, 0x0223, 0x0224, 0x0400], 0x04e8 : {0x681d : [0x0222, 0x0223, 0x0224, 0x0400],
0x681c : [0x0222, 0x0223, 0x0224, 0x0400], 0x681c : [0x0222, 0x0223, 0x0224, 0x0400],
0x6640 : [0x0100], 0x6640 : [0x0100],
0x685b : [0x0400, 0x0226], 0x685b : [0x0400, 0x0226],
@ -130,7 +130,7 @@ class ANDROID(USBMS):
0xc001 : [0x0226], 0xc001 : [0x0226],
0xc004 : [0x0226], 0xc004 : [0x0226],
0x8801 : [0x0226, 0x0227], 0x8801 : [0x0226, 0x0227],
0xe115 : [0x0216], # PocketBook A10 0xe115 : [0x0216], # PocketBook A10
}, },
# Another Viewsonic # Another Viewsonic
@ -139,10 +139,10 @@ class ANDROID(USBMS):
}, },
# Acer # Acer
0x502 : { 0x3203 : [0x0100, 0x224]}, 0x502 : {0x3203 : [0x0100, 0x224]},
# Dell # Dell
0x413c : { 0xb007 : [0x0100, 0x0224, 0x0226]}, 0x413c : {0xb007 : [0x0100, 0x0224, 0x0226]},
# LG # LG
0x1004 : { 0x1004 : {
@ -166,25 +166,25 @@ class ANDROID(USBMS):
# Huawei # Huawei
# Disabled as this USB id is used by various USB flash drives # Disabled as this USB id is used by various USB flash drives
#0x45e : { 0x00e1 : [0x007], }, # 0x45e : { 0x00e1 : [0x007], },
# T-Mobile # T-Mobile
0x0408 : { 0x03ba : [0x0109], }, 0x0408 : {0x03ba : [0x0109], },
# Xperia # Xperia
0x13d3 : { 0x3304 : [0x0001, 0x0002] }, 0x13d3 : {0x3304 : [0x0001, 0x0002]},
# CREEL?? Also Nextbook and Wayteq # CREEL?? Also Nextbook and Wayteq
0x5e3 : { 0x726 : [0x222] }, 0x5e3 : {0x726 : [0x222]},
# ZTE # ZTE
0x19d2 : { 0x1353 : [0x226], 0x1351 : [0x227] }, 0x19d2 : {0x1353 : [0x226], 0x1351 : [0x227]},
# Advent # Advent
0x0955 : { 0x7100 : [0x9999] }, # This is the same as the Notion Ink Adam 0x0955 : {0x7100 : [0x9999]}, # This is the same as the Notion Ink Adam
# Kobo # Kobo
0x2237: { 0x2208 : [0x0226] }, 0x2237: {0x2208 : [0x0226]},
# Lenovo # Lenovo
0x17ef : { 0x17ef : {
@ -193,10 +193,10 @@ class ANDROID(USBMS):
}, },
# Pantech # Pantech
0x10a9 : { 0x6050 : [0x227] }, 0x10a9 : {0x6050 : [0x227]},
# Prestigio and Teclast # Prestigio and Teclast
0x2207 : { 0 : [0x222], 0x10 : [0x222] }, 0x2207 : {0 : [0x222], 0x10 : [0x222]},
} }
EBOOK_DIR_MAIN = ['eBooks/import', 'wordplayer/calibretransfer', 'Books', EBOOK_DIR_MAIN = ['eBooks/import', 'wordplayer/calibretransfer', 'Books',
@ -219,7 +219,7 @@ class ANDROID(USBMS):
'POCKET', 'ONDA_MID', 'ZENITHIN', 'INGENIC', 'PMID701C', 'PD', 'POCKET', 'ONDA_MID', 'ZENITHIN', 'INGENIC', 'PMID701C', 'PD',
'PMP5097C', 'MASS', 'NOVO7', 'ZEKI', 'COBY', 'SXZ', 'USB_2.0', 'PMP5097C', 'MASS', 'NOVO7', 'ZEKI', 'COBY', 'SXZ', 'USB_2.0',
'COBY_MID', 'VS', 'AINOL', 'TOPWISE', 'PAD703', 'NEXT8D12', 'COBY_MID', 'VS', 'AINOL', 'TOPWISE', 'PAD703', 'NEXT8D12',
'MEDIATEK', 'KEENHI', 'TECLAST', 'SURFTAB'] 'MEDIATEK', 'KEENHI', 'TECLAST', 'SURFTAB', 'XENTA',]
WINDOWS_MAIN_MEM = ['ANDROID_PHONE', 'A855', 'A853', 'A953', 'INC.NEXUS_ONE', WINDOWS_MAIN_MEM = ['ANDROID_PHONE', 'A855', 'A853', 'A953', 'INC.NEXUS_ONE',
'__UMS_COMPOSITE', '_MB200', 'MASS_STORAGE', '_-_CARD', 'SGH-I897', '__UMS_COMPOSITE', '_MB200', 'MASS_STORAGE', '_-_CARD', 'SGH-I897',
'GT-I9000', 'FILE-STOR_GADGET', 'SGH-T959_CARD', 'SGH-T959', 'SAMSUNG_ANDROID', 'GT-I9000', 'FILE-STOR_GADGET', 'SGH-T959_CARD', 'SGH-T959', 'SAMSUNG_ANDROID',
@ -240,7 +240,9 @@ class ANDROID(USBMS):
'ADVANCED', 'SGH-I727', 'USB_FLASH_DRIVER', 'ANDROID', 'ADVANCED', 'SGH-I727', 'USB_FLASH_DRIVER', 'ANDROID',
'S5830I_CARD', 'MID7042', 'LINK-CREATE', '7035', 'VIEWPAD_7E', 'S5830I_CARD', 'MID7042', 'LINK-CREATE', '7035', 'VIEWPAD_7E',
'NOVO7', 'MB526', '_USB#WYK7MSF8KE', 'TABLET_PC', 'F', 'MT65XX_MS', 'NOVO7', 'MB526', '_USB#WYK7MSF8KE', 'TABLET_PC', 'F', 'MT65XX_MS',
'ICS', 'E400', '__FILE-STOR_GADG', 'ST80208-1', 'GT-S5660M_CARD', 'XT894'] 'ICS', 'E400', '__FILE-STOR_GADG', 'ST80208-1', 'GT-S5660M_CARD', 'XT894', '_USB',
'PROD_TAB13-201',
]
WINDOWS_CARD_A_MEM = ['ANDROID_PHONE', 'GT-I9000_CARD', 'SGH-I897', WINDOWS_CARD_A_MEM = ['ANDROID_PHONE', 'GT-I9000_CARD', 'SGH-I897',
'FILE-STOR_GADGET', 'SGH-T959_CARD', 'SGH-T959', 'SAMSUNG_ANDROID', 'GT-P1000_CARD', 'FILE-STOR_GADGET', 'SGH-T959_CARD', 'SGH-T959', 'SAMSUNG_ANDROID', 'GT-P1000_CARD',
'A70S', 'A101IT', '7', 'INCREDIBLE', 'A7EB', 'SGH-T849_CARD', 'A70S', 'A101IT', '7', 'INCREDIBLE', 'A7EB', 'SGH-T849_CARD',
@ -251,7 +253,9 @@ class ANDROID(USBMS):
'FILE-CD_GADGET', 'GT-I9001_CARD', 'USB_2.0', 'XT875', 'FILE-CD_GADGET', 'GT-I9001_CARD', 'USB_2.0', 'XT875',
'UMS_COMPOSITE', 'PRO', '.KOBO_VOX', 'SGH-T989_CARD', 'SGH-I727', 'UMS_COMPOSITE', 'PRO', '.KOBO_VOX', 'SGH-T989_CARD', 'SGH-I727',
'USB_FLASH_DRIVER', 'ANDROID', 'MID7042', '7035', 'VIEWPAD_7E', 'USB_FLASH_DRIVER', 'ANDROID', 'MID7042', '7035', 'VIEWPAD_7E',
'NOVO7', 'ADVANCED', 'TABLET_PC', 'F', 'E400_SD_CARD', 'ST80208-1', 'XT894'] 'NOVO7', 'ADVANCED', 'TABLET_PC', 'F', 'E400_SD_CARD', 'ST80208-1', 'XT894',
'_USB', 'PROD_TAB13-201',
]
OSX_MAIN_MEM = 'Android Device Main Memory' OSX_MAIN_MEM = 'Android Device Main Memory'
@ -366,7 +370,6 @@ class WEBOS(USBMS):
except ImportError: except ImportError:
import Image, ImageDraw import Image, ImageDraw
coverdata = getattr(metadata, 'thumbnail', None) coverdata = getattr(metadata, 'thumbnail', None)
if coverdata and coverdata[2]: if coverdata and coverdata[2]:
cover = Image.open(cStringIO.StringIO(coverdata[2])) cover = Image.open(cStringIO.StringIO(coverdata[2]))
@ -415,3 +418,4 @@ class WEBOS(USBMS):
coverfile.write(coverdata) coverfile.write(coverdata)

File diff suppressed because it is too large

View File

@ -19,10 +19,10 @@ class BLACKBERRY(USBMS):
VENDOR_ID = [0x0fca] VENDOR_ID = [0x0fca]
PRODUCT_ID = [0x8004, 0x0004] PRODUCT_ID = [0x8004, 0x0004]
BCD = [0x0200, 0x0107, 0x0210, 0x0201, 0x0211, 0x0220] BCD = [0x0200, 0x0107, 0x0210, 0x0201, 0x0211, 0x0220, 0x232]
VENDOR_NAME = 'RIM' VENDOR_NAME = 'RIM'
WINDOWS_MAIN_MEM = 'BLACKBERRY_SD' WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = ['BLACKBERRY_SD', 'BLACKBERRY']
MAIN_MEMORY_VOLUME_LABEL = 'Blackberry SD Card' MAIN_MEMORY_VOLUME_LABEL = 'Blackberry SD Card'

View File

@ -279,11 +279,11 @@ class POCKETBOOK602(USBMS):
class POCKETBOOK622(POCKETBOOK602): class POCKETBOOK622(POCKETBOOK602):
name = 'PocketBook 622 Device Interface' name = 'PocketBook 622 Device Interface'
description = _('Communicate with the PocketBook 622 reader.') description = _('Communicate with the PocketBook 622 and 623 readers.')
EBOOK_DIR_MAIN = '' EBOOK_DIR_MAIN = ''
VENDOR_ID = [0x0489] VENDOR_ID = [0x0489]
PRODUCT_ID = [0xe107] PRODUCT_ID = [0xe107, 0xcff1]
BCD = [0x0326] BCD = [0x0326]
VENDOR_NAME = 'LINUX' VENDOR_NAME = 'LINUX'

View File

@ -0,0 +1,2 @@
__license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal <kovid at kovidgoyal.net>'

File diff suppressed because it is too large

View File

@ -0,0 +1,300 @@
#!/usr/bin/env python
from __future__ import (unicode_literals, division, absolute_import,
print_function)
"""
https://github.com/ishikawa/python-plist-parser/blob/master/plist_parser.py
`Property Lists`_ are a data representation used in Apple's Mac OS X as
a convenient way to store standard object types, such as string, number,
boolean, and container object.
This file contains a class ``XmlPropertyListParser`` for parsing
a property list file and returning a native Python data structure.
:copyright: 2008 by Takanori Ishikawa <takanori.ishikawa@gmail.com>
:license: MIT (See LICENSE file for more details)
.. _Property Lists: http://developer.apple.com/documentation/Cocoa/Conceptual/PropertyLists/
"""
class PropertyListParseError(Exception):
"""Raised when parsing a property list fails."""
pass
class XmlPropertyListParser(object):
"""
The ``XmlPropertyListParser`` class provides methods that
convert `Property Lists`_ objects from xml format.
Property list objects include ``string``, ``unicode``,
``list``, ``dict``, ``datetime``, and ``int`` or ``float``.
:copyright: 2008 by Takanori Ishikawa <takanori.ishikawa@gmail.com>
:license: MIT License
.. _Property Lists: http://developer.apple.com/documentation/Cocoa/Conceptual/PropertyLists/
"""
def _assert(self, test, message):
if not test:
raise PropertyListParseError(message)
# ------------------------------------------------
# SAX2: ContentHandler
# ------------------------------------------------
def setDocumentLocator(self, locator):
pass
def startPrefixMapping(self, prefix, uri):
pass
def endPrefixMapping(self, prefix):
pass
def startElementNS(self, name, qname, attrs):
pass
def endElementNS(self, name, qname):
pass
def ignorableWhitespace(self, whitespace):
pass
def processingInstruction(self, target, data):
pass
def skippedEntity(self, name):
pass
def startDocument(self):
self.__stack = []
self.__plist = self.__key = self.__characters = None
# For reducing runtime type checking,
# the parser caches top level object type.
self.__in_dict = False
def endDocument(self):
self._assert(self.__plist is not None, "A top level element must be <plist>.")
self._assert(
len(self.__stack) == 0,
"multiple objects at top level.")
def startElement(self, name, attributes):
if name in XmlPropertyListParser.START_CALLBACKS:
XmlPropertyListParser.START_CALLBACKS[name](self, name, attributes)
if name in XmlPropertyListParser.PARSE_CALLBACKS:
self.__characters = []
def endElement(self, name):
if name in XmlPropertyListParser.END_CALLBACKS:
XmlPropertyListParser.END_CALLBACKS[name](self, name)
if name in XmlPropertyListParser.PARSE_CALLBACKS:
# Creates character string from buffered characters.
content = ''.join(self.__characters)
# For compatibility with ``xml.etree`` and ``plistlib``,
# convert text string to ascii, if possible
try:
content = content.encode('ascii')
except (UnicodeError, AttributeError):
pass
XmlPropertyListParser.PARSE_CALLBACKS[name](self, name, content)
self.__characters = None
def characters(self, content):
if self.__characters is not None:
self.__characters.append(content)
# ------------------------------------------------
# XmlPropertyListParser private
# ------------------------------------------------
def _push_value(self, value):
if not self.__stack:
self._assert(self.__plist is None, "Multiple objects at top level")
self.__plist = value
else:
top = self.__stack[-1]
#assert isinstance(top, (dict, list))
if self.__in_dict:
k = self.__key
if k is None:
raise PropertyListParseError("Missing key for dictionary.")
top[k] = value
self.__key = None
else:
top.append(value)
def _push_stack(self, value):
self.__stack.append(value)
self.__in_dict = isinstance(value, dict)
def _pop_stack(self):
self.__stack.pop()
self.__in_dict = self.__stack and isinstance(self.__stack[-1], dict)
def _start_plist(self, name, attrs):
self._assert(not self.__stack and self.__plist is None, "<plist> more than once.")
self._assert(attrs.get('version', '1.0') == '1.0',
"only version 1.0 is supported, but was '%s'." % attrs.get('version'))
def _start_array(self, name, attrs):
v = list()
self._push_value(v)
self._push_stack(v)
def _start_dict(self, name, attrs):
v = dict()
self._push_value(v)
self._push_stack(v)
def _end_array(self, name):
self._pop_stack()
def _end_dict(self, name):
if self.__key is not None:
raise PropertyListParseError("Missing value for key '%s'" % self.__key)
self._pop_stack()
def _start_true(self, name, attrs):
self._push_value(True)
def _start_false(self, name, attrs):
self._push_value(False)
def _parse_key(self, name, content):
if not self.__in_dict:
print("XmlPropertyListParser() WARNING: ignoring <key>%s</key> (<key> elements must be contained in <dict> element)" % content)
#raise PropertyListParseError("<key> element '%s' must be in <dict> element." % content)
else:
self.__key = content
def _parse_string(self, name, content):
self._push_value(content)
def _parse_data(self, name, content):
import base64
self._push_value(base64.b64decode(content))
# http://www.apple.com/DTDs/PropertyList-1.0.dtd says:
#
# Contents should conform to a subset of ISO 8601
# (in particular, YYYY '-' MM '-' DD 'T' HH ':' MM ':' SS 'Z'.
# Smaller units may be omitted with a loss of precision)
import re
DATETIME_PATTERN = re.compile(r"(?P<year>\d\d\d\d)(?:-(?P<month>\d\d)(?:-(?P<day>\d\d)(?:T(?P<hour>\d\d)(?::(?P<minute>\d\d)(?::(?P<second>\d\d))?)?)?)?)?Z$")
def _parse_date(self, name, content):
import datetime
units = ('year', 'month', 'day', 'hour', 'minute', 'second', )
pattern = XmlPropertyListParser.DATETIME_PATTERN
match = pattern.match(content)
if not match:
raise PropertyListParseError("Failed to parse datetime '%s'" % content)
groups, components = match.groupdict(), []
for key in units:
value = groups[key]
if value is None:
break
components.append(int(value))
while len(components) < 3:
components.append(1)
d = datetime.datetime(*components)
self._push_value(d)
def _parse_real(self, name, content):
self._push_value(float(content))
def _parse_integer(self, name, content):
self._push_value(int(content))
START_CALLBACKS = {
'plist': _start_plist,
'array': _start_array,
'dict': _start_dict,
'true': _start_true,
'false': _start_false,
}
END_CALLBACKS = {
'array': _end_array,
'dict': _end_dict,
}
PARSE_CALLBACKS = {
'key': _parse_key,
'string': _parse_string,
'data': _parse_data,
'date': _parse_date,
'real': _parse_real,
'integer': _parse_integer,
}
# ------------------------------------------------
# XmlPropertyListParser
# ------------------------------------------------
def _to_stream(self, io_or_string):
if isinstance(io_or_string, basestring):
# Creates a string stream for in-memory contents.
from cStringIO import StringIO
return StringIO(io_or_string)
elif hasattr(io_or_string, 'read') and callable(getattr(io_or_string, 'read')):
return io_or_string
else:
raise TypeError('Can\'t convert %s to file-like-object' % type(io_or_string))
def _parse_using_etree(self, xml_input):
from xml.etree.cElementTree import iterparse
parser = iterparse(self._to_stream(xml_input), events=(b'start', b'end'))
self.startDocument()
try:
for action, element in parser:
name = element.tag
if action == 'start':
if name in XmlPropertyListParser.START_CALLBACKS:
XmlPropertyListParser.START_CALLBACKS[name](self, element.tag, element.attrib)
elif action == 'end':
if name in XmlPropertyListParser.END_CALLBACKS:
XmlPropertyListParser.END_CALLBACKS[name](self, name)
if name in XmlPropertyListParser.PARSE_CALLBACKS:
XmlPropertyListParser.PARSE_CALLBACKS[name](self, name, element.text or "")
element.clear()
except SyntaxError, e:
raise PropertyListParseError(e)
self.endDocument()
return self.__plist
def _parse_using_sax_parser(self, xml_input):
from xml.sax import make_parser, xmlreader, SAXParseException
source = xmlreader.InputSource()
source.setByteStream(self._to_stream(xml_input))
reader = make_parser()
reader.setContentHandler(self)
try:
reader.parse(source)
except SAXParseException, e:
raise PropertyListParseError(e)
return self.__plist
def parse(self, xml_input):
"""
Parse the property list (`.plist`, `.xml`, for example) ``xml_input``,
which can be either a string or a file-like object.
>>> parser = XmlPropertyListParser()
>>> parser.parse(r'<plist version="1.0">'
... r'<dict><key>Python</key><string>.py</string></dict>'
... r'</plist>')
{'Python': '.py'}
"""
try:
return self._parse_using_etree(xml_input)
except ImportError:
# No xml.etree.cElementTree found.
return self._parse_using_sax_parser(xml_input)
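Note on `_parse_date` in the file above: the DTD comment says that smaller units of the ISO 8601 date may be omitted, which is why `DATETIME_PATTERN` nests its optional groups and why missing month and day values are padded with 1. The following standalone sketch reproduces that logic outside the class so it can be tried directly. The function name `parse_plist_date` is hypothetical, and it raises `ValueError` where the parser raises `PropertyListParseError`.

```python
import datetime
import re

# Each smaller unit is optional, but only if every larger unit is present.
DATETIME_PATTERN = re.compile(
    r"(?P<year>\d\d\d\d)(?:-(?P<month>\d\d)(?:-(?P<day>\d\d)"
    r"(?:T(?P<hour>\d\d)(?::(?P<minute>\d\d)(?::(?P<second>\d\d))?)?)?)?)?Z$")

UNITS = ('year', 'month', 'day', 'hour', 'minute', 'second')


def parse_plist_date(content):
    """Convert a plist <date> string into a datetime.datetime."""
    match = DATETIME_PATTERN.match(content)
    if not match:
        raise ValueError("Failed to parse datetime '%s'" % content)
    components = []
    for unit in UNITS:
        value = match.group(unit)
        if value is None:
            break  # all smaller units are absent as well
        components.append(int(value))
    # datetime() needs at least year, month, day; default month/day to 1.
    while len(components) < 3:
        components.append(1)
    return datetime.datetime(*components)


print(parse_plist_date("2013-05-23T23:44:42Z"))
print(parse_plist_date("2013Z"))
```

The trailing `Z` is required by the pattern, so timestamps without it are rejected rather than guessed at.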

View File

@ -107,6 +107,12 @@ class DevicePlugin(Plugin):
#: :meth:`set_user_blacklisted_devices` #: :meth:`set_user_blacklisted_devices`
ASK_TO_ALLOW_CONNECT = False ASK_TO_ALLOW_CONNECT = False
#: Set this to a dictionary of the form {'title':title, 'msg':msg, 'det_msg':detailed_msg} to have calibre pop up
#: a message to the user after some callbacks are run (currently only upload_books).
#: Be careful to not spam the user with too many messages. This variable is checked after *every* callback,
#: so only set it when you really need to.
user_feedback_after_callback = None
@classmethod @classmethod
def get_gui_name(cls): def get_gui_name(cls):
if hasattr(cls, 'gui_name'): if hasattr(cls, 'gui_name'):
@ -157,16 +163,15 @@ class DevicePlugin(Plugin):
if (vid in device_id or vidd in device_id) and \ if (vid in device_id or vidd in device_id) and \
(pid in device_id or pidd in device_id) and \ (pid in device_id or pidd in device_id) and \
self.test_bcd_windows(device_id, bcd): self.test_bcd_windows(device_id, bcd):
if debug: if debug:
self.print_usb_device_info(device_id) self.print_usb_device_info(device_id)
if only_presence or self.can_handle_windows(device_id, debug=debug): if only_presence or self.can_handle_windows(device_id, debug=debug):
try: try:
bcd = int(device_id.rpartition( bcd = int(device_id.rpartition(
'rev_')[-1].replace(':', 'a'), 16) 'rev_')[-1].replace(':', 'a'), 16)
except: except:
bcd = None bcd = None
return True, (vendor_id, product_id, bcd, None, return True, (vendor_id, product_id, bcd, None, None, None)
None, None)
return False, None return False, None
def test_bcd(self, bcdDevice, bcd): def test_bcd(self, bcdDevice, bcd):
@ -638,7 +643,6 @@ class DevicePlugin(Plugin):
''' '''
device_prefs.set_overrides() device_prefs.set_overrides()
# Dynamic control interface. # Dynamic control interface.
# The following methods are probably called on the GUI thread. Any driver # The following methods are probably called on the GUI thread. Any driver
# that implements these methods must take pains to be thread safe, because # that implements these methods must take pains to be thread safe, because

View File

@ -35,7 +35,7 @@ class KOBO(USBMS):
gui_name = 'Kobo Reader' gui_name = 'Kobo Reader'
description = _('Communicate with the Kobo Reader') description = _('Communicate with the Kobo Reader')
author = 'Timothy Legge and David Forrester' author = 'Timothy Legge and David Forrester'
version = (2, 0, 9) version = (2, 0, 10)
dbversion = 0 dbversion = 0
fwversion = 0 fwversion = 0
@ -45,6 +45,7 @@ class KOBO(USBMS):
supported_platforms = ['windows', 'osx', 'linux'] supported_platforms = ['windows', 'osx', 'linux']
booklist_class = CollectionsBookList booklist_class = CollectionsBookList
book_class = Book
# Ordered list of supported formats # Ordered list of supported formats
FORMATS = ['epub', 'pdf', 'txt', 'cbz', 'cbr'] FORMATS = ['epub', 'pdf', 'txt', 'cbz', 'cbr']
@ -115,7 +116,6 @@ class KOBO(USBMS):
def initialize(self): def initialize(self):
USBMS.initialize(self) USBMS.initialize(self)
self.book_class = Book
self.dbversion = 7 self.dbversion = 7
def books(self, oncard=None, end_session=True): def books(self, oncard=None, end_session=True):
@ -1213,7 +1213,7 @@ class KOBOTOUCH(KOBO):
min_dbversion_archive = 71 min_dbversion_archive = 71
min_dbversion_images_on_sdcard = 77 min_dbversion_images_on_sdcard = 77
max_supported_fwversion = (2,5,1) max_supported_fwversion = (2,5,3)
min_fwversion_images_on_sdcard = (2,4,1) min_fwversion_images_on_sdcard = (2,4,1)
has_kepubs = True has_kepubs = True
@ -1237,11 +1237,9 @@ class KOBOTOUCH(KOBO):
_('Keep cover aspect ratio') + _('Keep cover aspect ratio') +
':::'+_('When uploading covers, do not change the aspect ratio when resizing for the device.' ':::'+_('When uploading covers, do not change the aspect ratio when resizing for the device.'
' This is for firmware versions 2.3.1 and later.'), ' This is for firmware versions 2.3.1 and later.'),
_('Show expired books') + _('Show archived books') +
':::'+_('A bug in an earlier version left non kepubs book records' ':::'+_('Archived books are listed on the device but need to be downloaded to read.'
' in the database. With this option Calibre will show the ' ' Use this option to show these books and match them with books in the calibre library.'),
'expired records and allow you to delete them with '
'the new delete logic.'),
_('Show Previews') + _('Show Previews') +
':::'+_('Kobo previews are included on the Touch and some other versions' ':::'+_('Kobo previews are included on the Touch and some other versions'
' by default they are no longer displayed as there is no good reason to ' ' by default they are no longer displayed as there is no good reason to '
@ -1289,7 +1287,7 @@ class KOBOTOUCH(KOBO):
OPT_UPLOAD_COVERS = 3 OPT_UPLOAD_COVERS = 3
OPT_UPLOAD_GRAYSCALE_COVERS = 4 OPT_UPLOAD_GRAYSCALE_COVERS = 4
OPT_KEEP_COVER_ASPECT_RATIO = 5 OPT_KEEP_COVER_ASPECT_RATIO = 5
OPT_SHOW_EXPIRED_BOOK_RECORDS = 6 OPT_SHOW_ARCHIVED_BOOK_RECORDS = 6
OPT_SHOW_PREVIEWS = 7 OPT_SHOW_PREVIEWS = 7
OPT_SHOW_RECOMMENDATIONS = 8 OPT_SHOW_RECOMMENDATIONS = 8
OPT_UPDATE_SERIES_DETAILS = 9 OPT_UPDATE_SERIES_DETAILS = 9
@ -1347,6 +1345,10 @@ class KOBOTOUCH(KOBO):
self.set_device_name() self.set_device_name()
return super(KOBOTOUCH, self).get_device_information(end_session) return super(KOBOTOUCH, self).get_device_information(end_session)
def device_database_path(self):
return self.normalize_path(self._main_prefix + '.kobo/KoboReader.sqlite')
def books(self, oncard=None, end_session=True): def books(self, oncard=None, end_session=True):
debug_print("KoboTouch:books - oncard='%s'"%oncard) debug_print("KoboTouch:books - oncard='%s'"%oncard)
from calibre.ebooks.metadata.meta import path_to_ext from calibre.ebooks.metadata.meta import path_to_ext
@ -1599,9 +1601,7 @@ class KOBOTOUCH(KOBO):
self.debug_index = 0 self.debug_index = 0
import sqlite3 as sqlite import sqlite3 as sqlite
with closing(sqlite.connect( with closing(sqlite.connect(self.device_database_path())) as connection:
self.normalize_path(self._main_prefix +
'.kobo/KoboReader.sqlite'))) as connection:
debug_print("KoboTouch:books - reading device database") debug_print("KoboTouch:books - reading device database")
# return bytestrings if the content cannot be decoded as unicode # return bytestrings if the content cannot be decoded as unicode
@ -1618,7 +1618,21 @@ class KOBOTOUCH(KOBO):
debug_print("KoboTouch:books - shelf list:", self.bookshelvelist) debug_print("KoboTouch:books - shelf list:", self.bookshelvelist)
opts = self.settings() opts = self.settings()
if self.supports_series(): if self.supports_kobo_archive():
query= ("select Title, Attribution, DateCreated, ContentID, MimeType, ContentType, " \
"ImageID, ReadStatus, ___ExpirationStatus, FavouritesIndex, Accessibility, " \
"IsDownloaded, Series, SeriesNumber, ___UserID " \
" from content " \
" where BookID is Null " \
" and ((Accessibility = -1 and IsDownloaded in ('true', 1 )) or (Accessibility in (1,2) %(expiry)s) " \
" %(previews)s %(recomendations)s )" \
" and not ((___ExpirationStatus=3 or ___ExpirationStatus is Null) and ContentType = 6)") % \
dict(\
expiry="" if opts.extra_customization[self.OPT_SHOW_ARCHIVED_BOOK_RECORDS] else "and IsDownloaded in ('true', 1)", \
previews=" or (Accessibility in (6) and ___UserID <> '')" if opts.extra_customization[self.OPT_SHOW_PREVIEWS] else "", \
recomendations=" or (Accessibility in (-1, 4, 6) and ___UserId = '')" if opts.extra_customization[self.OPT_SHOW_RECOMMENDATIONS] else "" \
)
elif self.supports_series():
query= ("select Title, Attribution, DateCreated, ContentID, MimeType, ContentType, " \ query= ("select Title, Attribution, DateCreated, ContentID, MimeType, ContentType, " \
"ImageID, ReadStatus, ___ExpirationStatus, FavouritesIndex, Accessibility, " \ "ImageID, ReadStatus, ___ExpirationStatus, FavouritesIndex, Accessibility, " \
"IsDownloaded, Series, SeriesNumber, ___UserID " \ "IsDownloaded, Series, SeriesNumber, ___UserID " \
@ -1627,7 +1641,7 @@ class KOBOTOUCH(KOBO):
" and ((Accessibility = -1 and IsDownloaded in ('true', 1)) or (Accessibility in (1,2)) %(previews)s %(recomendations)s )" \ " and ((Accessibility = -1 and IsDownloaded in ('true', 1)) or (Accessibility in (1,2)) %(previews)s %(recomendations)s )" \
" and not ((___ExpirationStatus=3 or ___ExpirationStatus is Null) %(expiry)s") % \ " and not ((___ExpirationStatus=3 or ___ExpirationStatus is Null) %(expiry)s") % \
dict(\ dict(\
expiry=" and ContentType = 6)" if opts.extra_customization[self.OPT_SHOW_EXPIRED_BOOK_RECORDS] else ")", \ expiry=" and ContentType = 6)" if opts.extra_customization[self.OPT_SHOW_ARCHIVED_BOOK_RECORDS] else ")", \
previews=" or (Accessibility in (6) and ___UserID <> '')" if opts.extra_customization[self.OPT_SHOW_PREVIEWS] else "", \ previews=" or (Accessibility in (6) and ___UserID <> '')" if opts.extra_customization[self.OPT_SHOW_PREVIEWS] else "", \
recomendations=" or (Accessibility in (-1, 4, 6) and ___UserId = '')" if opts.extra_customization[self.OPT_SHOW_RECOMMENDATIONS] else "" \ recomendations=" or (Accessibility in (-1, 4, 6) and ___UserId = '')" if opts.extra_customization[self.OPT_SHOW_RECOMMENDATIONS] else "" \
) )
@ -1638,7 +1652,7 @@ class KOBOTOUCH(KOBO):
' from content ' \ ' from content ' \
' where BookID is Null %(previews)s %(recomendations)s and not ((___ExpirationStatus=3 or ___ExpirationStatus is Null) %(expiry)s') % \ ' where BookID is Null %(previews)s %(recomendations)s and not ((___ExpirationStatus=3 or ___ExpirationStatus is Null) %(expiry)s') % \
dict(\ dict(\
expiry=' and ContentType = 6)' if opts.extra_customization[self.OPT_SHOW_EXPIRED_BOOK_RECORDS] else ')', \ expiry=' and ContentType = 6)' if opts.extra_customization[self.OPT_SHOW_ARCHIVED_BOOK_RECORDS] else ')', \
previews=' and Accessibility <> 6' if opts.extra_customization[self.OPT_SHOW_PREVIEWS] == False else '', \ previews=' and Accessibility <> 6' if opts.extra_customization[self.OPT_SHOW_PREVIEWS] == False else '', \
recomendations=' and IsDownloaded in (\'true\', 1)' if opts.extra_customization[self.OPT_SHOW_RECOMMENDATIONS] == False else ''\ recomendations=' and IsDownloaded in (\'true\', 1)' if opts.extra_customization[self.OPT_SHOW_RECOMMENDATIONS] == False else ''\
) )
@ -1648,7 +1662,7 @@ class KOBOTOUCH(KOBO):
'"1" as IsDownloaded, null as Series, null as SeriesNumber, ___UserID' \ '"1" as IsDownloaded, null as Series, null as SeriesNumber, ___UserID' \
' from content where ' \ ' from content where ' \
'BookID is Null and not ((___ExpirationStatus=3 or ___ExpirationStatus is Null) %(expiry)s') % dict(expiry=' and ContentType = 6)' \ 'BookID is Null and not ((___ExpirationStatus=3 or ___ExpirationStatus is Null) %(expiry)s') % dict(expiry=' and ContentType = 6)' \
if opts.extra_customization[self.OPT_SHOW_EXPIRED_BOOK_RECORDS] else ')') if opts.extra_customization[self.OPT_SHOW_ARCHIVED_BOOK_RECORDS] else ')')
else: else:
query= 'select Title, Attribution, DateCreated, ContentID, MimeType, ContentType, ' \ query= 'select Title, Attribution, DateCreated, ContentID, MimeType, ContentType, ' \
'ImageID, ReadStatus, "-1" as ___ExpirationStatus, "-1" as FavouritesIndex, "-1" as Accessibility, ' \ 'ImageID, ReadStatus, "-1" as ___ExpirationStatus, "-1" as FavouritesIndex, "-1" as Accessibility, ' \
@ -2586,7 +2600,7 @@ class KOBOTOUCH(KOBO):
def modify_database_check(self, function): def modify_database_check(self, function):
# Checks to see whether the database version is supported # Checks to see whether the database version is supported
# and whether the user has chosen to support the firmware version # and whether the user has chosen to support the firmware version
# debug_print("KoboTouch:modify_database_check - self.fwversion <= self.max_supported_fwversion=", self.fwversion > self.max_supported_fwversion) # debug_print("KoboTouch:modify_database_check - self.fwversion > self.max_supported_fwversion=", self.fwversion > self.max_supported_fwversion)
if self.dbversion > self.supported_dbversion or self.fwversion > self.max_supported_fwversion: if self.dbversion > self.supported_dbversion or self.fwversion > self.max_supported_fwversion:
# Unsupported database # Unsupported database
opts = self.settings() opts = self.settings()
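
The KOBOTOUCH hunks above assemble the device query by substituting optional SQL fragments into a template with a dict and %-formatting. A minimal sketch of that pattern follows, assuming a cut-down `content` table; the function names, the reduced column list and the sample rows are hypothetical and are not the driver's actual query.

```python
import sqlite3


def build_content_query(show_archived, show_previews, show_recommendations):
    # Each option contributes either an SQL fragment or an empty string.
    fragments = dict(
        expiry="" if show_archived else " and IsDownloaded in ('true', 1)",
        previews=(" or (Accessibility in (6) and ___UserID <> '')"
                  if show_previews else ""),
        recommendations=(" or (Accessibility in (-1, 4, 6) and ___UserID = '')"
                         if show_recommendations else ""),
    )
    return (
        "select Title from content"
        " where BookID is Null"
        " and ((Accessibility = -1 and IsDownloaded in ('true', 1))"
        " or (Accessibility in (1,2)%(expiry)s)"
        "%(previews)s%(recommendations)s)"
    ) % fragments


def run_demo(show_archived, show_previews, show_recommendations):
    # Hypothetical in-memory stand-in for the device database.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table content "
                 "(Title, BookID, Accessibility, IsDownloaded, ___UserID)")
    conn.executemany("insert into content values (?, ?, ?, ?, ?)", [
        ("Downloaded", None, 1, "true", "u"),
        ("Archived", None, 1, "false", "u"),
        ("Preview", None, 6, "false", "u"),
        ("Recommended", None, 4, "false", ""),
    ])
    query = build_content_query(show_archived, show_previews,
                                show_recommendations)
    titles = sorted(row[0] for row in conn.execute(query))
    conn.close()
    return titles
```

With every option off only the downloaded book is returned; turning on the archived option drops the `IsDownloaded` condition for purchased books, which is the behaviour the new option text describes.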


@ -27,7 +27,7 @@ class NOOK(USBMS):
# Ordered list of supported formats # Ordered list of supported formats
FORMATS = ['epub', 'pdb', 'pdf'] FORMATS = ['epub', 'pdb', 'pdf']
VENDOR_ID = [0x2080, 0x18d1] # 0x18d1 is for softrooted nook VENDOR_ID = [0x2080, 0x18d1] # 0x18d1 is for softrooted nook
PRODUCT_ID = [0x001] PRODUCT_ID = [0x001]
BCD = [0x322] BCD = [0x322]
@ -53,7 +53,6 @@ class NOOK(USBMS):
except ImportError: except ImportError:
import Image, ImageDraw import Image, ImageDraw
coverdata = getattr(metadata, 'thumbnail', None) coverdata = getattr(metadata, 'thumbnail', None)
if coverdata and coverdata[2]: if coverdata and coverdata[2]:
cover = Image.open(cStringIO.StringIO(coverdata[2])) cover = Image.open(cStringIO.StringIO(coverdata[2]))
@ -87,12 +86,13 @@ class NOOK_COLOR(NOOK):
PRODUCT_ID = [0x002, 0x003, 0x004] PRODUCT_ID = [0x002, 0x003, 0x004]
if isosx: if isosx:
PRODUCT_ID.append(0x005) # Nook HD+ PRODUCT_ID.append(0x005) # Nook HD+
BCD = [0x216] BCD = [0x216]
WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = ['EBOOK_DISK', 'NOOK_TABLET', WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = ['EBOOK_DISK', 'NOOK_TABLET',
'NOOK_SIMPLETOUCH'] 'NOOK_SIMPLETOUCH']
EBOOK_DIR_MAIN = 'My Files' EBOOK_DIR_MAIN = 'My Files'
SCAN_FROM_ROOT = True
NEWS_IN_FOLDER = False NEWS_IN_FOLDER = False
def upload_cover(self, path, filename, metadata, filepath): def upload_cover(self, path, filename, metadata, filepath):


@ -39,8 +39,8 @@ class PRST1(USBMS):
path_sep = '/' path_sep = '/'
booklist_class = CollectionsBookList booklist_class = CollectionsBookList
FORMATS = ['epub', 'pdf', 'txt', 'book', 'zbf'] # The last two are FORMATS = ['epub', 'pdf', 'txt', 'book', 'zbf'] # The last two are
# used in japan # used in japan
CAN_SET_METADATA = ['collections'] CAN_SET_METADATA = ['collections']
CAN_DO_DEVICE_DB_PLUGBOARD = True CAN_DO_DEVICE_DB_PLUGBOARD = True
@ -50,10 +50,10 @@ class PRST1(USBMS):
VENDOR_NAME = 'SONY' VENDOR_NAME = 'SONY'
WINDOWS_MAIN_MEM = re.compile( WINDOWS_MAIN_MEM = re.compile(
r'(PRS-T(1|2)&)' r'(PRS-T(1|2|2N)&)'
) )
WINDOWS_CARD_A_MEM = re.compile( WINDOWS_CARD_A_MEM = re.compile(
r'(PRS-T(1|2)__SD&)' r'(PRS-T(1|2|2N)__SD&)'
) )
MAIN_MEMORY_VOLUME_LABEL = 'SONY Reader Main Memory' MAIN_MEMORY_VOLUME_LABEL = 'SONY Reader Main Memory'
STORAGE_CARD_VOLUME_LABEL = 'SONY Reader Storage Card' STORAGE_CARD_VOLUME_LABEL = 'SONY Reader Storage Card'
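
The widened alternation `(1|2|2N)` still matches a `PRS-T2N` identifier even though `2` is tried before `2N`, because the regex engine backtracks when the following `&` fails to match. A small sketch of that behaviour, assuming made-up device id strings (the real Windows identifiers are not shown in this diff) and a hypothetical `classify` helper:

```python
import re

# Patterns as they appear on the new side of the hunk.
WINDOWS_MAIN_MEM = re.compile(r'(PRS-T(1|2|2N)&)')
WINDOWS_CARD_A_MEM = re.compile(r'(PRS-T(1|2|2N)__SD&)')


def classify(device_id):
    # Hypothetical helper: report which storage area an id string names.
    if WINDOWS_CARD_A_MEM.search(device_id):
        return 'card'
    if WINDOWS_MAIN_MEM.search(device_id):
        return 'main'
    return None
```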
@ -66,7 +66,7 @@ class PRST1(USBMS):
EXTRA_CUSTOMIZATION_MESSAGE = [ EXTRA_CUSTOMIZATION_MESSAGE = [
_('Comma separated list of metadata fields ' _('Comma separated list of metadata fields '
'to turn into collections on the device. Possibilities include: ')+\ 'to turn into collections on the device. Possibilities include: ')+
'series, tags, authors', 'series, tags, authors',
_('Upload separate cover thumbnails for books') + _('Upload separate cover thumbnails for books') +
':::'+_('Normally, the SONY readers get the cover image from the' ':::'+_('Normally, the SONY readers get the cover image from the'
@ -194,17 +194,17 @@ class PRST1(USBMS):
time_offsets = {} time_offsets = {}
for i, row in enumerate(cursor): for i, row in enumerate(cursor):
try: try:
comp_date = int(os.path.getmtime(self.normalize_path(prefix + row[0])) * 1000); comp_date = int(os.path.getmtime(self.normalize_path(prefix + row[0])) * 1000)
except (OSError, IOError, TypeError): except (OSError, IOError, TypeError):
# In case the db has incorrect path info # In case the db has incorrect path info
continue continue
device_date = int(row[1]); device_date = int(row[1])
offset = device_date - comp_date offset = device_date - comp_date
time_offsets.setdefault(offset, 0) time_offsets.setdefault(offset, 0)
time_offsets[offset] = time_offsets[offset] + 1 time_offsets[offset] = time_offsets[offset] + 1
try: try:
device_offset = max(time_offsets,key = lambda a: time_offsets.get(a)) device_offset = max(time_offsets, key=lambda a: time_offsets.get(a))
debug_print("Device Offset: %d ms"%device_offset) debug_print("Device Offset: %d ms"%device_offset)
self.device_offset = device_offset self.device_offset = device_offset
except ValueError: except ValueError:
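
The loop above tallies the difference between the device timestamp and the file modification time for each book, then takes the most frequent difference as the device offset. A minimal sketch of the same idea using `collections.Counter`; the function name and the `(device_ms, computer_ms)` input shape are assumptions made for illustration, and unlike the driver this version returns `None` for empty input rather than catching `ValueError`.

```python
from collections import Counter


def most_common_offset(pairs):
    # pairs: iterable of (device_ms, computer_ms) timestamps in milliseconds.
    counts = Counter(device - computer for device, computer in pairs)
    if not counts:
        return None  # nothing to compare, so no offset can be inferred
    # The offset seen most often wins, as in max(time_offsets, key=...).
    return max(counts, key=counts.get)
```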
@ -213,7 +213,7 @@ class PRST1(USBMS):
for idx, book in enumerate(bl): for idx, book in enumerate(bl):
query = 'SELECT _id, thumbnail FROM books WHERE file_path = ?' query = 'SELECT _id, thumbnail FROM books WHERE file_path = ?'
t = (book.lpath,) t = (book.lpath,)
cursor.execute (query, t) cursor.execute(query, t)
for i, row in enumerate(cursor): for i, row in enumerate(cursor):
book.device_collections = bl_collections.get(row[0], None) book.device_collections = bl_collections.get(row[0], None)
@ -318,14 +318,14 @@ class PRST1(USBMS):
' any notes/highlights, etc.')%dbpath)+' Underlying error:' ' any notes/highlights, etc.')%dbpath)+' Underlying error:'
'\n'+tb) '\n'+tb)
def get_lastrowid(self, cursor): def get_lastrowid(self, cursor):
# SQLite3 + Python has a fun issue on 32-bit systems with integer overflows. # SQLite3 + Python has a fun issue on 32-bit systems with integer overflows.
# Issue a SQL query instead, getting the value as a string, and then converting to a long python int manually. # Issue a SQL query instead, getting the value as a string, and then converting to a long python int manually.
query = 'SELECT last_insert_rowid()' query = 'SELECT last_insert_rowid()'
cursor.execute(query) cursor.execute(query)
row = cursor.fetchone() row = cursor.fetchone()
return long(row[0]) return long(row[0])
def get_database_min_id(self, source_id): def get_database_min_id(self, source_id):
sequence_min = 0L sequence_min = 0L
@ -345,7 +345,7 @@ class PRST1(USBMS):
# Insert the sequence Id if it doesn't # Insert the sequence Id if it doesn't
query = ('INSERT INTO sqlite_sequence (name, seq) ' query = ('INSERT INTO sqlite_sequence (name, seq) '
'SELECT ?, ? ' 'SELECT ?, ? '
'WHERE NOT EXISTS (SELECT 1 FROM sqlite_sequence WHERE name = ?)'); 'WHERE NOT EXISTS (SELECT 1 FROM sqlite_sequence WHERE name = ?)')
cursor.execute(query, (table, sequence_id, table,)) cursor.execute(query, (table, sequence_id, table,))
cursor.close() cursor.close()
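
`get_lastrowid` in the hunk above asks SQLite for the last inserted row id with a query instead of reading `cursor.lastrowid`. A sketch of that approach against an in-memory database, written for Python 3 where `int` is already arbitrary precision (the driver itself targets Python 2 and uses `long`); the `books` table here is a hypothetical stand-in.

```python
import sqlite3


def get_lastrowid(cursor):
    # last_insert_rowid() is tracked per connection by SQLite.
    cursor.execute('SELECT last_insert_rowid()')
    row = cursor.fetchone()
    return int(row[0])


def demo():
    conn = sqlite3.connect(':memory:')
    cursor = conn.cursor()
    cursor.execute('CREATE TABLE books (_id INTEGER PRIMARY KEY, title TEXT)')
    cursor.execute('INSERT INTO books (title) VALUES (?)', ('first',))
    cursor.execute('INSERT INTO books (title) VALUES (?)', ('second',))
    rowid = get_lastrowid(cursor)
    conn.close()
    return rowid
```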


@ -875,6 +875,9 @@ class SMART_DEVICE_APP(DeviceConfig, DevicePlugin):
self.client_device_kind = result.get('deviceKind', '') self.client_device_kind = result.get('deviceKind', '')
self._debug('Client device kind', self.client_device_kind) self._debug('Client device kind', self.client_device_kind)
self.client_device_name = result.get('deviceName', self.client_device_kind)
self._debug('Client device name', self.client_device_name)
self.max_book_packet_len = result.get('maxBookContentPacketLen', self.max_book_packet_len = result.get('maxBookContentPacketLen',
self.BASE_PACKET_LEN) self.BASE_PACKET_LEN)
self._debug('max_book_packet_len', self.max_book_packet_len) self._debug('max_book_packet_len', self.max_book_packet_len)
@ -946,6 +949,8 @@ class SMART_DEVICE_APP(DeviceConfig, DevicePlugin):
return False return False
def get_gui_name(self): def get_gui_name(self):
if getattr(self, 'client_device_name', None):
return self.gui_name_template%(self.gui_name, self.client_device_name)
if getattr(self, 'client_device_kind', None): if getattr(self, 'client_device_kind', None):
return self.gui_name_template%(self.gui_name, self.client_device_kind) return self.gui_name_template%(self.gui_name, self.client_device_kind)
return self.gui_name return self.gui_name
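
The two hunks above add a device name that falls back to the device kind, and `get_gui_name` now prefers the name over the kind. A compact sketch of that fallback chain; the class, the `'%s: %s'` template and the sample values are assumptions, since the real `gui_name_template` is not shown in this diff.

```python
class DeviceSketch(object):
    gui_name = 'SmartDevice App Interface'
    gui_name_template = '%s: %s'  # assumed format, for illustration only

    def read_client_info(self, result):
        # deviceName is optional; fall back to the device kind when absent.
        self.client_device_kind = result.get('deviceKind', '')
        self.client_device_name = result.get('deviceName',
                                             self.client_device_kind)

    def get_gui_name(self):
        # Prefer the most specific label that is actually set.
        for attr in ('client_device_name', 'client_device_kind'):
            value = getattr(self, attr, None)
            if value:
                return self.gui_name_template % (self.gui_name, value)
        return self.gui_name
```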


@ -91,14 +91,15 @@ class TXTInput(InputFormatPlugin):
log.debug('Using user specified input encoding of %s' % ienc) log.debug('Using user specified input encoding of %s' % ienc)
else: else:
det_encoding = detect(txt) det_encoding = detect(txt)
det_encoding, confidence = det_encoding['encoding'], det_encoding['confidence']
if det_encoding and det_encoding.lower().replace('_', '-').strip() in ( if det_encoding and det_encoding.lower().replace('_', '-').strip() in (
'gb2312', 'chinese', 'csiso58gb231280', 'euc-cn', 'euccn', 'gb2312', 'chinese', 'csiso58gb231280', 'euc-cn', 'euccn',
'eucgb2312-cn', 'gb2312-1980', 'gb2312-80', 'iso-ir-58'): 'eucgb2312-cn', 'gb2312-1980', 'gb2312-80', 'iso-ir-58'):
# Microsoft Word exports to HTML with encoding incorrectly set to # Microsoft Word exports to HTML with encoding incorrectly set to
# gb2312 instead of gbk. gbk is a superset of gb2312, anyway. # gb2312 instead of gbk. gbk is a superset of gb2312, anyway.
det_encoding = 'gbk' det_encoding = 'gbk'
ienc = det_encoding['encoding'] ienc = det_encoding
log.debug('Detected input encoding as %s with a confidence of %s%%' % (ienc, det_encoding['confidence'] * 100)) log.debug('Detected input encoding as %s with a confidence of %s%%' % (ienc, confidence * 100))
if not ienc: if not ienc:
ienc = 'utf-8' ienc = 'utf-8'
log.debug('No input encoding specified and could not auto detect using %s' % ienc) log.debug('No input encoding specified and could not auto detect using %s' % ienc)
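
The fix above unpacks the detector result before the gb2312 check, so that replacing the encoding with `'gbk'` no longer turns the dict into a string that is later indexed. A sketch of the corrected flow, assuming a detector that returns a dict with `encoding` and `confidence` keys as the hunk does; `pick_encoding` is a hypothetical name, and no detection library is called here.

```python
GB2312_ALIASES = frozenset((
    'gb2312', 'chinese', 'csiso58gb231280', 'euc-cn', 'euccn',
    'eucgb2312-cn', 'gb2312-1980', 'gb2312-80', 'iso-ir-58'))


def pick_encoding(detected):
    # Unpack first, so later reassignment cannot clobber the confidence.
    encoding, confidence = detected['encoding'], detected['confidence']
    if encoding and encoding.lower().replace('_', '-').strip() in GB2312_ALIASES:
        encoding = 'gbk'  # gbk is a superset of gb2312
    # Fall back to utf-8 when nothing was detected.
    return encoding or 'utf-8', confidence
```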


@ -77,7 +77,7 @@ class Plumber(object):
def __init__(self, input, output, log, report_progress=DummyReporter(), def __init__(self, input, output, log, report_progress=DummyReporter(),
dummy=False, merge_plugin_recs=True, abort_after_input_dump=False, dummy=False, merge_plugin_recs=True, abort_after_input_dump=False,
override_input_metadata=False): override_input_metadata=False, for_regex_wizard=False):
''' '''
:param input: Path to input file. :param input: Path to input file.
:param output: Path to output file/directory :param output: Path to output file/directory
@ -87,6 +87,7 @@ class Plumber(object):
if isbytestring(output): if isbytestring(output):
output = output.decode(filesystem_encoding) output = output.decode(filesystem_encoding)
self.original_input_arg = input self.original_input_arg = input
self.for_regex_wizard = for_regex_wizard
self.input = os.path.abspath(input) self.input = os.path.abspath(input)
self.output = os.path.abspath(output) self.output = os.path.abspath(output)
self.log = log self.log = log
@ -123,7 +124,7 @@ OptionRecommendation(name='input_profile',
'conversion system information on how to interpret ' 'conversion system information on how to interpret '
'various information in the input document. For ' 'various information in the input document. For '
'example resolution dependent lengths (i.e. lengths in ' 'example resolution dependent lengths (i.e. lengths in '
'pixels). Choices are:')+\ 'pixels). Choices are:')+
', '.join([x.short_name for x in input_profiles()]) ', '.join([x.short_name for x in input_profiles()])
), ),
@ -135,7 +136,7 @@ OptionRecommendation(name='output_profile',
'created document for the specified device. In some cases, ' 'created document for the specified device. In some cases, '
'an output profile is required to produce documents that ' 'an output profile is required to produce documents that '
'will work on a device. For example EPUB on the SONY reader. ' 'will work on a device. For example EPUB on the SONY reader. '
'Choices are:') + \ 'Choices are:') +
', '.join([x.short_name for x in output_profiles()]) ', '.join([x.short_name for x in output_profiles()])
), ),
@ -490,7 +491,7 @@ OptionRecommendation(name='asciiize',
'cases where there are multiple representations of a character ' 'cases where there are multiple representations of a character '
'(characters shared by Chinese and Japanese for instance) the ' '(characters shared by Chinese and Japanese for instance) the '
'representation based on the current calibre interface language will be ' 'representation based on the current calibre interface language will be '
'used.')%\ 'used.')%
u'\u041c\u0438\u0445\u0430\u0438\u043b ' u'\u041c\u0438\u0445\u0430\u0438\u043b '
u'\u0413\u043e\u0440\u0431\u0430\u0447\u0451\u0432' u'\u0413\u043e\u0440\u0431\u0430\u0447\u0451\u0432'
) )
@ -711,7 +712,6 @@ OptionRecommendation(name='search_replace',
self.input_fmt = input_fmt self.input_fmt = input_fmt
self.output_fmt = output_fmt self.output_fmt = output_fmt
self.all_format_options = set() self.all_format_options = set()
self.input_options = set() self.input_options = set()
self.output_options = set() self.output_options = set()
@ -775,7 +775,7 @@ OptionRecommendation(name='search_replace',
if not html_files: if not html_files:
raise ValueError(_('Could not find an ebook inside the archive')) raise ValueError(_('Could not find an ebook inside the archive'))
html_files = [(f, os.stat(f).st_size) for f in html_files] html_files = [(f, os.stat(f).st_size) for f in html_files]
html_files.sort(cmp = lambda x, y: cmp(x[1], y[1])) html_files.sort(cmp=lambda x, y: cmp(x[1], y[1]))
html_files = [f[0] for f in html_files] html_files = [f[0] for f in html_files]
for q in ('toc', 'index'): for q in ('toc', 'index'):
for f in html_files: for f in html_files:
@ -783,8 +783,6 @@ OptionRecommendation(name='search_replace',
return f, os.path.splitext(f)[1].lower()[1:] return f, os.path.splitext(f)[1].lower()[1:]
return html_files[-1], os.path.splitext(html_files[-1])[1].lower()[1:] return html_files[-1], os.path.splitext(html_files[-1])[1].lower()[1:]
def get_option_by_name(self, name): def get_option_by_name(self, name):
for group in (self.input_options, self.pipeline_options, for group in (self.input_options, self.pipeline_options,
self.output_options, self.all_format_options): self.output_options, self.all_format_options):
@ -956,7 +954,6 @@ OptionRecommendation(name='search_replace',
self.log.info('Input debug saved to:', out_dir) self.log.info('Input debug saved to:', out_dir)
def run(self): def run(self):
''' '''
Run the conversion pipeline Run the conversion pipeline
@ -965,10 +962,12 @@ OptionRecommendation(name='search_replace',
self.setup_options() self.setup_options()
if self.opts.verbose: if self.opts.verbose:
self.log.filter_level = self.log.DEBUG self.log.filter_level = self.log.DEBUG
if self.for_regex_wizard and hasattr(self.opts, 'no_process'):
self.opts.no_process = True
self.flush() self.flush()
import cssutils, logging import cssutils, logging
cssutils.log.setLevel(logging.WARN) cssutils.log.setLevel(logging.WARN)
get_types_map() # Ensure the mimetypes module is initialized get_types_map() # Ensure the mimetypes module is initialized
if self.opts.debug_pipeline is not None: if self.opts.debug_pipeline is not None:
self.opts.verbose = max(self.opts.verbose, 4) self.opts.verbose = max(self.opts.verbose, 4)
@ -1003,6 +1002,8 @@ OptionRecommendation(name='search_replace',
self.ui_reporter(0.01, _('Converting input to HTML...')) self.ui_reporter(0.01, _('Converting input to HTML...'))
ir = CompositeProgressReporter(0.01, 0.34, self.ui_reporter) ir = CompositeProgressReporter(0.01, 0.34, self.ui_reporter)
self.input_plugin.report_progress = ir self.input_plugin.report_progress = ir
if self.for_regex_wizard:
self.input_plugin.for_viewer = True
with self.input_plugin: with self.input_plugin:
self.oeb = self.input_plugin(stream, self.opts, self.oeb = self.input_plugin(stream, self.opts,
self.input_fmt, self.log, self.input_fmt, self.log,
@ -1014,8 +1015,12 @@ OptionRecommendation(name='search_replace',
if self.input_fmt in ('recipe', 'downloaded_recipe'): if self.input_fmt in ('recipe', 'downloaded_recipe'):
self.opts_to_mi(self.user_metadata) self.opts_to_mi(self.user_metadata)
if not hasattr(self.oeb, 'manifest'): if not hasattr(self.oeb, 'manifest'):
self.oeb = create_oebbook(self.log, self.oeb, self.opts, self.oeb = create_oebbook(
encoding=self.input_plugin.output_encoding) self.log, self.oeb, self.opts,
encoding=self.input_plugin.output_encoding,
for_regex_wizard=self.for_regex_wizard)
if self.for_regex_wizard:
return
self.input_plugin.postprocess_book(self.oeb, self.opts, self.log) self.input_plugin.postprocess_book(self.oeb, self.opts, self.log)
self.opts.is_image_collection = self.input_plugin.is_image_collection self.opts.is_image_collection = self.input_plugin.is_image_collection
pr = CompositeProgressReporter(0.34, 0.67, self.ui_reporter) pr = CompositeProgressReporter(0.34, 0.67, self.ui_reporter)
@ -1081,7 +1086,6 @@ OptionRecommendation(name='search_replace',
self.dump_oeb(self.oeb, out_dir) self.dump_oeb(self.oeb, out_dir)
self.log('Structured HTML written to:', out_dir) self.log('Structured HTML written to:', out_dir)
if self.opts.extra_css and os.path.exists(self.opts.extra_css): if self.opts.extra_css and os.path.exists(self.opts.extra_css):
self.opts.extra_css = open(self.opts.extra_css, 'rb').read() self.opts.extra_css = open(self.opts.extra_css, 'rb').read()
@ -1161,13 +1165,20 @@ OptionRecommendation(name='search_replace',
self.log(self.output_fmt.upper(), 'output written to', self.output) self.log(self.output_fmt.upper(), 'output written to', self.output)
self.flush() self.flush()
# This has to be global as create_oebbook can be called from other locations
# (for example in the html input plugin)
regex_wizard_callback = None
def set_regex_wizard_callback(f):
global regex_wizard_callback
regex_wizard_callback = f
def create_oebbook(log, path_or_stream, opts, reader=None, def create_oebbook(log, path_or_stream, opts, reader=None,
encoding='utf-8', populate=True): encoding='utf-8', populate=True, for_regex_wizard=False):
''' '''
Create an OEBBook. Create an OEBBook.
''' '''
from calibre.ebooks.oeb.base import OEBBook from calibre.ebooks.oeb.base import OEBBook
html_preprocessor = HTMLPreProcessor(log, opts) html_preprocessor = HTMLPreProcessor(log, opts, regex_wizard_callback=regex_wizard_callback)
if not encoding: if not encoding:
encoding = None encoding = None
oeb = OEBBook(log, html_preprocessor, oeb = OEBBook(log, html_preprocessor,
@ -1182,3 +1193,4 @@ def create_oebbook(log, path_or_stream, opts, reader=None,
reader()(oeb, path_or_stream) reader()(oeb, path_or_stream)
return oeb return oeb
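
The callback is stored at module level because `create_oebbook` can be reached from call sites that have no handle on the wizard. A stripped-down sketch of that wiring; `PreprocessorSketch` and `create_preprocessor` are hypothetical stand-ins for `HTMLPreProcessor` and `create_oebbook`, and the `strip()` call merely stands in for the real preprocessing rules.

```python
regex_wizard_callback = None


def set_regex_wizard_callback(f):
    global regex_wizard_callback
    regex_wizard_callback = f


class PreprocessorSketch(object):

    def __init__(self, regex_wizard_callback=None):
        self.regex_wizard_callback = regex_wizard_callback
        self.current_href = None

    def __call__(self, html):
        html = html.strip()  # placeholder for the real preprocessing rules
        if self.regex_wizard_callback is not None:
            # Hand the intermediate HTML to whoever registered interest.
            self.regex_wizard_callback(self.current_href, html)
        return html


def create_preprocessor():
    # Reads the module-level hook at call time, as create_oebbook does.
    return PreprocessorSketch(regex_wizard_callback=regex_wizard_callback)
```

A caller would register a function taking `(href, html)` before starting the conversion and reset the hook to `None` afterwards, so that later conversions are unaffected.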


@ -14,7 +14,7 @@ SVG_NS = 'http://www.w3.org/2000/svg'
XLINK_NS = 'http://www.w3.org/1999/xlink' XLINK_NS = 'http://www.w3.org/1999/xlink'
convert_entities = functools.partial(entity_to_unicode, convert_entities = functools.partial(entity_to_unicode,
result_exceptions = { result_exceptions={
u'<' : '&lt;', u'<' : '&lt;',
u'>' : '&gt;', u'>' : '&gt;',
u"'" : '&apos;', u"'" : '&apos;',
@ -144,9 +144,9 @@ class DocAnalysis(object):
percent is the percentage of lines that should be in a single bucket to return true percent is the percentage of lines that should be in a single bucket to return true
The majority of the lines will exist in 1-2 buckets in typical docs with hard line breaks The majority of the lines will exist in 1-2 buckets in typical docs with hard line breaks
''' '''
minLineLength=20 # Ignore lines under 20 chars (typical of spaces) minLineLength=20 # Ignore lines under 20 chars (typical of spaces)
maxLineLength=1900 # Discard larger than this to stay in range maxLineLength=1900 # Discard larger than this to stay in range
buckets=20 # Each line is divided into a bucket based on length buckets=20 # Each line is divided into a bucket based on length
#print "there are "+str(len(lines))+" lines" #print "there are "+str(len(lines))+" lines"
#max = 0 #max = 0
@ -156,7 +156,7 @@ class DocAnalysis(object):
# max = l # max = l
#print "max line found is "+str(max) #print "max line found is "+str(max)
# Build the line length histogram # Build the line length histogram
hRaw = [ 0 for i in range(0,buckets) ] hRaw = [0 for i in range(0,buckets)]
for line in self.lines: for line in self.lines:
l = len(line) l = len(line)
if l > minLineLength and l < maxLineLength: if l > minLineLength and l < maxLineLength:
@ -167,7 +167,7 @@ class DocAnalysis(object):
# Normalize the histogram into percents # Normalize the histogram into percents
totalLines = len(self.lines) totalLines = len(self.lines)
if totalLines > 0: if totalLines > 0:
h = [ float(count)/totalLines for count in hRaw ] h = [float(count)/totalLines for count in hRaw]
else: else:
h = [] h = []
#print "\nhRaw histogram lengths are: "+str(hRaw) #print "\nhRaw histogram lengths are: "+str(hRaw)
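
The DocAnalysis hunks above build a histogram of line lengths and normalize it by the total number of lines. A self-contained sketch of that computation; the bucket width of 100 characters is an assumption, because the bucketing line itself falls outside the lines shown in this diff.

```python
def line_length_histogram(lines, min_len=20, max_len=1900, buckets=20,
                          bucket_width=100):
    raw = [0] * buckets
    for line in lines:
        n = len(line)
        # Ignore very short lines and discard over-long ones.
        if min_len < n < max_len:
            raw[min(n // bucket_width, buckets - 1)] += 1
    total = len(lines)
    if total == 0:
        return []
    # Normalize against all lines, including the ignored ones.
    return [float(count) / total for count in raw]
```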
@ -200,7 +200,7 @@ class Dehyphenator(object):
# Add common suffixes to the regex below to increase the likelihood of a match - # Add common suffixes to the regex below to increase the likelihood of a match -
# don't add suffixes which are also complete words, such as 'able' or 'sex' # don't add suffixes which are also complete words, such as 'able' or 'sex'
# only remove if it's not already the point of hyphenation # only remove if it's not already the point of hyphenation
self.suffix_string = "((ed)?ly|'?e?s||a?(t|s)?ion(s|al(ly)?)?|ings?|er|(i)?ous|(i|a)ty|(it)?ies|ive|gence|istic(ally)?|(e|a)nce|m?ents?|ism|ated|(e|u)ct(ed)?|ed|(i|ed)?ness|(e|a)ncy|ble|ier|al|ex|ian)$" self.suffix_string = "((ed)?ly|'?e?s||a?(t|s)?ion(s|al(ly)?)?|ings?|er|(i)?ous|(i|a)ty|(it)?ies|ive|gence|istic(ally)?|(e|a)nce|m?ents?|ism|ated|(e|u)ct(ed)?|ed|(i|ed)?ness|(e|a)ncy|ble|ier|al|ex|ian)$" # noqa
self.suffixes = re.compile(r"^%s" % self.suffix_string, re.IGNORECASE) self.suffixes = re.compile(r"^%s" % self.suffix_string, re.IGNORECASE)
self.removesuffixes = re.compile(r"%s" % self.suffix_string, re.IGNORECASE) self.removesuffixes = re.compile(r"%s" % self.suffix_string, re.IGNORECASE)
# remove prefixes if the prefix was not already the point of hyphenation # remove prefixes if the prefix was not already the point of hyphenation
@ -265,19 +265,18 @@ class Dehyphenator(object):
self.html = html self.html = html
self.format = format self.format = format
if format == 'html': if format == 'html':
intextmatch = re.compile(u'(?<=.{%i})(?P<firstpart>[^\W\-]+)(-|)\s*(?=<)(?P<wraptags>(</span>)?\s*(</[iubp]>\s*){1,2}(?P<up2threeblanks><(p|div)[^>]*>\s*(<p[^>]*>\s*</p>\s*)?</(p|div)>\s+){0,3}\s*(<[iubp][^>]*>\s*){1,2}(<span[^>]*>)?)\s*(?P<secondpart>[\w\d]+)' % length) intextmatch = re.compile(u'(?<=.{%i})(?P<firstpart>[^\W\-]+)(-|)\s*(?=<)(?P<wraptags>(</span>)?\s*(</[iubp]>\s*){1,2}(?P<up2threeblanks><(p|div)[^>]*>\s*(<p[^>]*>\s*</p>\s*)?</(p|div)>\s+){0,3}\s*(<[iubp][^>]*>\s*){1,2}(<span[^>]*>)?)\s*(?P<secondpart>[\w\d]+)' % length) # noqa
elif format == 'pdf': elif format == 'pdf':
intextmatch = re.compile(u'(?<=.{%i})(?P<firstpart>[^\W\-]+)(-|)\s*(?P<wraptags><p>|</[iub]>\s*<p>\s*<[iub]>)\s*(?P<secondpart>[\w\d]+)'% length) intextmatch = re.compile(u'(?<=.{%i})(?P<firstpart>[^\W\-]+)(-|)\s*(?P<wraptags><p>|</[iub]>\s*<p>\s*<[iub]>)\s*(?P<secondpart>[\w\d]+)'% length)
elif format == 'txt': elif format == 'txt':
intextmatch = re.compile(u'(?<=.{%i})(?P<firstpart>[^\W\-]+)(-|)(\u0020|\u0009)*(?P<wraptags>(\n(\u0020|\u0009)*)+)(?P<secondpart>[\w\d]+)'% length) intextmatch = re.compile(u'(?<=.{%i})(?P<firstpart>[^\W\-]+)(-|)(\u0020|\u0009)*(?P<wraptags>(\n(\u0020|\u0009)*)+)(?P<secondpart>[\w\d]+)'% length) # noqa
elif format == 'individual_words': elif format == 'individual_words':
intextmatch = re.compile(u'(?!<)(?P<firstpart>[^\W\-]+)(-|)\s*(?P<secondpart>\w+)(?![^<]*?>)') intextmatch = re.compile(u'(?!<)(?P<firstpart>[^\W\-]+)(-|)\s*(?P<secondpart>\w+)(?![^<]*?>)')
elif format == 'html_cleanup': elif format == 'html_cleanup':
intextmatch = re.compile(u'(?P<firstpart>[^\W\-]+)(-|)\s*(?=<)(?P<wraptags></span>\s*(</[iubp]>\s*<[iubp][^>]*>\s*)?<span[^>]*>|</[iubp]>\s*<[iubp][^>]*>)?\s*(?P<secondpart>[\w\d]+)') intextmatch = re.compile(u'(?P<firstpart>[^\W\-]+)(-|)\s*(?=<)(?P<wraptags></span>\s*(</[iubp]>\s*<[iubp][^>]*>\s*)?<span[^>]*>|</[iubp]>\s*<[iubp][^>]*>)?\s*(?P<secondpart>[\w\d]+)') # noqa
elif format == 'txt_cleanup': elif format == 'txt_cleanup':
intextmatch = re.compile(u'(?P<firstpart>[^\W\-]+)(-|)(?P<wraptags>\s+)(?P<secondpart>[\w\d]+)') intextmatch = re.compile(u'(?P<firstpart>[^\W\-]+)(-|)(?P<wraptags>\s+)(?P<secondpart>[\w\d]+)')
html = intextmatch.sub(self.dehyphenate, html) html = intextmatch.sub(self.dehyphenate, html)
return html return html
@ -498,9 +497,11 @@ class HTMLPreProcessor(object):
(re.compile('<span[^><]*?id=subtitle[^><]*?>(.*?)</span>', re.IGNORECASE|re.DOTALL), (re.compile('<span[^><]*?id=subtitle[^><]*?>(.*?)</span>', re.IGNORECASE|re.DOTALL),
lambda match : '<h3 class="subtitle">%s</h3>'%(match.group(1),)), lambda match : '<h3 class="subtitle">%s</h3>'%(match.group(1),)),
] ]
def __init__(self, log=None, extra_opts=None): def __init__(self, log=None, extra_opts=None, regex_wizard_callback=None):
self.log = log self.log = log
self.extra_opts = extra_opts self.extra_opts = extra_opts
self.regex_wizard_callback = regex_wizard_callback
self.current_href = None
def is_baen(self, src): def is_baen(self, src):
return re.compile(r'<meta\s+name="Publisher"\s+content=".*?Baen.*?"', return re.compile(r'<meta\s+name="Publisher"\s+content=".*?Baen.*?"',
@ -581,12 +582,15 @@ class HTMLPreProcessor(object):
end_rules.append((re.compile(u'(?<=.{%i}[–—])\s*<p>\s*(?=[[a-z\d])' % length), lambda match: '')) end_rules.append((re.compile(u'(?<=.{%i}[–—])\s*<p>\s*(?=[[a-z\d])' % length), lambda match: ''))
end_rules.append( end_rules.append(
# Un wrap using punctuation # Un wrap using punctuation
(re.compile(u'(?<=.{%i}([a-zäëïöüàèìòùáćéíĺóŕńśúýâêîôûçąężıãõñæøþðßěľščťžňďřů,:)\IA\u00DF]|(?<!\&\w{4});))\s*(?P<ital></(i|b|u)>)?\s*(</p>\s*<p>\s*)+\s*(?=(<(i|b|u)>)?\s*[\w\d$(])' % length, re.UNICODE), wrap_lines), (re.compile(u'(?<=.{%i}([a-zäëïöüàèìòùáćéíĺóŕńśúýâêîôûçąężıãõñæøþðßěľščťžňďřů,:)\IA\u00DF]|(?<!\&\w{4});))\s*(?P<ital></(i|b|u)>)?\s*(</p>\s*<p>\s*)+\s*(?=(<(i|b|u)>)?\s*[\w\d$(])' % length, re.UNICODE), wrap_lines), # noqa
) )
for rule in self.PREPROCESS + start_rules: for rule in self.PREPROCESS + start_rules:
html = rule[0].sub(rule[1], html) html = rule[0].sub(rule[1], html)
if self.regex_wizard_callback is not None:
self.regex_wizard_callback(self.current_href, html)
if get_preprocess_html: if get_preprocess_html:
return html return html

View File

@ -0,0 +1,371 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
from collections import OrderedDict
from calibre.ebooks.docx.names import XPath, get
class Inherit:
pass
inherit = Inherit()
def binary_property(parent, name):
vals = XPath('./w:%s' % name)(parent)
if not vals:
return inherit
val = get(vals[0], 'w:val', 'on')
return val in {'on', '1', 'true'}
def simple_color(col, auto='black'):
if not col or col == 'auto' or len(col) != 6:
return auto
return '#'+col
def simple_float(val, mult=1.0):
try:
return float(val) * mult
except (ValueError, TypeError, AttributeError, KeyError):
return None
LINE_STYLES = { # {{{
'basicBlackDashes': 'dashed',
'basicBlackDots': 'dotted',
'basicBlackSquares': 'dashed',
'basicThinLines': 'solid',
'dashDotStroked': 'groove',
'dashed': 'dashed',
'dashSmallGap': 'dashed',
'dotDash': 'dashed',
'dotDotDash': 'dashed',
'dotted': 'dotted',
'double': 'double',
'inset': 'inset',
'nil': 'none',
'none': 'none',
'outset': 'outset',
'single': 'solid',
'thick': 'solid',
'thickThinLargeGap': 'double',
'thickThinMediumGap': 'double',
'thickThinSmallGap' : 'double',
'thinThickLargeGap': 'double',
'thinThickMediumGap': 'double',
'thinThickSmallGap': 'double',
'thinThickThinLargeGap': 'double',
'thinThickThinMediumGap': 'double',
'thinThickThinSmallGap': 'double',
'threeDEmboss': 'ridge',
'threeDEngrave': 'groove',
'triple': 'double',
} # }}}
# Read from XML {{{
def read_border(parent, dest):
tvals = {'padding_%s':inherit, 'border_%s_width':inherit,
'border_%s_style':inherit, 'border_%s_color':inherit}
vals = {}
for edge in ('left', 'top', 'right', 'bottom'):
vals.update({k % edge:v for k, v in tvals.iteritems()})
for border in XPath('./w:pBdr')(parent):
for edge in ('left', 'top', 'right', 'bottom'):
for elem in XPath('./w:%s' % edge)(border):
color = get(elem, 'w:color')
if color is not None:
vals['border_%s_color' % edge] = simple_color(color)
style = get(elem, 'w:val')
if style is not None:
vals['border_%s_style' % edge] = LINE_STYLES.get(style, 'solid')
space = get(elem, 'w:space')
if space is not None:
try:
vals['padding_%s' % edge] = float(space)
except (ValueError, TypeError):
pass
sz = get(elem, 'w:sz')
if sz is not None:
# We don't care about art borders (they are only used for page borders)
try:
vals['border_%s_width' % edge] = min(96, max(2, float(sz))) / 8
except (ValueError, TypeError):
pass
for key, val in vals.iteritems():
setattr(dest, key, val)
def read_indent(parent, dest):
padding_left = padding_right = text_indent = inherit
for indent in XPath('./w:ind')(parent):
l, lc = get(indent, 'w:left'), get(indent, 'w:leftChars')
pl = simple_float(lc, 0.01) if lc is not None else simple_float(l, 0.05) if l is not None else None
if pl is not None:
padding_left = '%.3g%s' % (pl, 'em' if lc is not None else 'pt')
r, rc = get(indent, 'w:right'), get(indent, 'w:rightChars')
pr = simple_float(rc, 0.01) if rc is not None else simple_float(r, 0.05) if r is not None else None
if pr is not None:
padding_right = '%.3g%s' % (pr, 'em' if rc is not None else 'pt')
h, hc = get(indent, 'w:hanging'), get(indent, 'w:hangingChars')
fl, flc = get(indent, 'w:firstLine'), get(indent, 'w:firstLineChars')
h = h if h is None else '-'+h
hc = hc if hc is None else '-'+hc
ti = (simple_float(hc, 0.01) if hc is not None else simple_float(h, 0.05) if h is not None else
simple_float(flc, 0.01) if flc is not None else simple_float(fl, 0.05) if fl is not None else None)
if ti is not None:
text_indent = '%.3g%s' % (ti, 'em' if hc is not None or (h is None and flc is not None) else 'pt')
setattr(dest, 'margin_left', padding_left)
setattr(dest, 'margin_right', padding_right)
setattr(dest, 'text_indent', text_indent)
def read_justification(parent, dest):
ans = inherit
for jc in XPath('./w:jc[@w:val]')(parent):
val = get(jc, 'w:val')
if not val:
continue
if val in {'both', 'distribute'} or 'thai' in val or 'kashida' in val:
ans = 'justify'
if val in {'left', 'center', 'right',}:
ans = val
setattr(dest, 'text_align', ans)
def read_spacing(parent, dest):
padding_top = padding_bottom = line_height = inherit
for s in XPath('./w:spacing')(parent):
a, al, aa = get(s, 'w:after'), get(s, 'w:afterLines'), get(s, 'w:afterAutospacing')
pb = None if aa in {'on', '1', 'true'} else simple_float(al, 0.02) if al is not None else simple_float(a, 0.05) if a is not None else None
if pb is not None:
padding_bottom = '%.3g%s' % (pb, 'ex' if al is not None else 'pt')
b, bl, bb = get(s, 'w:before'), get(s, 'w:beforeLines'), get(s, 'w:beforeAutospacing')
pt = None if bb in {'on', '1', 'true'} else simple_float(bl, 0.02) if bl is not None else simple_float(b, 0.05) if b is not None else None
if pt is not None:
padding_top = '%.3g%s' % (pt, 'ex' if bl is not None else 'pt')
l, lr = get(s, 'w:line'), get(s, 'w:lineRule', 'auto')
if l is not None:
lh = simple_float(l, 0.05) if lr in {'exact', 'atLeast'} else simple_float(l, 1/240.0)
line_height = '%.3g%s' % (lh, 'pt' if lr in {'exact', 'atLeast'} else '')
setattr(dest, 'margin_top', padding_top)
setattr(dest, 'margin_bottom', padding_bottom)
setattr(dest, 'line_height', line_height)
def read_direction(parent, dest):
ans = inherit
for jc in XPath('./w:textFlow[@w:val]')(parent):
val = get(jc, 'w:val')
if not val:
continue
if 'rl' in val.lower():
ans = 'rtl'
setattr(dest, 'direction', ans)
def read_shd(parent, dest):
ans = inherit
for shd in XPath('./w:shd[@w:fill]')(parent):
val = get(shd, 'w:fill')
if val:
ans = simple_color(val, auto='transparent')
setattr(dest, 'background_color', ans)
def read_numbering(parent, dest):
lvl = num_id = None
for np in XPath('./w:numPr')(parent):
for ilvl in XPath('./w:ilvl[@w:val]')(np):
try:
lvl = int(get(ilvl, 'w:val'))
except (ValueError, TypeError):
pass
for num in XPath('./w:numId[@w:val]')(np):
num_id = get(num, 'w:val')
val = (num_id, lvl) if num_id is not None or lvl is not None else inherit
setattr(dest, 'numbering', val)
class Frame(object):
all_attributes = ('drop_cap', 'h', 'w', 'h_anchor', 'h_rule', 'v_anchor', 'wrap',
'h_space', 'v_space', 'lines', 'x_align', 'y_align', 'x', 'y')
def __init__(self, fp):
self.drop_cap = get(fp, 'w:dropCap', 'none')
try:
self.h = int(get(fp, 'w:h'))/20
except (ValueError, TypeError):
self.h = 0
try:
self.w = int(get(fp, 'w:w'))/20
except (ValueError, TypeError):
self.w = None
try:
self.x = int(get(fp, 'w:x'))/20
except (ValueError, TypeError):
self.x = 0
try:
self.y = int(get(fp, 'w:y'))/20
except (ValueError, TypeError):
self.y = 0
self.h_anchor = get(fp, 'w:hAnchor', 'page')
self.h_rule = get(fp, 'w:hRule', 'auto')
self.v_anchor = get(fp, 'w:vAnchor', 'page')
self.wrap = get(fp, 'w:wrap', 'around')
self.x_align = get(fp, 'w:xAlign')
self.y_align = get(fp, 'w:yAlign')
try:
self.h_space = int(get(fp, 'w:hSpace'))/20
except (ValueError, TypeError):
self.h_space = 0
try:
self.v_space = int(get(fp, 'w:vSpace'))/20
except (ValueError, TypeError):
self.v_space = 0
try:
self.lines = int(get(fp, 'w:lines'))
except (ValueError, TypeError):
self.lines = 1
def css(self, page):
is_dropcap = self.drop_cap in {'drop', 'margin'}
ans = {'overflow': 'hidden'}
if is_dropcap:
ans['float'] = 'left'
ans['margin'] = '0'
ans['padding-right'] = '0.2em'
else:
if self.h_rule != 'auto':
t = 'min-height' if self.h_rule == 'atLeast' else 'height'
ans[t] = '%.3gpt' % self.h
if self.w is not None:
ans['width'] = '%.3gpt' % self.w
ans['padding-top'] = ans['padding-bottom'] = '%.3gpt' % self.v_space
if self.wrap not in {None, 'none'}:
ans['padding-left'] = ans['padding-right'] = '%.3gpt' % self.h_space
if self.x_align is None:
fl = 'left' if self.x/page.width < 0.5 else 'right'
else:
fl = 'right' if self.x_align == 'right' else 'left'
ans['float'] = fl
return ans
def __eq__(self, other):
for x in self.all_attributes:
if getattr(other, x, inherit) != getattr(self, x):
return False
return True
def __ne__(self, other):
return not self.__eq__(other)
def read_frame(parent, dest):
ans = inherit
for fp in XPath('./w:framePr')(parent):
ans = Frame(fp)
setattr(dest, 'frame', ans)
# }}}
class ParagraphStyle(object):
all_properties = (
'adjustRightInd', 'autoSpaceDE', 'autoSpaceDN', 'bidi',
'contextualSpacing', 'keepLines', 'keepNext', 'mirrorIndents',
'pageBreakBefore', 'snapToGrid', 'suppressLineNumbers',
'suppressOverlap', 'topLinePunct', 'widowControl', 'wordWrap',
# Border margins padding
'border_left_width', 'border_left_style', 'border_left_color', 'padding_left',
'border_top_width', 'border_top_style', 'border_top_color', 'padding_top',
'border_right_width', 'border_right_style', 'border_right_color', 'padding_right',
'border_bottom_width', 'border_bottom_style', 'border_bottom_color', 'padding_bottom',
'margin_left', 'margin_top', 'margin_right', 'margin_bottom',
# Misc.
'text_indent', 'text_align', 'line_height', 'direction', 'background_color',
'numbering', 'font_family', 'font_size', 'frame',
)
def __init__(self, pPr=None):
self.linked_style = None
if pPr is None:
for p in self.all_properties:
setattr(self, p, inherit)
else:
for p in (
'adjustRightInd', 'autoSpaceDE', 'autoSpaceDN', 'bidi',
'contextualSpacing', 'keepLines', 'keepNext', 'mirrorIndents',
'pageBreakBefore', 'snapToGrid', 'suppressLineNumbers',
'suppressOverlap', 'topLinePunct', 'widowControl', 'wordWrap',
):
setattr(self, p, binary_property(pPr, p))
for x in ('border', 'indent', 'justification', 'spacing', 'direction', 'shd', 'numbering', 'frame'):
f = globals()['read_%s' % x]
f(pPr, self)
for s in XPath('./w:pStyle[@w:val]')(pPr):
self.linked_style = get(s, 'w:val')
self.font_family = self.font_size = inherit
self._css = None
def update(self, other):
for prop in self.all_properties:
nval = getattr(other, prop)
if nval is not inherit:
setattr(self, prop, nval)
if other.linked_style is not None:
self.linked_style = other.linked_style
def resolve_based_on(self, parent):
for p in self.all_properties:
val = getattr(self, p)
if val is inherit:
setattr(self, p, getattr(parent, p))
@property
def css(self):
if self._css is None:
self._css = c = OrderedDict()
if self.keepLines is True:
c['page-break-inside'] = 'avoid'
if self.pageBreakBefore is True:
c['page-break-before'] = 'always'
for edge in ('left', 'top', 'right', 'bottom'):
val = getattr(self, 'border_%s_width' % edge)
if val is not inherit:
c['border-%s-width' % edge] = '%.3gpt' % val
for x in ('style', 'color'):
val = getattr(self, 'border_%s_%s' % (edge, x))
if val is not inherit:
c['border-%s-%s' % (edge, x)] = val
val = getattr(self, 'padding_%s' % edge)
if val is not inherit:
c['padding-%s' % edge] = '%.3gpt' % val
val = getattr(self, 'margin_%s' % edge)
if val is not inherit:
c['margin-%s' % edge] = val
if self.line_height not in {inherit, '1'}:
c['line-height'] = self.line_height
for x in ('text_indent', 'text_align', 'background_color', 'font_family', 'font_size'):
val = getattr(self, x)
if val is not inherit:
if x == 'font_size':
val = '%.3gpt' % val
c[x.replace('_', '-')] = val
return self._css
# TODO: keepNext must be done at markup level
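The `update()` and `resolve_based_on()` methods above both depend on the `inherit` sentinel: a property holding `inherit` was never set explicitly, so it can be filled from another style. A minimal sketch of that cascade follows. The `Style` class and its two properties are hypothetical stand-ins for `ParagraphStyle`, not part of this commit.

```python
class Inherit(object):
    """Sentinel type: marks a property that was never set explicitly."""


inherit = Inherit()


class Style(object):
    # Hypothetical two-property style standing in for ParagraphStyle
    all_properties = ('text_align', 'line_height')

    def __init__(self, **kw):
        for p in self.all_properties:
            setattr(self, p, kw.get(p, inherit))

    def update(self, other):
        # Values explicitly set on `other` override our own
        for p in self.all_properties:
            val = getattr(other, p)
            if val is not inherit:
                setattr(self, p, val)

    def resolve_based_on(self, parent):
        # Only properties still marked `inherit` are taken from the parent
        for p in self.all_properties:
            if getattr(self, p) is inherit:
                setattr(self, p, getattr(parent, p))


base = Style(text_align='left', line_height='1.2')
child = Style(text_align='justify')
child.resolve_based_on(base)
print(child.text_align, child.line_height)
```

Comparing with `is inherit` rather than by truthiness matters here: an instance of a plain class is truthy, so `if prop:` cannot tell "unset" from "on".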

View File

@ -0,0 +1,249 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
from collections import OrderedDict
from calibre.ebooks.docx.block_styles import ( # noqa
inherit, simple_color, LINE_STYLES, simple_float, binary_property, read_shd)
from calibre.ebooks.docx.names import XPath, get
# Read from XML {{{
def read_text_border(parent, dest):
border_color = border_style = border_width = padding = inherit
elems = XPath('./w:bdr')(parent)
if elems:
border_color = simple_color('auto')
border_style = 'solid'
border_width = 1
for elem in elems:
color = get(elem, 'w:color')
if color is not None:
border_color = simple_color(color)
style = get(elem, 'w:val')
if style is not None:
border_style = LINE_STYLES.get(style, 'solid')
space = get(elem, 'w:space')
if space is not None:
try:
padding = float(space)
except (ValueError, TypeError):
pass
sz = get(elem, 'w:sz')
if sz is not None:
# We don't care about art borders (they are only used for page borders)
try:
border_width = min(96, max(2, float(sz))) / 8
except (ValueError, TypeError):
pass
setattr(dest, 'border_color', border_color)
setattr(dest, 'border_style', border_style)
setattr(dest, 'border_width', border_width)
setattr(dest, 'padding', padding)
def read_color(parent, dest):
ans = inherit
for col in XPath('./w:color[@w:val]')(parent):
val = get(col, 'w:val')
if not val:
continue
ans = simple_color(val)
setattr(dest, 'color', ans)
def read_highlight(parent, dest):
ans = inherit
for col in XPath('./w:highlight[@w:val]')(parent):
val = get(col, 'w:val')
if not val:
continue
if val == 'none':
val = 'transparent'
ans = val
setattr(dest, 'highlight', ans)
def read_lang(parent, dest):
ans = inherit
for col in XPath('./w:lang[@w:val]')(parent):
val = get(col, 'w:val')
if not val:
continue
try:
code = int(val, 16)
except (ValueError, TypeError):
ans = val
else:
from calibre.ebooks.docx.lcid import lcid
val = lcid.get(code, None)
if val:
ans = val
setattr(dest, 'lang', ans)
def read_letter_spacing(parent, dest):
ans = inherit
for col in XPath('./w:spacing[@w:val]')(parent):
val = simple_float(get(col, 'w:val'), 0.05)
if val is not None:
ans = val
setattr(dest, 'letter_spacing', ans)
def read_sz(parent, dest):
ans = inherit
for col in XPath('./w:sz[@w:val]')(parent):
val = simple_float(get(col, 'w:val'), 0.5)
if val is not None:
ans = val
setattr(dest, 'font_size', ans)
def read_underline(parent, dest):
ans = inherit
for col in XPath('./w:u[@w:val]')(parent):
val = get(col, 'w:val')
if val:
ans = 'underline'
setattr(dest, 'text_decoration', ans)
def read_vert_align(parent, dest):
ans = inherit
for col in XPath('./w:vertAlign[@w:val]')(parent):
val = get(col, 'w:val')
if val and val in {'baseline', 'subscript', 'superscript'}:
ans = val
setattr(dest, 'vert_align', ans)
def read_font_family(parent, dest):
ans = inherit
for col in XPath('./w:rFonts[@w:ascii]')(parent):
val = get(col, 'w:ascii')
if val:
ans = val
setattr(dest, 'font_family', ans)
# }}}
class RunStyle(object):
all_properties = {
'b', 'bCs', 'caps', 'cs', 'dstrike', 'emboss', 'i', 'iCs', 'imprint',
'rtl', 'shadow', 'smallCaps', 'strike', 'vanish',
'border_color', 'border_style', 'border_width', 'padding', 'color', 'highlight', 'background_color',
'letter_spacing', 'font_size', 'text_decoration', 'vert_align', 'lang', 'font_family'
}
toggle_properties = {
'b', 'bCs', 'caps', 'emboss', 'i', 'iCs', 'imprint', 'shadow', 'smallCaps', 'strike', 'dstrike', 'vanish',
}
def __init__(self, rPr=None):
self.linked_style = None
if rPr is None:
for p in self.all_properties:
setattr(self, p, inherit)
else:
for p in (
'b', 'bCs', 'caps', 'cs', 'dstrike', 'emboss', 'i', 'iCs', 'imprint', 'rtl', 'shadow',
'smallCaps', 'strike', 'vanish',
):
setattr(self, p, binary_property(rPr, p))
for x in ('text_border', 'color', 'highlight', 'shd', 'letter_spacing', 'sz', 'underline', 'vert_align', 'lang', 'font_family'):
f = globals()['read_%s' % x]
f(rPr, self)
for s in XPath('./w:rStyle[@w:val]')(rPr):
self.linked_style = get(s, 'w:val')
self._css = None
def update(self, other):
for prop in self.all_properties:
nval = getattr(other, prop)
if nval is not inherit:
setattr(self, prop, nval)
if other.linked_style is not None:
self.linked_style = other.linked_style
def resolve_based_on(self, parent):
for p in self.all_properties:
val = getattr(self, p)
if val is inherit:
setattr(self, p, getattr(parent, p))
def get_border_css(self, ans):
for x in ('color', 'style', 'width'):
val = getattr(self, 'border_'+x)
if x == 'width' and val is not inherit:
val = '%.3gpt' % val
if val is not inherit:
ans['border-%s' % x] = val
def clear_border_css(self):
for x in ('color', 'style', 'width'):
setattr(self, 'border_'+x, inherit)
@property
def css(self):
if self._css is None:
c = self._css = OrderedDict()
td = set()
if self.text_decoration is not inherit:
td.add(self.text_decoration)
if self.strike is True:
td.add('line-through')
if self.dstrike is True:
td.add('line-through')
if td:
c['text-decoration'] = ' '.join(td)
if self.caps is True:
c['text-transform'] = 'uppercase'
if self.i is True:
c['font-style'] = 'italic'
if self.shadow is True:
c['text-shadow'] = '2px 2px'
if self.smallCaps is True:
c['font-variant'] = 'small-caps'
if self.vanish is True:
c['display'] = 'none'
self.get_border_css(c)
if self.padding is not inherit:
c['padding'] = '%.3gpt' % self.padding
for x in ('color', 'background_color'):
val = getattr(self, x)
if val is not inherit:
c[x.replace('_', '-')] = val
for x in ('letter_spacing', 'font_size'):
val = getattr(self, x)
if val is not inherit:
c[x.replace('_', '-')] = '%.3gpt' % val
if self.highlight is not inherit and self.highlight != 'transparent':
c['background-color'] = self.highlight
if self.b is True:
c['font-weight'] = 'bold'
if self.font_family is not inherit:
c['font-family'] = self.font_family
return self._css
def same_border(self, other):
for x in (self, other):
has_border = False
for y in ('color', 'style', 'width'):
if ('border-%s' % y) in x.css:
has_border = True
break
if not has_border:
return False
s = tuple(self.css.get('border-%s' % y, None) for y in ('color', 'style', 'width'))
o = tuple(other.css.get('border-%s' % y, None) for y in ('color', 'style', 'width'))
return s == o
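The readers in the two style modules convert several WordprocessingML units to CSS points: border sizes in eighths of a point (clamped to the range 2 to 96), run font sizes in half-points, and indents and spacing in twentieths of a point. The sketch below gathers those conversions in one place. The helper names are illustrative and do not exist in the commit.

```python
def border_width_pt(sz):
    # w:sz on a border is in eighths of a point; clamp to 2..96 first
    return min(96, max(2, float(sz))) / 8


def font_size_pt(val):
    # w:sz on a run is in half-points
    return float(val) * 0.5


def twips_to_pt(val):
    # Indents and spacing are in twentieths of a point (0.05 pt each)
    return float(val) * 0.05


def css_length(pt):
    # Same '%.3g' formatting the style classes use for CSS output
    return '%.3gpt' % pt


print(css_length(border_width_pt('4')))
print(css_length(font_size_pt('24')))
print(css_length(twips_to_pt('240')))
```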

View File

@ -167,7 +167,9 @@ class DOCX(object):
@property @property
def document_relationships(self): def document_relationships(self):
name = self.document_name return self.get_relationships(self.document_name)
def get_relationships(self, name):
base = '/'.join(name.split('/')[:-1]) base = '/'.join(name.split('/')[:-1])
by_id, by_type = {}, {} by_id, by_type = {}, {}
parts = name.split('/') parts = name.split('/')
@ -179,7 +181,9 @@ class DOCX(object):
else: else:
root = fromstring(raw) root = fromstring(raw)
for item in root.xpath('//*[local-name()="Relationships"]/*[local-name()="Relationship" and @Type and @Target]'): for item in root.xpath('//*[local-name()="Relationships"]/*[local-name()="Relationship" and @Type and @Target]'):
target = '/'.join((base, item.get('Target').lstrip('/'))) target = item.get('Target')
if item.get('TargetMode', None) != 'External':
target = '/'.join((base, target.lstrip('/')))
typ = item.get('Type') typ = item.get('Type')
Id = item.get('Id') Id = item.get('Id')
by_id[Id] = by_type[typ] = target by_id[Id] = by_type[typ] = target
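The change in this hunk stops prefixing the part's base directory onto relationship targets whose `TargetMode` is `External` (for example hyperlinks). A standalone sketch of that rule is below. `resolve_target` is a hypothetical helper name; in the commit the logic is inline in `get_relationships`.

```python
def resolve_target(base, target, target_mode=None):
    """Return the archive path for an internal target, or the target
    unchanged when the relationship points outside the package."""
    if target_mode != 'External':
        target = '/'.join((base, target.lstrip('/')))
    return target


print(resolve_target('word', 'media/image1.png'))
print(resolve_target('word', 'http://example.com/', 'External'))
```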

View File

@ -0,0 +1,37 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
import sys, os, shutil
from lxml import etree
from calibre import walk
from calibre.utils.zipfile import ZipFile
def dump(path):
dest = os.path.splitext(os.path.basename(path))[0]
dest += '_extracted'
if os.path.exists(dest):
shutil.rmtree(dest)
with ZipFile(path) as zf:
zf.extractall(dest)
for f in walk(dest):
if f.endswith('.xml') or f.endswith('.rels'):
with open(f, 'r+b') as stream:
raw = stream.read()
root = etree.fromstring(raw)
stream.seek(0)
stream.truncate()
stream.write(etree.tostring(root, pretty_print=True, encoding='utf-8', xml_declaration=True))
print(path, 'dumped to', dest)
if __name__ == '__main__':
dump(sys.argv[-1])
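The same extract-and-pretty-print idea can be sketched with only the standard library. This is an assumption-laden variant, not the commit's code: it swaps `lxml` and calibre's `ZipFile`/`walk` for `zipfile`, `os.walk` and `xml.dom.minidom`, and `dump_docx` is a hypothetical name.

```python
import os
import zipfile
from xml.dom import minidom


def dump_docx(path, dest):
    """Extract the archive at `path` into `dest` and re-indent its XML parts."""
    with zipfile.ZipFile(path) as zf:
        zf.extractall(dest)
    for dirpath, _dirs, files in os.walk(dest):
        for name in files:
            if name.endswith(('.xml', '.rels')):
                full = os.path.join(dirpath, name)
                with open(full, 'rb') as f:
                    raw = f.read()
                # With an encoding given, toprettyxml returns bytes
                pretty = minidom.parseString(raw).toprettyxml(
                    indent='  ', encoding='utf-8')
                with open(full, 'wb') as f:
                    f.write(pretty)
    return dest
```

Unlike the script above, this version takes the destination directory as an argument and does not delete an existing directory first.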

View File

@ -0,0 +1,132 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
import os, re
from collections import namedtuple
from calibre.ebooks.docx.block_styles import binary_property, inherit
from calibre.ebooks.docx.names import XPath, get
from calibre.utils.filenames import ascii_filename
from calibre.utils.fonts.scanner import font_scanner, NoFonts
from calibre.utils.fonts.utils import panose_to_css_generic_family, is_truetype_font
Embed = namedtuple('Embed', 'name key subsetted')
def has_system_fonts(name):
try:
return bool(font_scanner.fonts_for_family(name))
except NoFonts:
return False
def get_variant(bold=False, italic=False):
return {(False, False):'Regular', (False, True):'Italic',
(True, False):'Bold', (True, True):'BoldItalic'}[(bold, italic)]
class Family(object):
def __init__(self, elem, embed_relationships):
self.name = self.family_name = get(elem, 'w:name')
self.alt_names = tuple(get(x, 'w:val') for x in XPath('./w:altName')(elem))
if self.alt_names and not has_system_fonts(self.name):
for x in self.alt_names:
if has_system_fonts(x):
self.family_name = x
break
self.embedded = {}
for x in ('Regular', 'Bold', 'Italic', 'BoldItalic'):
for y in XPath('./w:embed%s[@r:id]' % x)(elem):
rid = get(y, 'r:id')
key = get(y, 'w:fontKey')
subsetted = get(y, 'w:subsetted') in {'1', 'true', 'on'}
if rid in embed_relationships:
self.embedded[x] = Embed(embed_relationships[rid], key, subsetted)
self.generic_family = 'auto'
for x in XPath('./w:family[@w:val]')(elem):
self.generic_family = get(x, 'w:val', 'auto')
ntt = binary_property(elem, 'notTrueType')
self.is_ttf = ntt is inherit or not ntt
self.panose1 = None
self.panose_name = None
for x in XPath('./w:panose1[@w:val]')(elem):
try:
v = get(x, 'w:val')
v = tuple(int(v[i:i+2], 16) for i in xrange(0, len(v), 2))
except (TypeError, ValueError, IndexError):
pass
else:
self.panose1 = v
self.panose_name = panose_to_css_generic_family(v)
self.css_generic_family = {'roman':'serif', 'swiss':'sans-serif', 'modern':'monospace',
'decorative':'fantasy', 'script':'cursive'}.get(self.generic_family, None)
self.css_generic_family = self.css_generic_family or self.panose_name or 'serif'
class Fonts(object):
def __init__(self):
self.fonts = {}
self.used = set()
def __call__(self, root, embed_relationships, docx, dest_dir):
for elem in XPath('//w:font[@w:name]')(root):
self.fonts[get(elem, 'w:name')] = Family(elem, embed_relationships)
def family_for(self, name, bold=False, italic=False):
f = self.fonts.get(name, None)
if f is None:
return 'serif'
variant = get_variant(bold, italic)
self.used.add((name, variant))
name = f.name if variant in f.embedded else f.family_name
return '"%s", %s' % (name.replace('"', ''), f.css_generic_family)
def embed_fonts(self, dest_dir, docx):
defs = []
dest_dir = os.path.join(dest_dir, 'fonts')
for name, variant in self.used:
f = self.fonts[name]
if variant in f.embedded:
if not os.path.exists(dest_dir):
os.mkdir(dest_dir)
fname = self.write(name, dest_dir, docx, variant)
if fname is not None:
d = {'font-family':'"%s"' % name.replace('"', ''), 'src': 'url("fonts/%s")' % fname}
if 'Bold' in variant:
d['font-weight'] = 'bold'
if 'Italic' in variant:
d['font-style'] = 'italic'
d = ['%s: %s' % (k, v) for k, v in d.iteritems()]
d = ';\n\t'.join(d)
defs.append('@font-face {\n\t%s\n}\n' % d)
return '\n'.join(defs)
def write(self, name, dest_dir, docx, variant):
f = self.fonts[name]
ef = f.embedded[variant]
raw = docx.read(ef.name)
prefix = raw[:32]
if ef.key:
key = re.sub(r'[^A-Fa-f0-9]', '', ef.key)
key = bytearray(reversed(tuple(int(key[i:i+2], 16) for i in xrange(0, len(key), 2))))
prefix = bytearray(prefix)
prefix = bytes(bytearray(prefix[i]^key[i % len(key)] for i in xrange(len(prefix))))
if not is_truetype_font(prefix):
return None
ext = 'otf' if prefix.startswith(b'OTTO') else 'ttf'
fname = ascii_filename('%s - %s.%s' % (name, variant, ext))
with open(os.path.join(dest_dir, fname), 'wb') as dest:
dest.write(prefix)
dest.write(raw[32:])
return fname
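`Fonts.write` undoes the obfuscation of embedded fonts by XOR-ing the first 32 bytes of the font data with the bytes of the `w:fontKey` GUID taken in reverse order. A minimal Python 3 sketch of that step alone follows. `deobfuscate` is a hypothetical name, and the key in the usage lines is made up for illustration.

```python
import re


def deobfuscate(raw, font_key):
    """XOR the first 32 bytes of `raw` with the reversed GUID bytes.

    `raw` must be bytes and `font_key` must contain at least one hex pair.
    """
    hex_digits = re.sub(r'[^A-Fa-f0-9]', '', font_key)
    key = bytearray(int(hex_digits[i:i + 2], 16)
                    for i in range(0, len(hex_digits), 2))
    key.reverse()
    prefix = bytearray(raw[:32])
    for i in range(len(prefix)):
        prefix[i] ^= key[i % len(key)]
    return bytes(prefix) + raw[32:]


# Made-up key and data, for illustration only
key = '{00112233-4455-6677-8899-AABBCCDDEEFF}'
data = bytes(range(64))
scrambled = deobfuscate(data, key)
restored = deobfuscate(scrambled, key)
```

Because XOR is its own inverse, applying the function twice with the same key returns the original bytes, which makes the sketch easy to check.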

View File

@ -0,0 +1,62 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
from collections import OrderedDict
from calibre.ebooks.docx.names import get, XPath, descendants
class Note(object):
def __init__(self, parent):
self.type = get(parent, 'w:type', 'normal')
self.parent = parent
def __iter__(self):
for p in descendants(self.parent, 'w:p'):
yield p
class Footnotes(object):
def __init__(self):
self.footnotes = {}
self.endnotes = {}
self.counter = 0
self.notes = OrderedDict()
def __call__(self, footnotes, endnotes):
if footnotes is not None:
for footnote in XPath('./w:footnote[@w:id]')(footnotes):
fid = get(footnote, 'w:id')
if fid:
self.footnotes[fid] = Note(footnote)
if endnotes is not None:
for endnote in XPath('./w:endnote[@w:id]')(endnotes):
fid = get(endnote, 'w:id')
if fid:
self.endnotes[fid] = Note(endnote)
def get_ref(self, ref):
fid = get(ref, 'w:id')
notes = self.footnotes if ref.tag.endswith('}footnoteReference') else self.endnotes
note = notes.get(fid, None)
if note is not None and note.type == 'normal':
self.counter += 1
anchor = 'note_%d' % self.counter
self.notes[anchor] = (type('')(self.counter), note)
return anchor, type('')(self.counter)
return None, None
def __iter__(self):
for anchor, (counter, note) in self.notes.iteritems():
yield anchor, counter, note
@property
def has_notes(self):
return bool(self.notes)
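`Footnotes.get_ref` numbers notes in the order they are first referenced in the document, not by their `w:id`, and remembers them in an `OrderedDict` so they can be emitted in that order. A reduced sketch of this numbering is below. `NoteCounter` is hypothetical, and plain strings stand in for the `Note` objects.

```python
from collections import OrderedDict


class NoteCounter(object):
    def __init__(self, available):
        self.available = available  # maps note id -> note (a string here)
        self.counter = 0
        self.notes = OrderedDict()

    def get_ref(self, note_id):
        note = self.available.get(note_id)
        if note is None:
            # Unknown ids do not consume a number
            return None, None
        self.counter += 1
        anchor = 'note_%d' % self.counter
        self.notes[anchor] = (str(self.counter), note)
        return anchor, str(self.counter)


nc = NoteCounter({'7': 'seven', '2': 'two'})
first = nc.get_ref('7')
missing = nc.get_ref('99')
second = nc.get_ref('2')
```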

View File

@ -0,0 +1,205 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
import os
from lxml.html.builder import IMG
from calibre.ebooks.docx.names import XPath, get, barename
from calibre.utils.filenames import ascii_filename
from calibre.utils.imghdr import what
def emu_to_pt(x):
return x / 12700
def get_image_properties(parent):
width = height = None
for extent in XPath('./wp:extent')(parent):
try:
width = emu_to_pt(int(extent.get('cx')))
except (TypeError, ValueError):
pass
try:
height = emu_to_pt(int(extent.get('cy')))
except (TypeError, ValueError):
pass
ans = {}
if width is not None:
ans['width'] = '%.3gpt' % width
if height is not None:
ans['height'] = '%.3gpt' % height
alt = None
for docPr in XPath('./wp:docPr')(parent):
x = docPr.get('descr', None)
if x:
alt = x
if docPr.get('hidden', None) in {'true', 'on', '1'}:
ans['display'] = 'none'
return ans, alt
def get_image_margins(elem):
ans = {}
for w, css in {'L':'left', 'T':'top', 'R':'right', 'B':'bottom'}.iteritems():
val = elem.get('dist%s' % w, None)
if val is not None:
try:
val = emu_to_pt(int(val))
except (TypeError, ValueError):
continue
ans['padding-%s' % css] = '%.3gpt' % val
return ans
def get_hpos(anchor, page_width):
for ph in XPath('./wp:positionH')(anchor):
rp = ph.get('relativeFrom', None)
if rp == 'leftMargin':
return 0
if rp == 'rightMargin':
return 1
for align in XPath('./wp:align')(ph):
al = align.text
if al == 'left':
return 0
if al == 'center':
return 0.5
if al == 'right':
return 1
for po in XPath('./wp:posOffset')(ph):
try:
pos = emu_to_pt(int(po.text))
except (TypeError, ValueError):
continue
return pos/page_width
for sp in XPath('./wp:simplePos')(anchor):
try:
x = emu_to_pt(int(sp.get('x', None)))
except (TypeError, ValueError):
continue
return x/page_width
return 0
class Images(object):
def __init__(self):
self.rid_map = {}
self.used = {}
self.names = set()
self.all_images = set()
def __call__(self, relationships_by_id):
self.rid_map = relationships_by_id
def generate_filename(self, rid, base=None):
if rid in self.used:
return self.used[rid]
raw = self.docx.read(self.rid_map[rid])
base = base or ascii_filename(self.rid_map[rid].rpartition('/')[-1]).replace(' ', '_')
ext = what(None, raw) or base.rpartition('.')[-1] or 'jpeg'
base = base.rpartition('.')[0] + '.' + ext
exists = frozenset(self.used.itervalues())
c = 1
while base in exists:
n, e = base.rpartition('.')[0::2]
base = '%s-%d.%s' % (n, c, e)
c += 1
self.used[rid] = base
with open(os.path.join(self.dest_dir, base), 'wb') as f:
f.write(raw)
self.all_images.add('images/' + base)
return base
def pic_to_img(self, pic, alt=None):
name = None
for pr in XPath('descendant::pic:cNvPr')(pic):
name = pr.get('name', None)
if name:
name = ascii_filename(name).replace(' ', '_')
alt = pr.get('descr', None)
for a in XPath('descendant::a:blip[@r:embed]')(pic):
rid = get(a, 'r:embed')
if rid in self.rid_map:
src = self.generate_filename(rid, name)
img = IMG(src='images/%s' % src)
if alt:
img(alt=alt)
return img
def drawing_to_html(self, drawing, page):
# First process the inline pictures
for inline in XPath('./wp:inline')(drawing):
style, alt = get_image_properties(inline)
for pic in XPath('descendant::pic:pic')(inline):
ans = self.pic_to_img(pic, alt)
if ans is not None:
if style:
ans.set('style', '; '.join('%s: %s' % (k, v) for k, v in style.iteritems()))
yield ans
# Now process the floats
for anchor in XPath('./wp:anchor')(drawing):
style, alt = get_image_properties(anchor)
self.get_float_properties(anchor, style, page)
for pic in XPath('descendant::pic:pic')(anchor):
ans = self.pic_to_img(pic, alt)
if ans is not None:
if style:
ans.set('style', '; '.join('%s: %s' % (k, v) for k, v in style.iteritems()))
yield ans
def get_float_properties(self, anchor, style, page):
if 'display' not in style:
style['display'] = 'block'
padding = get_image_margins(anchor)
width = float(style.get('width', '100pt')[:-2])
page_width = page.width - page.margin_left - page.margin_right
hpos = get_hpos(anchor, page_width) + width/(2*page_width)
wrap_elem = None
dofloat = False
for child in reversed(anchor):
bt = barename(child.tag)
if bt in {'wrapNone', 'wrapSquare', 'wrapThrough', 'wrapTight', 'wrapTopAndBottom'}:
wrap_elem = child
dofloat = bt not in {'wrapNone', 'wrapTopAndBottom'}
break
if wrap_elem is not None:
padding.update(get_image_margins(wrap_elem))
wt = wrap_elem.get('wrapText', None)
hpos = 0 if wt == 'right' else 1 if wt == 'left' else hpos
if dofloat:
style['float'] = 'left' if hpos < 0.65 else 'right'
else:
ml, mr = (None, None) if hpos < 0.34 else ('auto', None) if hpos > 0.65 else ('auto', 'auto')
if ml is not None:
style['margin-left'] = ml
if mr is not None:
style['margin-right'] = mr
style.update(padding)
def to_html(self, elem, page, docx, dest_dir):
dest = os.path.join(dest_dir, 'images')
if not os.path.exists(dest):
os.mkdir(dest)
self.dest_dir, self.docx = dest, docx
if elem.tag.endswith('}drawing'):
for tag in self.drawing_to_html(elem, page):
yield tag
# TODO: Handle w:pict
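Drawing extents and offsets are stored in EMUs (914400 per inch, so 12700 per point), and the values arrive as attribute strings that must be parsed before dividing. The sketch below shows that conversion with the same error handling the module uses. `emu_attr_to_pt` is a hypothetical helper, not a function from the commit.

```python
EMU_PER_PT = 12700  # 914400 EMU per inch / 72 pt per inch


def emu_attr_to_pt(val):
    """Convert an EMU attribute string to points, or None if unusable."""
    try:
        return int(val) / EMU_PER_PT
    except (TypeError, ValueError):
        # Missing attribute (None) or non-numeric text
        return None


print('%.3gpt' % emu_attr_to_pt('914400'))
```

Returning `None` on bad input lets the caller skip the property, which mirrors the `continue`/`pass` branches in the functions above.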

View File

@ -0,0 +1,233 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
lcid = {
1078: 'af', # Afrikaans - South Africa
1052: 'sq', # Albanian - Albania
1118: 'am', # Amharic - Ethiopia
1025: 'ar', # Arabic - Saudi Arabia
5121: 'ar', # Arabic - Algeria
15361: 'ar', # Arabic - Bahrain
3073: 'ar', # Arabic - Egypt
2049: 'ar', # Arabic - Iraq
11265: 'ar', # Arabic - Jordan
13313: 'ar', # Arabic - Kuwait
12289: 'ar', # Arabic - Lebanon
4097: 'ar', # Arabic - Libya
6145: 'ar', # Arabic - Morocco
8193: 'ar', # Arabic - Oman
16385: 'ar', # Arabic - Qatar
10241: 'ar', # Arabic - Syria
7169: 'ar', # Arabic - Tunisia
14337: 'ar', # Arabic - U.A.E.
9217: 'ar', # Arabic - Yemen
1067: 'hy', # Armenian - Armenia
1101: 'as', # Assamese
2092: 'az', # Azeri (Cyrillic)
1068: 'az', # Azeri (Latin)
1069: 'eu', # Basque
1059: 'be', # Belarusian
1093: 'bn', # Bengali (India)
2117: 'bn', # Bengali (Bangladesh)
5146: 'bs', # Bosnian (Bosnia/Herzegovina)
1026: 'bg', # Bulgarian
1109: 'my', # Burmese
1027: 'ca', # Catalan
1116: 'chr', # Cherokee - United States
2052: 'zh', # Chinese - People's Republic of China
4100: 'zh', # Chinese - Singapore
1028: 'zh', # Chinese - Taiwan
3076: 'zh', # Chinese - Hong Kong SAR
5124: 'zh', # Chinese - Macao SAR
1050: 'hr', # Croatian
4122: 'hr', # Croatian (Bosnia/Herzegovina)
1029: 'cs', # Czech
1030: 'da', # Danish
1125: 'dv', # Divehi
1043: 'nl', # Dutch - Netherlands
2067: 'nl', # Dutch - Belgium
1126: 'bin', # Edo
1033: 'en', # English - United States
2057: 'en', # English - United Kingdom
3081: 'en', # English - Australia
10249: 'en', # English - Belize
4105: 'en', # English - Canada
9225: 'en', # English - Caribbean
15369: 'en', # English - Hong Kong SAR
16393: 'en', # English - India
14345: 'en', # English - Indonesia
6153: 'en', # English - Ireland
8201: 'en', # English - Jamaica
17417: 'en', # English - Malaysia
5129: 'en', # English - New Zealand
13321: 'en', # English - Philippines
18441: 'en', # English - Singapore
7177: 'en', # English - South Africa
11273: 'en', # English - Trinidad
12297: 'en', # English - Zimbabwe
1061: 'et', # Estonian
1080: 'fo', # Faroese
1065: None, # TODO: Farsi
1124: 'fil', # Filipino
1035: 'fi', # Finnish
1036: 'fr', # French - France
2060: 'fr', # French - Belgium
11276: 'fr', # French - Cameroon
3084: 'fr', # French - Canada
9228: 'fr', # French - Democratic Rep. of Congo
12300: 'fr', # French - Cote d'Ivoire
15372: 'fr', # French - Haiti
5132: 'fr', # French - Luxembourg
13324: 'fr', # French - Mali
6156: 'fr', # French - Monaco
14348: 'fr', # French - Morocco
58380: 'fr', # French - North Africa
8204: 'fr', # French - Reunion
10252: 'fr', # French - Senegal
4108: 'fr', # French - Switzerland
7180: 'fr', # French - West Indies
1122: 'fy', # Frisian - Netherlands
1127: None, # TODO: Fulfulde - Nigeria
1071: 'mk', # FYRO Macedonian
2108: 'ga', # Gaelic (Ireland)
1084: 'gd', # Gaelic (Scotland)
1110: 'gl', # Galician
1079: 'ka', # Georgian
1031: 'de', # German - Germany
3079: 'de', # German - Austria
5127: 'de', # German - Liechtenstein
4103: 'de', # German - Luxembourg
2055: 'de', # German - Switzerland
1032: 'el', # Greek
1140: 'gn', # Guarani - Paraguay
1095: 'gu', # Gujarati
1128: 'ha', # Hausa - Nigeria
1141: 'haw', # Hawaiian - United States
1037: 'he', # Hebrew
1081: 'hi', # Hindi
1038: 'hu', # Hungarian
1129: None, # TODO: Ibibio - Nigeria
1039: 'is', # Icelandic
1136: 'ig', # Igbo - Nigeria
1057: 'id', # Indonesian
1117: 'iu', # Inuktitut
1040: 'it', # Italian - Italy
2064: 'it', # Italian - Switzerland
1041: 'ja', # Japanese
1099: 'kn', # Kannada
1137: 'kr', # Kanuri - Nigeria
2144: 'ks', # Kashmiri
1120: 'ks', # Kashmiri (Arabic)
1087: 'kk', # Kazakh
1107: 'km', # Khmer
1111: 'kok', # Konkani
1042: 'ko', # Korean
1088: 'ky', # Kyrgyz (Cyrillic)
1108: 'lo', # Lao
1142: 'la', # Latin
1062: 'lv', # Latvian
1063: 'lt', # Lithuanian
1086: 'ms', # Malay - Malaysia
2110: 'ms', # Malay - Brunei Darussalam
1100: 'ml', # Malayalam
1082: 'mt', # Maltese
1112: 'mni', # Manipuri
1153: 'mi', # Maori - New Zealand
1102: 'mr', # Marathi
1104: 'mn', # Mongolian (Cyrillic)
2128: 'mn', # Mongolian (Mongolian)
1121: 'ne', # Nepali
2145: 'ne', # Nepali - India
1044: 'no', # Norwegian (Bokmål)
2068: 'no', # Norwegian (Nynorsk)
1096: 'or', # Oriya
1138: 'om', # Oromo
1145: 'pap', # Papiamentu
1123: 'ps', # Pashto
1045: 'pl', # Polish
1046: 'pt', # Portuguese - Brazil
2070: 'pt', # Portuguese - Portugal
1094: 'pa', # Punjabi
2118: 'pa', # Punjabi (Pakistan)
1131: 'qu', # Quechua - Bolivia
2155: 'qu', # Quechua - Ecuador
3179: 'qu', # Quechua - Peru
1047: 'rm', # Rhaeto-Romanic
1048: 'ro', # Romanian
2072: 'ro', # Romanian - Moldova
1049: 'ru', # Russian
2073: 'ru', # Russian - Moldova
1083: 'se', # Sami (Lappish)
1103: 'sa', # Sanskrit
1132: 'nso', # Sepedi
3098: 'sr', # Serbian (Cyrillic)
2074: 'sr', # Serbian (Latin)
1113: 'sd', # Sindhi - India
2137: 'sd', # Sindhi - Pakistan
1115: 'si', # Sinhalese - Sri Lanka
1051: 'sk', # Slovak
1060: 'sl', # Slovenian
1143: 'so', # Somali
1070: 'wen', # Sorbian
3082: 'es', # Spanish - Spain (Modern Sort)
1034: 'es', # Spanish - Spain (Traditional Sort)
11274: 'es', # Spanish - Argentina
16394: 'es', # Spanish - Bolivia
13322: 'es', # Spanish - Chile
9226: 'es', # Spanish - Colombia
5130: 'es', # Spanish - Costa Rica
7178: 'es', # Spanish - Dominican Republic
12298: 'es', # Spanish - Ecuador
17418: 'es', # Spanish - El Salvador
4106: 'es', # Spanish - Guatemala
18442: 'es', # Spanish - Honduras
58378: 'es', # Spanish - Latin America
2058: 'es', # Spanish - Mexico
19466: 'es', # Spanish - Nicaragua
6154: 'es', # Spanish - Panama
15370: 'es', # Spanish - Paraguay
10250: 'es', # Spanish - Peru
20490: 'es', # Spanish - Puerto Rico
21514: 'es', # Spanish - United States
14346: 'es', # Spanish - Uruguay
8202: 'es', # Spanish - Venezuela
1072: None, # TODO: Sutu
1089: 'sw', # Swahili
1053: 'sv', # Swedish
2077: 'sv', # Swedish - Finland
1114: 'syr', # Syriac
1064: 'tg', # Tajik
1119: None, # TODO: Tamazight (Arabic)
2143: None, # TODO: Tamazight (Latin)
1097: 'ta', # Tamil
1092: 'tt', # Tatar
1098: 'te', # Telugu
1054: 'th', # Thai
2129: 'bo', # Tibetan - Bhutan
1105: 'bo', # Tibetan - People's Republic of China
2163: 'ti', # Tigrigna - Eritrea
1139: 'ti', # Tigrigna - Ethiopia
1073: 'ts', # Tsonga
1074: 'tn', # Tswana
1055: 'tr', # Turkish
1090: 'tk', # Turkmen
1152: 'ug', # Uighur - China
1058: 'uk', # Ukrainian
1056: 'ur', # Urdu
2080: 'ur', # Urdu - India
2115: 'uz', # Uzbek (Cyrillic)
1091: 'uz', # Uzbek (Latin)
1075: 've', # Venda
1066: 'vi', # Vietnamese
1106: 'cy', # Welsh
1076: 'xh', # Xhosa
1144: 'ii', # Yi
1085: 'yi', # Yiddish
1130: 'yo', # Yoruba
1077: 'zu' # Zulu
}
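
Review note: several entries in the table map to `None` (the TODO languages), so callers need to handle both a missing key and a `None` value. The sketch below shows one way a lookup could do that. It uses a four-entry excerpt of the table, and the helper name `lang_for_lcid` and its `default` parameter are hypothetical; nothing in this commit defines them.

```python
# Excerpt of the LCID table above. A value of None marks a language
# whose code has not been filled in yet.
LCID_EXCERPT = {
    1033: 'en',   # English - United States
    2057: 'en',   # English - United Kingdom
    1036: 'fr',   # French - France
    1065: None,   # TODO: Farsi
}


def lang_for_lcid(code, table=LCID_EXCERPT, default=None):
    """Map a Windows LCID (int or numeric string) to a language code.

    Returns `default` for unparsable input, unknown LCIDs and
    entries that are still None in the table.
    """
    try:
        code = int(code)
    except (TypeError, ValueError):
        return default
    lang = table.get(code)
    return default if lang is None else lang
```

Accepting strings is an assumption made here because attribute values read from XML arrive as text.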


@ -6,12 +6,23 @@ from __future__ import (unicode_literals, division, absolute_import,
 __license__ = 'GPL v3'
 __copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'

+import re
+from future_builtins import map
+
 from lxml.etree import XPath as X

-DOCUMENT = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument'
-DOCPROPS = 'http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties'
-APPPROPS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties'
-STYLES = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles'
+from calibre.utils.filenames import ascii_text
+
+DOCUMENT = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument'
+DOCPROPS = 'http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties'
+APPPROPS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties'
+STYLES = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles'
+NUMBERING = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/numbering'
+FONTS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable'
+IMAGES = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/image'
+LINKS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink'
+FOOTNOTES = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/footnotes'
+ENDNOTES = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/endnotes'

 namespaces = {
     'mo': 'http://schemas.microsoft.com/office/mac/office/2008/main',
@ -44,8 +55,13 @@ namespaces = {
     'dcterms': 'http://purl.org/dc/terms/'
 }

+xpath_cache = {}
+
 def XPath(expr):
-    return X(expr, namespaces=namespaces)
+    ans = xpath_cache.get(expr, None)
+    if ans is None:
+        xpath_cache[expr] = ans = X(expr, namespaces=namespaces)
+    return ans

 def is_tag(x, q):
     tag = getattr(x, 'tag', x)
@ -58,7 +74,32 @@ def barename(x):
 def XML(x):
     return '{%s}%s' % (namespaces['xml'], x)

-def get(x, attr, default=None):
-    ns, name = attr.partition(':')[0::2]
-    return x.attrib.get('{%s}%s' % (namespaces[ns], name), default)
+def expand(name):
+    ns, tag = name.partition(':')[0::2]
+    if ns:
+        tag = '{%s}%s' % (namespaces[ns], tag)
+    return tag
+
+def get(x, attr, default=None):
+    return x.attrib.get(expand(attr), default)
+
+def ancestor(elem, name):
+    tag = expand(name)
+    while elem is not None:
+        elem = elem.getparent()
+        if getattr(elem, 'tag', None) == tag:
+            return elem
+
+def generate_anchor(name, existing):
+    x = y = 'id_' + re.sub(r'[^0-9a-zA-Z_]', '', ascii_text(name)).lstrip('_')
+    c = 1
+    while y in existing:
+        y = '%s_%d' % (x, c)
+        c += 1
+    return y
+
+def children(elem, *args):
+    return elem.iterchildren(*map(expand, args))
+
+def descendants(elem, *args):
+    return elem.iterdescendants(*map(expand, args))
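
Review note: `expand` and `generate_anchor` are easy to exercise outside calibre. The sketch below is a standalone approximation, not the code from the commit: `NAMESPACES` holds only the `w` prefix, and `to_ascii` is a hypothetical stand-in for calibre's `ascii_text`, built on `unicodedata`, so its output may differ from calibre's for some inputs.

```python
import re
import unicodedata

# Only the WordprocessingML prefix; the real table has many more entries.
NAMESPACES = {
    'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main',
}


def expand(name):
    """Turn a prefixed name such as 'w:p' into lxml's '{uri}p' form."""
    ns, tag = name.partition(':')[0::2]
    if ns:
        tag = '{%s}%s' % (NAMESPACES[ns], tag)
    return tag


def to_ascii(text):
    # Stand-in for calibre's ascii_text (an assumption): decompose
    # accented characters and drop whatever is not ASCII.
    decomposed = unicodedata.normalize('NFKD', text)
    return decomposed.encode('ascii', 'ignore').decode('ascii')


def generate_anchor(name, existing):
    """Build an id from name that does not collide with ids in existing."""
    x = y = 'id_' + re.sub(r'[^0-9a-zA-Z_]', '', to_ascii(name)).lstrip('_')
    c = 1
    while y in existing:
        # Append a numeric suffix until the id is unused.
        y = '%s_%d' % (x, c)
        c += 1
    return y
```

Note that `generate_anchor` does not add the result to `existing`; the caller is expected to record it before asking for the next anchor.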


@ -0,0 +1,300 @@
#!/usr/bin/env python
# vim:fileencoding=utf-8
from __future__ import (unicode_literals, division, absolute_import,
print_function)
__license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
import re
from collections import Counter
from lxml.html.builder import OL, UL, SPAN
from calibre.ebooks.docx.block_styles import ParagraphStyle
from calibre.ebooks.docx.char_styles import RunStyle
from calibre.ebooks.docx.names import XPath, get
STYLE_MAP = {
    'aiueo': 'hiragana',
    'aiueoFullWidth': 'hiragana',
    'hebrew1': 'hebrew',
    'iroha': 'katakana-iroha',
    'irohaFullWidth': 'katakana-iroha',
    'lowerLetter': 'lower-alpha',
    'lowerRoman': 'lower-roman',
    'none': 'none',
    'upperLetter': 'upper-alpha',
    'upperRoman': 'upper-roman',
    'chineseCounting': 'cjk-ideographic',
    'decimalZero': 'decimal-leading-zero',
}
class Level(object):

    def __init__(self, lvl=None):
        self.restart = None
        self.start = 0
        self.fmt = 'decimal'
        self.para_link = None
        self.paragraph_style = self.character_style = None
        self.is_numbered = False
        self.num_template = None

        if lvl is not None:
            self.read_from_xml(lvl)

    def copy(self):
        ans = Level()
        for x in ('restart', 'start', 'fmt', 'para_link', 'paragraph_style', 'character_style', 'is_numbered', 'num_template'):
            setattr(ans, x, getattr(self, x))
        return ans

    def format_template(self, counter, ilvl):
        def sub(m):
            x = int(m.group(1)) - 1
            if x > ilvl or x not in counter:
                return ''
            return '%d' % (counter[x] - (0 if x == ilvl else 1))
        return re.sub(r'%(\d+)', sub, self.num_template).rstrip() + '\xa0'

    def read_from_xml(self, lvl, override=False):
        for lr in XPath('./w:lvlRestart[@w:val]')(lvl):
            try:
                self.restart = int(get(lr, 'w:val'))
            except (TypeError, ValueError):
                pass

        for lr in XPath('./w:start[@w:val]')(lvl):
            try:
                self.start = int(get(lr, 'w:val'))
            except (TypeError, ValueError):
                pass

        lt = None
        for lr in XPath('./w:lvlText[@w:val]')(lvl):
            lt = get(lr, 'w:val')

        for lr in XPath('./w:numFmt[@w:val]')(lvl):
            val = get(lr, 'w:val')
            if val == 'bullet':
                self.is_numbered = False
                self.fmt = {'\uf0a7':'square', 'o':'circle'}.get(lt, 'disc')
            else:
                self.is_numbered = True
                self.fmt = STYLE_MAP.get(val, 'decimal')
                if lt and re.match(r'%\d+\.$', lt) is None:
                    self.num_template = lt

        for lr in XPath('./w:pStyle[@w:val]')(lvl):
            self.para_link = get(lr, 'w:val')

        for pPr in XPath('./w:pPr')(lvl):
            ps = ParagraphStyle(pPr)
            if self.paragraph_style is None:
                self.paragraph_style = ps
            else:
                self.paragraph_style.update(ps)

        for rPr in XPath('./w:rPr')(lvl):
            ps = RunStyle(rPr)
            if self.character_style is None:
                self.character_style = ps
            else:
                self.character_style.update(ps)
class NumberingDefinition(object):

    def __init__(self, parent=None):
        self.levels = {}
        if parent is not None:
            for lvl in XPath('./w:lvl')(parent):
                try:
                    ilvl = int(get(lvl, 'w:ilvl', 0))
                except (TypeError, ValueError):
                    ilvl = 0
                self.levels[ilvl] = Level(lvl)

    def copy(self):
        ans = NumberingDefinition()
        for l, lvl in self.levels.iteritems():
            ans.levels[l] = lvl.copy()
        return ans
class Numbering(object):

    def __init__(self):
        self.definitions = {}
        self.instances = {}
        self.counters = {}

    def __call__(self, root, styles):
        ' Read all numbering style definitions '
        lazy_load = {}
        for an in XPath('./w:abstractNum[@w:abstractNumId]')(root):
            an_id = get(an, 'w:abstractNumId')
            nsl = XPath('./w:numStyleLink[@w:val]')(an)
            if nsl:
                lazy_load[an_id] = get(nsl[0], 'w:val')
            else:
                nd = NumberingDefinition(an)
                self.definitions[an_id] = nd

        def create_instance(n, definition):
            nd = definition.copy()
            for lo in XPath('./w:lvlOverride')(n):
                ilvl = get(lo, 'w:ilvl')
                for lvl in XPath('./w:lvl')(lo)[:1]:
                    nilvl = get(lvl, 'w:ilvl')
                    ilvl = nilvl if ilvl is None else ilvl
                    alvl = nd.levels.get(ilvl, None)
                    if alvl is None:
                        alvl = Level()
                    alvl.read_from_xml(lvl, override=True)
            return nd

        next_pass = {}
        for n in XPath('./w:num[@w:numId]')(root):
            an_id = None
            num_id = get(n, 'w:numId')
            for an in XPath('./w:abstractNumId[@w:val]')(n):
                an_id = get(an, 'w:val')
            d = self.definitions.get(an_id, None)
            if d is None:
                next_pass[num_id] = (an_id, n)
                continue
            self.instances[num_id] = create_instance(n, d)

        numbering_links = styles.numbering_style_links
        for an_id, style_link in lazy_load.iteritems():
            num_id = numbering_links[style_link]
            self.definitions[an_id] = self.instances[num_id].copy()

        for num_id, (an_id, n) in next_pass.iteritems():
            d = self.definitions.get(an_id, None)
            if d is not None:
                self.instances[num_id] = create_instance(n, d)

        for num_id, d in self.instances.iteritems():
            self.counters[num_id] = Counter({lvl:d.levels[lvl].start for lvl in d.levels})

    def get_pstyle(self, num_id, style_id):
        d = self.instances.get(num_id, None)
        if d is not None:
            for ilvl, lvl in d.levels.iteritems():
                if lvl.para_link == style_id:
                    return ilvl

    def get_para_style(self, num_id, lvl):
        d = self.instances.get(num_id, None)
        if d is not None:
            lvl = d.levels.get(lvl, None)
            return getattr(lvl, 'paragraph_style', None)

    def update_counter(self, counter, levelnum, levels):
        counter[levelnum] += 1
        for ilvl, lvl in levels.iteritems():
            restart = lvl.restart
            if (restart is None and ilvl == levelnum + 1) or restart == levelnum + 1:
                counter[ilvl] = lvl.start

    def apply_markup(self, items, body, styles, object_map):
        for p, num_id, ilvl in items:
            d = self.instances.get(num_id, None)
            if d is not None:
                lvl = d.levels.get(ilvl, None)
                if lvl is not None:
                    counter = self.counters[num_id]
                    p.tag = 'li'
                    p.set('value', '%s' % counter[ilvl])
                    p.set('list-lvl', str(ilvl))
                    p.set('list-id', num_id)
                    if lvl.num_template is not None:
                        val = lvl.format_template(counter, ilvl)
                        p.set('list-template', val)
                    self.update_counter(counter, ilvl, d.levels)

        templates = {}

        def commit(current_run):
            if not current_run:
                return
            start = current_run[0]
            parent = start.getparent()
            idx = parent.index(start)

            d = self.instances[start.get('list-id')]
            ilvl = int(start.get('list-lvl'))
            lvl = d.levels[ilvl]
            lvlid = start.get('list-id') + start.get('list-lvl')
            wrap = (OL if lvl.is_numbered else UL)('\n\t')
            has_template = 'list-template' in start.attrib
            if has_template:
                wrap.set('lvlid', lvlid)
            else:
                wrap.set('class', styles.register({'list-style-type': lvl.fmt}, 'list'))
            parent.insert(idx, wrap)
            last_val = None
            for child in current_run:
                wrap.append(child)
                child.tail = '\n\t'
                if has_template:
                    span = SPAN()
                    span.text = child.text
                    child.text = None
                    for gc in child:
                        span.append(gc)
                    child.append(span)
                    span = SPAN(child.get('list-template'))
                    last = templates.get(lvlid, '')
                    if span.text and len(span.text) > len(last):
                        templates[lvlid] = span.text
                    child.insert(0, span)
                for attr in ('list-lvl', 'list-id', 'list-template'):
                    child.attrib.pop(attr, None)
                val = int(child.get('value'))
                if last_val == val - 1 or wrap.tag == 'ul':
                    child.attrib.pop('value')
                last_val = val
            current_run[-1].tail = '\n'
            del current_run[:]

        parents = set()
        for child in body.iterdescendants('li'):
            parents.add(child.getparent())

        for parent in parents:
            current_run = []
            for child in parent:
                if child.tag == 'li':
                    if current_run:
                        last = current_run[-1]
                        if (last.get('list-id'), last.get('list-lvl')) != (child.get('list-id'), child.get('list-lvl')):
                            commit(current_run)
                    current_run.append(child)
                else:
                    commit(current_run)
            commit(current_run)

        for wrap in body.xpath('//ol[@lvlid]'):
            lvlid = wrap.attrib.pop('lvlid')
            wrap.tag = 'div'
            text = ''
            maxtext = templates.get(lvlid, '').replace('.', '')[:-1]
            for li in wrap.iterchildren('li'):
                t = li[0].text
                if t and len(t) > len(text):
                    text = t
            for i, li in enumerate(wrap.iterchildren('li')):
                li.tag = 'div'
                li.attrib.pop('value', None)
                li.set('style', 'display:table-row')
                obj = object_map[li]
                bs = styles.para_cache[obj]
                if i == 0:
                    m = len(maxtext)  # Move the table left to simulate the behavior of a list (number is to the left of text margin)
                    wrap.set('style', 'display:table; margin-left: -%dem; padding-left: %s' % (m, bs.css.get('margin-left', 0)))
                bs.css.pop('margin-left', None)
                for child in li:
                    child.set('style', 'display:table-cell')
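
Review note: the interaction between `format_template` and `update_counter` is the least obvious part of this file, so a small simulation may help. The sketch below restates both as plain functions and labels a short run of list items. It is a simplification under stated assumptions: levels are reduced to two dictionaries (`starts` and `restarts`) instead of `Level` objects, and `label_items` is a hypothetical driver that mirrors only the first loop of `apply_markup`.

```python
import re
from collections import Counter


def format_template(num_template, counter, ilvl):
    """Expand %1, %2, ... in a level text using the current counters."""
    def sub(m):
        x = int(m.group(1)) - 1          # %1 refers to level 0
        if x > ilvl or x not in counter:
            return ''
        # Ancestor levels were already incremented past the item that
        # contains this one, hence the -1 for every level above ilvl.
        return '%d' % (counter[x] - (0 if x == ilvl else 1))
    return re.sub(r'%(\d+)', sub, num_template).rstrip() + '\xa0'


def update_counter(counter, levelnum, starts, restarts):
    """Advance levelnum and reset the levels that restart after it."""
    counter[levelnum] += 1
    for ilvl, start in starts.items():
        restart = restarts.get(ilvl)
        if (restart is None and ilvl == levelnum + 1) or restart == levelnum + 1:
            counter[ilvl] = start


def label_items(item_levels, templates, starts, restarts=None):
    """Return the label text for each item, given its nesting level."""
    restarts = restarts or {}
    counter = Counter(starts)
    labels = []
    for ilvl in item_levels:
        labels.append(format_template(templates[ilvl], counter, ilvl))
        update_counter(counter, ilvl, starts, restarts)
    return labels


# Two levels, both starting at 1, with "1)" and "1.1)" style labels.
labels = label_items([0, 1, 1, 0, 1], {0: '%1)', 1: '%1.%2)'}, {0: 1, 1: 1})
```

Each label ends in a non-breaking space, as in the method above, so the number and the paragraph text stay on the same line.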


@ -6,205 +6,56 @@ from __future__ import (unicode_literals, division, absolute_import,
__license__ = 'GPL v3' __license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>' __copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
from collections import OrderedDict import textwrap
from collections import OrderedDict, Counter
from calibre.ebooks.docx.block_styles import ParagraphStyle, inherit
from calibre.ebooks.docx.char_styles import RunStyle
from calibre.ebooks.docx.names import XPath, get from calibre.ebooks.docx.names import XPath, get
class Inherit: class PageProperties(object):
pass
inherit = Inherit()
def binary_property(parent, name): '''
vals = XPath('./w:%s') Class representing page level properties (page size/margins) read from
if not vals: sectPr elements.
return inherit '''
val = get(vals[0], 'w:val', 'on')
return True if val in {'on', '1', 'true'} else False
def simple_color(col): def __init__(self, elems=()):
if not col or col == 'auto' or len(col) != 6: self.width, self.height = 595.28, 841.89 # pts, A4
return 'black' self.margin_left = self.margin_right = 72 # pts
return '#'+col for sectPr in elems:
for pgSz in XPath('./w:pgSz')(sectPr):
def simple_float(val, mult=1.0): w, h = get(pgSz, 'w:w'), get(pgSz, 'w:h')
try:
return float(val) * mult
except (ValueError, TypeError, AttributeError, KeyError):
return None
# Block styles {{{
LINE_STYLES = { # {{{
'basicBlackDashes': 'dashed',
'basicBlackDots': 'dotted',
'basicBlackSquares': 'dashed',
'basicThinLines': 'solid',
'dashDotStroked': 'groove',
'dashed': 'dashed',
'dashSmallGap': 'dashed',
'dotDash': 'dashed',
'dotDotDash': 'dashed',
'dotted': 'dotted',
'double': 'double',
'inset': 'inset',
'nil': 'none',
'none': 'none',
'outset': 'outset',
'single': 'solid',
'thick': 'solid',
'thickThinLargeGap': 'double',
'thickThinMediumGap': 'double',
'thickThinSmallGap' : 'double',
'thinThickLargeGap': 'double',
'thinThickMediumGap': 'double',
'thinThickSmallGap': 'double',
'thinThickThinLargeGap': 'double',
'thinThickThinMediumGap': 'double',
'thinThickThinSmallGap': 'double',
'threeDEmboss': 'ridge',
'threeDEngrave': 'groove',
'triple': 'double',
} # }}}
def read_border(border, dest):
all_attrs = set()
for edge in ('left', 'top', 'right', 'bottom'):
vals = {'padding_%s':inherit, 'border_%s_width':inherit,
'border_%s_style':inherit, 'border_%s_color':inherit}
all_attrs |= {key % edge for key in vals}
for elem in XPath('./w:%s' % edge):
color = get(elem, 'w:color')
if color is not None:
vals['border_%s_color'] = simple_color(color)
style = get(elem, 'w:val')
if style is not None:
vals['border_%s_style'] = LINE_STYLES.get(style, 'solid')
space = get(elem, 'w:space')
if space is not None:
try: try:
vals['padding_%s'] = float(space) self.width = int(w)/20
except (ValueError, TypeError): except (ValueError, TypeError):
pass pass
sz = get(elem, 'w:space')
if sz is not None:
# we dont care about art borders (they are only used for page borders)
try: try:
vals['border_%s_width'] = min(96, max(2, float(sz))) * 8 self.height = int(h)/20
except (ValueError, TypeError):
pass
for pgMar in XPath('./w:pgMar')(sectPr):
l, r = get(pgMar, 'w:left'), get(pgMar, 'w:right')
try:
self.margin_left = int(l)/20
except (ValueError, TypeError):
pass
try:
self.margin_right = int(r)/20
except (ValueError, TypeError): except (ValueError, TypeError):
pass pass
for key, val in vals.iteritems():
setattr(dest, key % edge, val)
return all_attrs
def read_indent(parent, dest):
padding_left = padding_right = text_indent = inherit
for indent in XPath('./w:ind')(parent):
l, lc = get(indent, 'w:left'), get(indent, 'w:leftChars')
pl = simple_float(lc, 0.01) if lc is not None else simple_float(l, 0.05) if l is not None else None
if pl is not None:
padding_left = '%.3f%s' % (pl, 'em' if lc is not None else 'pt')
r, rc = get(indent, 'w:right'), get(indent, 'w:rightChars')
pr = simple_float(rc, 0.01) if rc is not None else simple_float(r, 0.05) if r is not None else None
if pr is not None:
padding_right = '%.3f%s' % (pr, 'em' if rc is not None else 'pt')
h, hc = get(indent, 'w:hanging'), get(indent, 'w:hangingChars')
fl, flc = get(indent, 'w:firstLine'), get(indent, 'w:firstLineChars')
ti = (simple_float(hc, 0.01) if hc is not None else simple_float(h, 0.05) if h is not None else
simple_float(flc, 0.01) if flc is not None else simple_float(fl, 0.05) if fl is not None else None)
if ti is not None:
text_indent = '%.3f' % (ti, 'em' if hc is not None or (h is None and flc is not None) else 'pt')
setattr(dest, 'padding_left', padding_left)
setattr(dest, 'padding_right', padding_right)
setattr(dest, 'text_indent', text_indent)
return {'padding_left', 'padding_right', 'text_indent'}
def read_justification(parent, dest):
ans = inherit
for jc in XPath('./w:jc[@w:val]')(parent):
val = get(jc, 'w:val')
if not val:
continue
if val in {'both', 'distribute'} or 'thai' in val or 'kashida' in val:
ans = 'justify'
if val in {'left', 'center', 'right',}:
ans = val
setattr(dest, 'text_align', ans)
return {'text_align'}
def read_spacing(parent, dest):
padding_top = padding_bottom = line_height = inherit
for s in XPath('./w:spacing')(parent):
a, al, aa = get(s, 'w:after'), get(s, 'w:afterLines'), get(s, 'w:afterAutospacing')
pb = None if aa in {'on', '1', 'true'} else simple_float(al, 0.02) if al is not None else simple_float(a, 0.05) if a is not None else None
if pb is not None:
padding_bottom = '%.3f%s' % (pb, 'ex' if al is not None else 'pt')
b, bl, bb = get(s, 'w:before'), get(s, 'w:beforeLines'), get(s, 'w:beforeAutospacing')
pt = None if bb in {'on', '1', 'true'} else simple_float(bl, 0.02) if bl is not None else simple_float(b, 0.05) if b is not None else None
if pt is not None:
padding_top = '%.3f%s' % (pt, 'ex' if bl is not None else 'pt')
l, lr = get(s, 'w:line'), get(s, 'w:lineRule', 'auto')
if l is not None:
lh = simple_float(l, 0.05) if lr in {'exactly', 'atLeast'} else simple_float(l, 1/240.0)
line_height = '%.3f%s' % (lh, 'pt' if lr in {'exactly', 'atLeast'} else '')
setattr(dest, 'padding_top', padding_top)
setattr(dest, 'padding_bottom', padding_bottom)
setattr(dest, 'line_height', line_height)
return {'padding_top', 'padding_bottom', 'line_height'}
def read_direction(parent, dest):
ans = inherit
for jc in XPath('./w:textFlow[@w:val]')(parent):
val = get(jc, 'w:val')
if not val:
continue
if 'rl' in val.lower():
ans = 'rtl'
setattr(dest, 'direction', ans)
return {'direction'}
class ParagraphStyle(object):
border_path = XPath('./w:pBdr')
def __init__(self, pPr):
self.all_properties = set()
for p in (
'adjustRightInd', 'autoSpaceDE', 'autoSpaceDN',
'bidi', 'contextualSpacing', 'keepLines', 'keepNext',
'mirrorIndents', 'pageBreakBefore', 'snapToGrid',
'suppressLineNumbers', 'suppressOverlap', 'topLinePunct',
'widowControl', 'wordWrap',
):
self.all_properties.add(p)
setattr(p, binary_property(pPr, p))
for border in self.border_path(pPr):
self.all_properties |= read_border(border, self)
self.all_properties |= read_indent(pPr, self)
self.all_properties |= read_justification(pPr, self)
self.all_properties |= read_spacing(pPr, self)
self.all_properties |= read_direction(pPr, self)
# TODO: numPr and outlineLvl
# }}}
class Style(object): class Style(object):
'''
Class representing a <w:style> element. Can contain block, character, etc. styles.
'''
name_path = XPath('./w:name[@w:val]') name_path = XPath('./w:name[@w:val]')
based_on_path = XPath('./w:basedOn[@w:val]') based_on_path = XPath('./w:basedOn[@w:val]')
link_path = XPath('./w:link[@w:val]')
def __init__(self, elem): def __init__(self, elem):
self.resolved = False
self.style_id = get(elem, 'w:styleId') self.style_id = get(elem, 'w:styleId')
self.style_type = get(elem, 'w:type') self.style_type = get(elem, 'w:type')
names = self.name_path(elem) names = self.name_path(elem)
@ -213,16 +64,57 @@ class Style(object):
self.based_on = get(based_on[0], 'w:val') if based_on else None self.based_on = get(based_on[0], 'w:val') if based_on else None
if self.style_type == 'numbering': if self.style_type == 'numbering':
self.based_on = None self.based_on = None
link = self.link_path(elem) self.is_default = get(elem, 'w:default') in {'1', 'on', 'true'}
self.link = get(link[0], 'w:val') if link else None
if self.style_type not in {'paragraph', 'character'}: self.paragraph_style = self.character_style = None
self.link = None
if self.style_type in {'paragraph', 'character'}:
if self.style_type == 'paragraph':
for pPr in XPath('./w:pPr')(elem):
ps = ParagraphStyle(pPr)
if self.paragraph_style is None:
self.paragraph_style = ps
else:
self.paragraph_style.update(ps)
for rPr in XPath('./w:rPr')(elem):
rs = RunStyle(rPr)
if self.character_style is None:
self.character_style = rs
else:
self.character_style.update(rs)
if self.style_type == 'numbering':
self.numbering_style_link = None
for x in XPath('./w:pPr/w:numPr/w:numId[@w:val]')(elem):
self.numbering_style_link = get(x, 'w:val')
def resolve_based_on(self, parent):
if parent.paragraph_style is not None:
if self.paragraph_style is None:
self.paragraph_style = ParagraphStyle()
self.paragraph_style.resolve_based_on(parent.paragraph_style)
if parent.character_style is not None:
if self.character_style is None:
self.character_style = RunStyle()
self.character_style.resolve_based_on(parent.character_style)
class Styles(object): class Styles(object):
'''
Collection of all styles defined in the document. Used to get the final styles applicable to elements in the document markup.
'''
def __init__(self): def __init__(self):
self.id_map = OrderedDict() self.id_map = OrderedDict()
self.para_cache = {}
self.para_char_cache = {}
self.run_cache = {}
self.classes = {}
self.counter = Counter()
self.default_styles = {}
self.numbering_style_links = {}
def __iter__(self): def __iter__(self):
for s in self.id_map.itervalues(): for s in self.id_map.itervalues():
@ -237,27 +129,279 @@ class Styles(object):
def get(self, key, default=None): def get(self, key, default=None):
return self.id_map.get(key, default) return self.id_map.get(key, default)
def __call__(self, root): def __call__(self, root, fonts):
self.fonts = fonts
for s in XPath('//w:style')(root): for s in XPath('//w:style')(root):
s = Style(s) s = Style(s)
if s.style_id: if s.style_id:
self.id_map[s.style_id] = s self.id_map[s.style_id] = s
if s.is_default:
self.default_styles[s.style_type] = s
if s.style_type == 'numbering' and s.numbering_style_link:
self.numbering_style_links[s.style_id] = s.numbering_style_link
self.default_paragraph_style = self.default_character_style = None
for dd in XPath('./w:docDefaults')(root):
for pd in XPath('./w:pPrDefault')(dd):
for pPr in XPath('./w:pPr')(pd):
ps = ParagraphStyle(pPr)
if self.default_paragraph_style is None:
self.default_paragraph_style = ps
else:
self.default_paragraph_style.update(ps)
for pd in XPath('./w:rPrDefault')(dd):
for pPr in XPath('./w:rPr')(pd):
ps = RunStyle(pPr)
if self.default_character_style is None:
self.default_character_style = ps
else:
self.default_character_style.update(ps)
def resolve(s, p):
if p is not None:
if not p.resolved:
resolve(p, self.get(p.based_on))
s.resolve_based_on(p)
s.resolved = True
# Nuke based_on, link attributes that refer to non-existing/incompatible
# parents
for s in self: for s in self:
bo = s.based_on if not s.resolved:
if bo is not None: resolve(s, self.get(s.based_on))
p = self.get(bo)
if p is None or p.style_type != s.style_type:
s.based_on = None
link = s.link
if link is not None:
p = self.get(link)
if p is None or (s.style_type, p.style_type) not in {('paragraph', 'character'), ('character', 'paragraph')}:
s.link = None
# TODO: Document defaults (docDefaults) def para_val(self, parent_styles, direct_formatting, attr):
val = getattr(direct_formatting, attr)
if val is inherit:
for ps in reversed(parent_styles):
pval = getattr(ps, attr)
if pval is not inherit:
val = pval
break
return val
def run_val(self, parent_styles, direct_formatting, attr):
val = getattr(direct_formatting, attr)
if val is not inherit:
return val
if attr in direct_formatting.toggle_properties:
val = False
for rs in parent_styles:
pval = getattr(rs, attr)
if pval is True:
val ^= True
return val
for rs in reversed(parent_styles):
rval = getattr(rs, attr)
if rval is not inherit:
return rval
return val
def resolve_paragraph(self, p):
ans = self.para_cache.get(p, None)
if ans is None:
ans = self.para_cache[p] = ParagraphStyle()
ans.style_name = None
direct_formatting = None
for pPr in XPath('./w:pPr')(p):
ps = ParagraphStyle(pPr)
if direct_formatting is None:
direct_formatting = ps
else:
direct_formatting.update(ps)
if direct_formatting is None:
direct_formatting = ParagraphStyle()
parent_styles = []
if self.default_paragraph_style is not None:
parent_styles.append(self.default_paragraph_style)
default_para = self.default_styles.get('paragraph', None)
if direct_formatting.linked_style is not None:
ls = self.get(direct_formatting.linked_style)
if ls is not None:
ans.style_name = ls.name
ps = ls.paragraph_style
if ps is not None:
parent_styles.append(ps)
if ls.character_style is not None:
self.para_char_cache[p] = ls.character_style
elif default_para is not None:
if default_para.paragraph_style is not None:
parent_styles.append(default_para.paragraph_style)
if default_para.character_style is not None:
self.para_char_cache[p] = default_para.character_style
is_numbering = direct_formatting.numbering is not inherit
if is_numbering:
num_id, lvl = direct_formatting.numbering
if num_id is not None:
p.set('calibre_num_id', '%s:%s' % (lvl, num_id))
if num_id is not None and lvl is not None:
ps = self.numbering.get_para_style(num_id, lvl)
if ps is not None:
parent_styles.append(ps)
for attr in ans.all_properties:
if not (is_numbering and attr == 'text_indent'): # skip text-indent for lists
setattr(ans, attr, self.para_val(parent_styles, direct_formatting, attr))
return ans
def resolve_run(self, r):
ans = self.run_cache.get(r, None)
if ans is None:
p = r.getparent()
ans = self.run_cache[r] = RunStyle()
direct_formatting = None
for rPr in XPath('./w:rPr')(r):
rs = RunStyle(rPr)
if direct_formatting is None:
direct_formatting = rs
else:
direct_formatting.update(rs)
if direct_formatting is None:
direct_formatting = RunStyle()
parent_styles = []
default_char = self.default_styles.get('character', None)
if self.default_character_style is not None:
parent_styles.append(self.default_character_style)
pstyle = self.para_char_cache.get(p, None)
if pstyle is not None:
parent_styles.append(pstyle)
if direct_formatting.linked_style is not None:
ls = self.get(direct_formatting.linked_style).character_style
if ls is not None:
parent_styles.append(ls)
elif default_char is not None and default_char.character_style is not None:
parent_styles.append(default_char.character_style)
for attr in ans.all_properties:
setattr(ans, attr, self.run_val(parent_styles, direct_formatting, attr))
if ans.font_family is not inherit:
ans.font_family = self.fonts.family_for(ans.font_family, ans.b, ans.i)
return ans
def resolve(self, obj):
if obj.tag.endswith('}p'):
return self.resolve_paragraph(obj)
if obj.tag.endswith('}r'):
return self.resolve_run(obj)
def cascade(self, layers):
self.body_font_family = 'serif'
self.body_font_size = '10pt'
for p, runs in layers.iteritems():
char_styles = [self.resolve_run(r) for r in runs]
block_style = self.resolve_paragraph(p)
c = Counter()
for s in char_styles:
if s.font_family is not inherit:
c[s.font_family] += 1
if c:
family = c.most_common(1)[0][0]
block_style.font_family = family
for s in char_styles:
if s.font_family == family:
s.font_family = inherit
sizes = [s.font_size for s in char_styles if s.font_size is not inherit]
if sizes:
sz = block_style.font_size = sizes[0]
for s in char_styles:
if s.font_size == sz:
s.font_size = inherit
block_styles = [self.resolve_paragraph(p) for p in layers]
c = Counter()
for s in block_styles:
if s.font_family is not inherit:
c[s.font_family] += 1
if c:
self.body_font_family = family = c.most_common(1)[0][0]
for s in block_styles:
if s.font_family == family:
s.font_family = inherit
c = Counter()
for s in block_styles:
if s.font_size is not inherit:
c[s.font_size] += 1
if c:
sz = c.most_common(1)[0][0]
for s in block_styles:
if s.font_size == sz:
s.font_size = inherit
self.body_font_size = '%.3gpt' % sz
def resolve_numbering(self, numbering):
# When a numPr element appears inside a paragraph style, the lvl info
# must be discarded and pStyle used instead.
self.numbering = numbering
for style in self:
ps = style.paragraph_style
if ps is not None and ps.numbering is not inherit:
lvl = numbering.get_pstyle(ps.numbering[0], style.style_id)
if lvl is None:
ps.numbering = inherit
else:
ps.numbering = (ps.numbering[0], lvl)
def register(self, css, prefix):
h = hash(frozenset(css.iteritems()))
ans, _ = self.classes.get(h, (None, None))
if ans is None:
self.counter[prefix] += 1
ans = '%s_%d' % (prefix, self.counter[prefix])
self.classes[h] = (ans, css)
return ans
def generate_classes(self):
for bs in self.para_cache.itervalues():
css = bs.css
if css:
self.register(css, 'block')
for bs in self.run_cache.itervalues():
css = bs.css
if css:
self.register(css, 'text')
def class_name(self, css):
h = hash(frozenset(css.iteritems()))
return self.classes.get(h, (None, None))[0]
def generate_css(self, dest_dir, docx):
ef = self.fonts.embed_fonts(dest_dir, docx)
prefix = textwrap.dedent(
'''\
body { font-family: %s; font-size: %s }
p { text-indent: 1.5em }
ul, ol, p { margin: 0; padding: 0 }
sup.noteref a { text-decoration: none }
h1.notes-header { page-break-before: always }
dl.notes dt { font-size: large }
dl.notes dt a { text-decoration: none }
dl.notes dd { page-break-after: always }
''') % (self.body_font_family, self.body_font_size)
if ef:
prefix = ef + '\n' + prefix
ans = []
for (cls, css) in sorted(self.classes.itervalues(), key=lambda x:x[0]):
b = ('\t%s: %s;' % (k, v) for k, v in css.iteritems())
b = '\n'.join(b)
ans.append('.%s {\n%s\n}\n' % (cls, b.rstrip(';')))
return prefix + '\n' + '\n'.join(ans)
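
The `register`/`class_name` pair above hands out one CSS class name per distinct set of declarations. A minimal sketch of that idea, assuming Python 3 and using the `frozenset` itself as the dictionary key instead of its `hash()`; `ClassRegistry` is a hypothetical stand-in for illustration, not the calibre `Styles` class:

```python
from collections import Counter


class ClassRegistry(object):
    """Hypothetical stand-in for the class bookkeeping done by Styles."""

    def __init__(self):
        self.counter = Counter()   # per-prefix running number
        self.classes = {}          # frozenset of css items -> (name, css)

    def register(self, css, prefix):
        # Identical declarations always map to the same class name.
        key = frozenset(css.items())
        entry = self.classes.get(key)
        if entry is None:
            self.counter[prefix] += 1
            name = '%s_%d' % (prefix, self.counter[prefix])
            entry = self.classes[key] = (name, css)
        return entry[0]

    def class_name(self, css):
        # Look up a previously registered name; None if never registered.
        entry = self.classes.get(frozenset(css.items()))
        return None if entry is None else entry[0]
```

Keying on the `frozenset` rather than on its hash avoids two different rule sets sharing a name if their hashes ever collide; the trade-off is that every value in `css` must be hashable.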

View File

@ -6,15 +6,22 @@ from __future__ import (unicode_literals, division, absolute_import,
__license__ = 'GPL v3' __license__ = 'GPL v3'
__copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>' __copyright__ = '2013, Kovid Goyal <kovid at kovidgoyal.net>'
import sys, os import sys, os, re
from collections import OrderedDict, defaultdict
from lxml import html from lxml import html
from lxml.html.builder import ( from lxml.html.builder import (
HTML, HEAD, TITLE, BODY, LINK, META, P, SPAN, BR) HTML, HEAD, TITLE, BODY, LINK, META, P, SPAN, BR, DIV, SUP, A, DT, DL, DD, H1)
from calibre.ebooks.docx.container import DOCX, fromstring from calibre.ebooks.docx.container import DOCX, fromstring
from calibre.ebooks.docx.names import XPath, is_tag, barename, XML, STYLES from calibre.ebooks.docx.names import (
from calibre.ebooks.docx.styles import Styles XPath, is_tag, XML, STYLES, NUMBERING, FONTS, get, generate_anchor,
descendants, ancestor, FOOTNOTES, ENDNOTES)
from calibre.ebooks.docx.styles import Styles, inherit, PageProperties
from calibre.ebooks.docx.numbering import Numbering
from calibre.ebooks.docx.fonts import Fonts
from calibre.ebooks.docx.images import Images
from calibre.ebooks.docx.footnotes import Footnotes
from calibre.utils.localization import canonicalize_lang, lang_as_iso639_1 from calibre.utils.localization import canonicalize_lang, lang_as_iso639_1
class Text: class Text:
@ -28,13 +35,16 @@ class Text:
class Convert(object): class Convert(object):
def __init__(self, path_or_stream, dest_dir=None, log=None): def __init__(self, path_or_stream, dest_dir=None, log=None, notes_text=None):
self.docx = DOCX(path_or_stream, log=log) self.docx = DOCX(path_or_stream, log=log)
self.log = self.docx.log self.log = self.docx.log
self.notes_text = notes_text or _('Notes')
self.dest_dir = dest_dir or os.getcwdu() self.dest_dir = dest_dir or os.getcwdu()
self.mi = self.docx.metadata self.mi = self.docx.metadata
self.body = BODY() self.body = BODY()
self.styles = Styles() self.styles = Styles()
self.images = Images()
self.object_map = OrderedDict()
self.html = HTML( self.html = HTML(
HEAD( HEAD(
META(charset='utf-8'), META(charset='utf-8'),
@ -60,53 +70,264 @@ class Convert(object):
doc = self.docx.document doc = self.docx.document
relationships_by_id, relationships_by_type = self.docx.document_relationships relationships_by_id, relationships_by_type = self.docx.document_relationships
self.read_styles(relationships_by_type) self.read_styles(relationships_by_type)
for top_level in XPath('/w:document/w:body/*')(doc): self.images(relationships_by_id)
if is_tag(top_level, 'w:p'): self.layers = OrderedDict()
p = self.convert_p(top_level) self.framed = [[]]
self.body.append(p) self.framed_map = {}
elif is_tag(top_level, 'w:tbl'): self.anchor_map = {}
pass # TODO: tables self.link_map = defaultdict(list)
elif is_tag(top_level, 'w:sectPr'):
pass # TODO: Last section properties self.read_page_properties(doc)
else: for wp, page_properties in self.page_map.iteritems():
self.log.debug('Unknown top-level tag: %s, ignoring' % barename(top_level.tag)) self.current_page = page_properties
p = self.convert_p(wp)
self.body.append(p)
if self.footnotes.has_notes:
dl = DL()
dl.set('class', 'notes')
self.body.append(H1(self.notes_text))
self.body[-1].set('class', 'notes-header')
self.body.append(dl)
for anchor, text, note in self.footnotes:
dl.append(DT('[', A('' + text, href='#back_%s' % anchor, title=text), id=anchor))
dl[-1][0].tail = ']'
dl.append(DD())
for wp in note:
p = self.convert_p(wp)
dl[-1].append(p)
self.resolve_links(relationships_by_id)
# TODO: tables <w:tbl> child of <w:body> (nested tables?)
self.styles.cascade(self.layers)
numbered = []
for html_obj, obj in self.object_map.iteritems():
raw = obj.get('calibre_num_id', None)
if raw is not None:
lvl, num_id = raw.partition(':')[0::2]
try:
lvl = int(lvl)
except (TypeError, ValueError):
lvl = 0
numbered.append((html_obj, num_id, lvl))
self.numbering.apply_markup(numbered, self.body, self.styles, self.object_map)
self.apply_frames()
if len(self.body) > 0: if len(self.body) > 0:
self.body.text = '\n\t' self.body.text = '\n\t'
for child in self.body: for child in self.body:
child.tail = '\n\t' child.tail = '\n\t'
self.body[-1].tail = '\n' self.body[-1].tail = '\n'
self.styles.generate_classes()
for html_obj, obj in self.object_map.iteritems():
style = self.styles.resolve(obj)
if style is not None:
css = style.css
if css:
cls = self.styles.class_name(css)
if cls:
html_obj.set('class', cls)
for html_obj, css in self.framed_map.iteritems():
cls = self.styles.class_name(css)
if cls:
html_obj.set('class', cls)
self.write() self.write()
def read_page_properties(self, doc):
current = []
self.page_map = OrderedDict()
for p in descendants(doc, 'w:p'):
sect = tuple(descendants(p, 'w:sectPr'))
if sect:
pr = PageProperties(sect)
for x in current + [p]:
self.page_map[x] = pr
current = []
else:
current.append(p)
if current:
last = XPath('./w:body/w:sectPr')(doc)
pr = PageProperties(last)
for x in current:
self.page_map[x] = pr
def read_styles(self, relationships_by_type): def read_styles(self, relationships_by_type):
sname = relationships_by_type.get(STYLES, None)
if sname is None: def get_name(rtype, defname):
name = self.docx.document_name.split('/') name = relationships_by_type.get(rtype, None)
name[-1] = 'styles.xml' if name is None:
if self.docx.exists(name): cname = self.docx.document_name.split('/')
sname = name cname[-1] = defname
if self.docx.exists('/'.join(cname)):
name = '/'.join(cname)
return name
nname = get_name(NUMBERING, 'numbering.xml')
sname = get_name(STYLES, 'styles.xml')
fname = get_name(FONTS, 'fontTable.xml')
foname = get_name(FOOTNOTES, 'footnotes.xml')
enname = get_name(ENDNOTES, 'endnotes.xml')
numbering = self.numbering = Numbering()
footnotes = self.footnotes = Footnotes()
fonts = self.fonts = Fonts()
foraw = enraw = None
if foname is not None:
try:
foraw = self.docx.read(foname)
except KeyError:
self.log.warn('Footnotes %s do not exist' % foname)
if enname is not None:
try:
enraw = self.docx.read(enname)
except KeyError:
self.log.warn('Endnotes %s do not exist' % enname)
footnotes(fromstring(foraw) if foraw else None, fromstring(enraw) if enraw else None)
if fname is not None:
embed_relationships = self.docx.get_relationships(fname)[0]
try:
raw = self.docx.read(fname)
except KeyError:
self.log.warn('Fonts table %s does not exist' % fname)
else:
fonts(fromstring(raw), embed_relationships, self.docx, self.dest_dir)
if sname is not None: if sname is not None:
try: try:
raw = self.docx.read(sname) raw = self.docx.read(sname)
except KeyError: except KeyError:
self.log.warn('Styles %s do not exist' % sname) self.log.warn('Styles %s do not exist' % sname)
else: else:
self.styles(fromstring(raw)) self.styles(fromstring(raw), fonts)
if nname is not None:
try:
raw = self.docx.read(nname)
except KeyError:
self.log.warn('Numbering styles %s do not exist' % nname)
else:
numbering(fromstring(raw), self.styles)
self.styles.resolve_numbering(numbering)
def write(self): def write(self):
raw = html.tostring(self.html, encoding='utf-8', doctype='<!DOCTYPE html>') raw = html.tostring(self.html, encoding='utf-8', doctype='<!DOCTYPE html>')
with open(os.path.join(self.dest_dir, 'index.html'), 'wb') as f: with open(os.path.join(self.dest_dir, 'index.html'), 'wb') as f:
f.write(raw) f.write(raw)
css = self.styles.generate_css(self.dest_dir, self.docx)
if css:
with open(os.path.join(self.dest_dir, 'docx.css'), 'wb') as f:
f.write(css.encode('utf-8'))
def convert_p(self, p): def convert_p(self, p):
dest = P() dest = P()
for run in XPath('descendant::w:r')(p): self.object_map[dest] = p
span = self.convert_run(run) style = self.styles.resolve_paragraph(p)
dest.append(span) self.layers[p] = []
self.add_frame(dest, style.frame)
current_anchor = None
current_hyperlink = None
for x in descendants(p, 'w:r', 'w:bookmarkStart', 'w:hyperlink'):
if x.tag.endswith('}r'):
span = self.convert_run(x)
if current_anchor is not None:
(dest if len(dest) == 0 else span).set('id', current_anchor)
current_anchor = None
if current_hyperlink is not None:
hl = ancestor(x, 'w:hyperlink')
if hl is not None:
self.link_map[hl].append(span)
else:
current_hyperlink = None
dest.append(span)
self.layers[p].append(x)
elif x.tag.endswith('}bookmarkStart'):
anchor = get(x, 'w:name')
if anchor and anchor not in self.anchor_map:
self.anchor_map[anchor] = current_anchor = generate_anchor(anchor, frozenset(self.anchor_map.itervalues()))
elif x.tag.endswith('}hyperlink'):
current_hyperlink = x
m = re.match(r'heading\s+(\d+)$', style.style_name or '', re.IGNORECASE)
if m is not None:
n = min(6, max(1, int(m.group(1))))
dest.tag = 'h%d' % n
if style.direction == 'rtl':
dest.set('dir', 'rtl')
border_runs = []
common_borders = []
for span in dest:
run = self.object_map[span]
style = self.styles.resolve_run(run)
if not border_runs or border_runs[-1][1].same_border(style):
border_runs.append((span, style))
elif border_runs:
if len(border_runs) > 1:
common_borders.append(border_runs)
border_runs = []
for border_run in common_borders:
spans = []
bs = {}
for span, style in border_run:
style.get_border_css(bs)
style.clear_border_css()
spans.append(span)
if bs:
cls = self.styles.register(bs, 'text_border')
wrapper = self.wrap_elems(spans, SPAN())
wrapper.set('class', cls)
return dest return dest
def wrap_elems(self, elems, wrapper):
p = elems[0].getparent()
idx = p.index(elems[0])
p.insert(idx, wrapper)
wrapper.tail = elems[-1].tail
elems[-1].tail = None
for elem in elems:
p.remove(elem)
wrapper.append(elem)
return wrapper
def resolve_links(self, relationships_by_id):
for hyperlink, spans in self.link_map.iteritems():
span = spans[0]
if len(spans) > 1:
span = self.wrap_elems(spans, SPAN())
span.tag = 'a'
tgt = get(hyperlink, 'w:tgtFrame')
if tgt:
span.set('target', tgt)
tt = get(hyperlink, 'w:tooltip')
if tt:
span.set('title', tt)
rid = get(hyperlink, 'r:id')
if rid and rid in relationships_by_id:
span.set('href', relationships_by_id[rid])
continue
anchor = get(hyperlink, 'w:anchor')
if anchor and anchor in self.anchor_map:
span.set('href', '#' + self.anchor_map[anchor])
continue
self.log.warn('Hyperlink with unknown target (%s, %s), ignoring' %
(rid, anchor))
span.set('href', '#')
def convert_run(self, run): def convert_run(self, run):
ans = SPAN() ans = SPAN()
self.object_map[ans] = run
text = Text(ans, 'text', []) text = Text(ans, 'text', [])
for child in run: for child in run:
@ -121,6 +342,7 @@ class Convert(object):
text.buf.append(child.text) text.buf.append(child.text)
elif is_tag(child, 'w:cr'): elif is_tag(child, 'w:cr'):
text.add_elem(BR()) text.add_elem(BR())
ans.append(text.elem)
elif is_tag(child, 'w:br'): elif is_tag(child, 'w:br'):
typ = child.get('type', None) typ = child.get('type', None)
if typ in {'column', 'page'}: if typ in {'column', 'page'}:
@ -132,11 +354,56 @@ class Convert(object):
else: else:
br = BR() br = BR()
text.add_elem(br) text.add_elem(br)
ans.append(text.elem)
elif is_tag(child, 'w:drawing') or is_tag(child, 'w:pict'):
for img in self.images.to_html(child, self.current_page, self.docx, self.dest_dir):
text.add_elem(img)
ans.append(text.elem)
elif is_tag(child, 'w:footnoteReference') or is_tag(child, 'w:endnoteReference'):
anchor, name = self.footnotes.get_ref(child)
if anchor and name:
l = SUP(A(name, href='#' + anchor, title=name), id='back_%s' % anchor)
l.set('class', 'noteref')
text.add_elem(l)
ans.append(text.elem)
if text.buf: if text.buf:
setattr(text.elem, text.attr, ''.join(text.buf)) setattr(text.elem, text.attr, ''.join(text.buf))
style = self.styles.resolve_run(run)
if style.vert_align in {'superscript', 'subscript'}:
ans.tag = 'sub' if style.vert_align == 'subscript' else 'sup'
if style.lang is not inherit:
ans.lang = style.lang
return ans return ans
def add_frame(self, html_obj, style):
last_run = self.framed[-1]
if style is inherit:
if last_run:
self.framed.append([])
return
if last_run:
if last_run[-1][1] == style:
last_run.append((html_obj, style))
else:
self.framed.append([(html_obj, style)])
else:
last_run.append((html_obj, style))
def apply_frames(self):
for run in filter(None, self.framed):
style = run[0][1]
paras = tuple(x[0] for x in run)
parent = paras[0].getparent()
idx = parent.index(paras[0])
frame = DIV(*paras)
parent.insert(idx, frame)
self.framed_map[frame] = css = style.css(self.page_map[self.object_map[paras[0]]])
self.styles.register(css, 'frame')
if __name__ == '__main__': if __name__ == '__main__':
from calibre.utils.logging import default_log from calibre.utils.logging import default_log
default_log.filter_level = default_log.DEBUG default_log.filter_level = default_log.DEBUG
Convert(sys.argv[-1], log=default_log)() Convert(sys.argv[-1], log=default_log)()
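
`convert_p` maps Word style names such as "Heading 2" onto HTML heading tags. A minimal standalone sketch of that mapping, assuming the intent is to clamp the level into the range h1 to h6; `heading_tag` is a hypothetical helper name, not a calibre function:

```python
import re


def heading_tag(style_name):
    """Return 'h1'..'h6' for a 'Heading N' style name, else None."""
    m = re.match(r'heading\s+(\d+)$', style_name or '', re.IGNORECASE)
    if m is None:
        return None
    # Clamp: anything below 1 becomes 1, anything above 6 becomes 6.
    n = min(6, max(1, int(m.group(1))))
    return 'h%d' % n
```

The order of `min` and `max` matters here: `max(1, ...)` raises the floor first, then `min(6, ...)` caps the result.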

View File

@ -136,7 +136,7 @@ class FB2MLizer(object):
metadata['author'] += '<last-name>%s</last-name>' % prepare_string_for_xml(author_last) metadata['author'] += '<last-name>%s</last-name>' % prepare_string_for_xml(author_last)
metadata['author'] += '</author>' metadata['author'] += '</author>'
if not metadata['author']: if not metadata['author']:
metadata['author'] = u'<author><first-name></first-name><last-name><last-name></author>' metadata['author'] = u'<author><first-name></first-name><last-name></last-name></author>'
metadata['keywords'] = u'' metadata['keywords'] = u''
tags = list(map(unicode, self.oeb_book.metadata.subject)) tags = list(map(unicode, self.oeb_book.metadata.subject))

View File

@ -21,7 +21,7 @@ from calibre.ebooks.metadata.book.base import Metadata
from calibre.utils.date import parse_date, isoformat from calibre.utils.date import parse_date, isoformat
from calibre.utils.localization import get_lang, canonicalize_lang from calibre.utils.localization import get_lang, canonicalize_lang
from calibre import prints, guess_type from calibre import prints, guess_type
from calibre.utils.cleantext import clean_ascii_chars from calibre.utils.cleantext import clean_ascii_chars, clean_xml_chars
from calibre.utils.config import tweaks from calibre.utils.config import tweaks
class Resource(object): # {{{ class Resource(object): # {{{
@ -560,7 +560,9 @@ class OPF(object): # {{{
self.package_version = 0 self.package_version = 0
self.metadata = self.metadata_path(self.root) self.metadata = self.metadata_path(self.root)
if not self.metadata: if not self.metadata:
raise ValueError('Malformed OPF file: No <metadata> element') self.metadata = [self.root.makeelement('{http://www.idpf.org/2007/opf}metadata')]
self.root.insert(0, self.metadata[0])
self.metadata[0].tail = '\n'
self.metadata = self.metadata[0] self.metadata = self.metadata[0]
if unquote_urls: if unquote_urls:
self.unquote_urls() self.unquote_urls()
@ -1434,7 +1436,10 @@ def metadata_to_opf(mi, as_string=True, default_lang=None):
attrib['name'] = name attrib['name'] = name
if content: if content:
attrib['content'] = content attrib['content'] = content
elem = metadata.makeelement(tag, attrib=attrib) try:
elem = metadata.makeelement(tag, attrib=attrib)
except ValueError:
elem = metadata.makeelement(tag, attrib={k:clean_xml_chars(v) for k, v in attrib.iteritems()})
elem.tail = '\n'+(' '*8) elem.tail = '\n'+(' '*8)
if text: if text:
try: try:

View File

@ -163,7 +163,8 @@ class MOBIFile(object):
ext = 'dat' ext = 'dat'
prefix = 'binary' prefix = 'binary'
suffix = '' suffix = ''
if sig in {b'HUFF', b'CDIC', b'INDX'}: continue if sig in {b'HUFF', b'CDIC', b'INDX'}:
continue
# TODO: Ignore CNCX records as well # TODO: Ignore CNCX records as well
if sig == b'FONT': if sig == b'FONT':
font = read_font_record(rec.raw) font = read_font_record(rec.raw)
@ -196,7 +197,6 @@ class MOBIFile(object):
vals = list(index)[:-1] + [None, None, None, None] vals = list(index)[:-1] + [None, None, None, None]
entry_map.append(Entry(*(vals[:12]))) entry_map.append(Entry(*(vals[:12])))
indexing_data = collect_indexing_data(entry_map, list(map(len, indexing_data = collect_indexing_data(entry_map, list(map(len,
self.text_records))) self.text_records)))
self.indexing_data = [DOC + '\n' +textwrap.dedent('''\ self.indexing_data = [DOC + '\n' +textwrap.dedent('''\

View File

@ -16,7 +16,8 @@ from calibre.ebooks.oeb.transforms.flatcss import KeyMapper
from calibre.utils.magick.draw import identify_data from calibre.utils.magick.draw import identify_data
MBP_NS = 'http://mobipocket.com/ns/mbp' MBP_NS = 'http://mobipocket.com/ns/mbp'
def MBP(name): return '{%s}%s' % (MBP_NS, name) def MBP(name):
return '{%s}%s' % (MBP_NS, name)
MOBI_NSMAP = {None: XHTML_NS, 'mbp': MBP_NS} MOBI_NSMAP = {None: XHTML_NS, 'mbp': MBP_NS}
@ -413,7 +414,7 @@ class MobiMLizer(object):
# img sizes in units other than px # img sizes in units other than px
# See #7520 for test case # See #7520 for test case
try: try:
pixs = int(round(float(value) / \ pixs = int(round(float(value) /
(72./self.profile.dpi))) (72./self.profile.dpi)))
except: except:
continue continue
@ -488,8 +489,6 @@ class MobiMLizer(object):
if elem.text: if elem.text:
if istate.preserve: if istate.preserve:
text = elem.text text = elem.text
elif len(elem) > 0 and isspace(elem.text):
text = None
else: else:
text = COLLAPSE.sub(' ', elem.text) text = COLLAPSE.sub(' ', elem.text)
valign = style['vertical-align'] valign = style['vertical-align']

View File

@ -181,9 +181,9 @@ class BookHeader(object):
self.codec = 'cp1252' if not user_encoding else user_encoding self.codec = 'cp1252' if not user_encoding else user_encoding
log.warn('Unknown codepage %d. Assuming %s' % (self.codepage, log.warn('Unknown codepage %d. Assuming %s' % (self.codepage,
self.codec)) self.codec))
# Some KF8 files have header length == 256 (generated by kindlegen # Some KF8 files have header length == 264 (generated by kindlegen
# 2.7?). See https://bugs.launchpad.net/bugs/1067310 # 2.9?). See https://bugs.launchpad.net/bugs/1179144
max_header_length = 0x100 max_header_length = 500 # We choose 500 for future versions of kindlegen
if (ident == 'TEXTREAD' or self.length < 0xE4 or if (ident == 'TEXTREAD' or self.length < 0xE4 or
self.length > max_header_length or self.length > max_header_length or

View File

@ -100,7 +100,7 @@ def update_flow_links(mobi8_reader, resource_map, log):
mr = mobi8_reader mr = mobi8_reader
flows = [] flows = []
img_pattern = re.compile(r'''(<[img\s|image\s][^>]*>)''', re.IGNORECASE) img_pattern = re.compile(r'''(<[img\s|image\s|svg:image\s][^>]*>)''', re.IGNORECASE)
img_index_pattern = re.compile(r'''['"]kindle:embed:([0-9|A-V]+)[^'"]*['"]''', re.IGNORECASE) img_index_pattern = re.compile(r'''['"]kindle:embed:([0-9|A-V]+)[^'"]*['"]''', re.IGNORECASE)
tag_pattern = re.compile(r'''(<[^>]*>)''') tag_pattern = re.compile(r'''(<[^>]*>)''')
@ -112,7 +112,7 @@ def update_flow_links(mobi8_reader, resource_map, log):
url_css_index_pattern = re.compile(r'''kindle:flow:([0-9|A-V]+)\?mime=text/css[^\)]*''', re.IGNORECASE) url_css_index_pattern = re.compile(r'''kindle:flow:([0-9|A-V]+)\?mime=text/css[^\)]*''', re.IGNORECASE)
for flow in mr.flows: for flow in mr.flows:
if flow is None: # 0th flow is None if flow is None: # 0th flow is None
flows.append(flow) flows.append(flow)
continue continue
@ -128,7 +128,7 @@ def update_flow_links(mobi8_reader, resource_map, log):
srcpieces = img_pattern.split(flow) srcpieces = img_pattern.split(flow)
for j in range(1, len(srcpieces), 2): for j in range(1, len(srcpieces), 2):
tag = srcpieces[j] tag = srcpieces[j]
if tag.startswith('<im'): if tag.startswith('<im') or tag.startswith('<svg:image'):
for m in img_index_pattern.finditer(tag): for m in img_index_pattern.finditer(tag):
num = int(m.group(1), 32) num = int(m.group(1), 32)
href = resource_map[num-1] href = resource_map[num-1]
@ -330,7 +330,7 @@ def expand_mobi8_markup(mobi8_reader, resource_map, log):
mobi8_reader.flows = flows mobi8_reader.flows = flows
# write out the parts and file flows # write out the parts and file flows
os.mkdir('text') # directory containing all parts os.mkdir('text') # directory containing all parts
spine = [] spine = []
for i, part in enumerate(parts): for i, part in enumerate(parts):
pi = mobi8_reader.partinfo[i] pi = mobi8_reader.partinfo[i]

View File

@ -228,7 +228,7 @@ class Mobi8Reader(object):
self.flowinfo.append(FlowInfo(None, None, None, None)) self.flowinfo.append(FlowInfo(None, None, None, None))
svg_tag_pattern = re.compile(br'''(<svg[^>]*>)''', re.IGNORECASE) svg_tag_pattern = re.compile(br'''(<svg[^>]*>)''', re.IGNORECASE)
image_tag_pattern = re.compile(br'''(<image[^>]*>)''', re.IGNORECASE) image_tag_pattern = re.compile(br'''(<(?:svg:)?image[^>]*>)''', re.IGNORECASE)
for j in xrange(1, len(self.flows)): for j in xrange(1, len(self.flows)):
flowpart = self.flows[j] flowpart = self.flows[j]
nstr = '%04d' % j nstr = '%04d' % j
@ -243,7 +243,7 @@ class Mobi8Reader(object):
dir = None dir = None
fname = None fname = None
# strip off anything before <svg if inlining # strip off anything before <svg if inlining
flowpart = flowpart[start:] flowpart = re.sub(br'(</?)svg:', r'\1', flowpart[start:])
else: else:
format = 'file' format = 'file'
dir = "images" dir = "images"
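
The `svg:` prefix stripping added to the inlined SVG flows above can be tried in isolation. A minimal sketch, assuming Python 3, where a bytes pattern needs a bytes replacement (the diff itself is Python 2); `strip_svg_prefix` is a hypothetical helper name:

```python
import re


def strip_svg_prefix(flowpart):
    """Drop the 'svg:' namespace prefix from opening and closing tags."""
    # '(</?)' captures '<' or '</' so it can be put back unchanged.
    return re.sub(br'(</?)svg:', br'\1', flowpart)
```

Only tag names are touched; prefixed attribute names, if any, are left as they are.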

View File

@ -373,7 +373,7 @@ def urlquote(href):
result.append(char) result.append(char)
return ''.join(result) return ''.join(result)
def urlunquote(href): def urlunquote(href, error_handling='strict'):
# unquote must run on a bytestring and will return a bytestring # unquote must run on a bytestring and will return a bytestring
# If it runs on a unicode object, it returns a double encoded unicode # If it runs on a unicode object, it returns a double encoded unicode
# string: unquote(u'%C3%A4') != unquote(b'%C3%A4').decode('utf-8') # string: unquote(u'%C3%A4') != unquote(b'%C3%A4').decode('utf-8')
@ -383,7 +383,10 @@ def urlunquote(href):
href = href.encode('utf-8') href = href.encode('utf-8')
href = unquote(href) href = unquote(href)
if want_unicode: if want_unicode:
href = href.decode('utf-8') # The quoted characters could have been in some encoding other than
# UTF-8, this often happens with old/broken web servers. There is no
# way to know what that encoding should be in this context.
href = href.decode('utf-8', error_handling)
return href return href
def urlnormalize(href): def urlnormalize(href):
@ -871,6 +874,7 @@ class Manifest(object):
orig_data = data orig_data = data
fname = urlunquote(self.href) fname = urlunquote(self.href)
self.oeb.log.debug('Parsing', fname, '...') self.oeb.log.debug('Parsing', fname, '...')
self.oeb.html_preprocessor.current_href = self.href
try: try:
data = parse_html(data, log=self.oeb.log, data = parse_html(data, log=self.oeb.log,
decoder=self.oeb.decode, decoder=self.oeb.decode,
@ -1312,9 +1316,9 @@ class Guide(object):
('notes', __('Notes')), ('notes', __('Notes')),
('preface', __('Preface')), ('preface', __('Preface')),
('text', __('Main Text'))] ('text', __('Main Text'))]
TYPES = set(t for t, _ in _TYPES_TITLES) TYPES = set(t for t, _ in _TYPES_TITLES) # noqa
TITLES = dict(_TYPES_TITLES) TITLES = dict(_TYPES_TITLES)
ORDER = dict((t, i) for i, (t, _) in enumerate(_TYPES_TITLES)) ORDER = dict((t, i) for i, (t, _) in enumerate(_TYPES_TITLES)) # noqa
def __init__(self, oeb, type, title, href): def __init__(self, oeb, type, title, href):
self.oeb = oeb self.oeb = oeb
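
The `error_handling` parameter added to `urlunquote` earlier in this file lets callers decide what happens when the percent-encoded bytes are not valid UTF-8. A minimal sketch of the same behaviour, assuming Python 3 and `urllib.parse.unquote_to_bytes` in place of the Python 2 `unquote` used in the diff; `urlunquote_sketch` is a hypothetical name:

```python
from urllib.parse import unquote_to_bytes


def urlunquote_sketch(href, error_handling='strict'):
    """Percent-decode href and decode the resulting bytes as UTF-8."""
    raw = unquote_to_bytes(href)
    # error_handling is passed straight to bytes.decode:
    # 'strict' raises on invalid UTF-8, 'replace' substitutes U+FFFD.
    return raw.decode('utf-8', error_handling)
```

With `'strict'` a Latin-1 encoded `%E4` raises `UnicodeDecodeError`; with `'replace'` it decodes to the replacement character instead.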

View File

@ -11,7 +11,7 @@ import re
from calibre import guess_type from calibre import guess_type
class EntityDeclarationProcessor(object): # {{{ class EntityDeclarationProcessor(object): # {{{
def __init__(self, html): def __init__(self, html):
self.declared_entities = {} self.declared_entities = {}
@ -51,7 +51,7 @@ def load_html(path, view, codec='utf-8', mime_type=None,
loading_url = QUrl.fromLocalFile(path) loading_url = QUrl.fromLocalFile(path)
pre_load_callback(loading_url) pre_load_callback(loading_url)
if force_as_html or re.search(r'<[:a-zA-Z0-9-]*svg', html) is None: if force_as_html or re.search(r'<[a-zA-Z0-9-]+:svg', html) is None:
view.setHtml(html, loading_url) view.setHtml(html, loading_url)
else: else:
view.setContent(QByteArray(html.encode(codec)), mime_type, view.setContent(QByteArray(html.encode(codec)), mime_type,
@ -61,4 +61,3 @@ def load_html(path, view, codec='utf-8', mime_type=None,
if not elem.isNull(): if not elem.isNull():
return False return False
return True return True

View File

@ -7,7 +7,7 @@ __license__ = 'GPL v3'
__copyright__ = '2012, Kovid Goyal <kovid@kovidgoyal.net>' __copyright__ = '2012, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'
import os, re import sys, os, re
from calibre.customize.ui import available_input_formats from calibre.customize.ui import available_input_formats
@ -26,17 +26,18 @@ def EbookIterator(*args, **kwargs):
from calibre.ebooks.oeb.iterator.book import EbookIterator from calibre.ebooks.oeb.iterator.book import EbookIterator
return EbookIterator(*args, **kwargs) return EbookIterator(*args, **kwargs)
def get_preprocess_html(path_to_ebook, output): def get_preprocess_html(path_to_ebook, output=None):
from calibre.ebooks.conversion.preprocess import HTMLPreProcessor from calibre.ebooks.conversion.plumber import set_regex_wizard_callback, Plumber
iterator = EbookIterator(path_to_ebook) from calibre.utils.logging import DevNull
iterator.__enter__(only_input_plugin=True, run_char_count=False, from calibre.ptempfile import TemporaryDirectory
read_anchor_map=False) raw = {}
preprocessor = HTMLPreProcessor(None, False) set_regex_wizard_callback(raw.__setitem__)
with open(output, 'wb') as out: with TemporaryDirectory('_regex_wiz') as tdir:
for path in iterator.spine: pl = Plumber(path_to_ebook, os.path.join(tdir, 'a.epub'), DevNull(), for_regex_wizard=True)
with open(path, 'rb') as f: pl.run()
html = f.read().decode('utf-8', 'replace') items = [raw[item.href] for item in pl.oeb.spine if item.href in raw]
html = preprocessor(html, get_preprocess_html=True)
with (sys.stdout if output is None else open(output, 'wb')) as out:
for html in items:
out.write(html.encode('utf-8')) out.write(html.encode('utf-8'))
out.write(b'\n\n' + b'-'*80 + b'\n\n') out.write(b'\n\n' + b'-'*80 + b'\n\n')

View File

@ -25,7 +25,7 @@ from calibre.ebooks.oeb.transforms.cover import CoverManager
from calibre.ebooks.oeb.iterator.spine import (SpineItem, create_indexing_data) from calibre.ebooks.oeb.iterator.spine import (SpineItem, create_indexing_data)
from calibre.ebooks.oeb.iterator.bookmarks import BookmarksMixin from calibre.ebooks.oeb.iterator.bookmarks import BookmarksMixin
TITLEPAGE = CoverManager.SVG_TEMPLATE.decode('utf-8').replace(\ TITLEPAGE = CoverManager.SVG_TEMPLATE.decode('utf-8').replace(
'__ar__', 'none').replace('__viewbox__', '0 0 600 800' '__ar__', 'none').replace('__viewbox__', '0 0 600 800'
).replace('__width__', '600').replace('__height__', '800') ).replace('__width__', '600').replace('__height__', '800')

View File

@ -44,8 +44,10 @@ META_XP = XPath('/h:html/h:head/h:meta[@http-equiv="Content-Type"]')
def merge_multiple_html_heads_and_bodies(root, log=None): def merge_multiple_html_heads_and_bodies(root, log=None):
heads, bodies = xpath(root, '//h:head'), xpath(root, '//h:body') heads, bodies = xpath(root, '//h:head'), xpath(root, '//h:body')
if not (len(heads) > 1 or len(bodies) > 1): return root if not (len(heads) > 1 or len(bodies) > 1):
for child in root: root.remove(child) return root
for child in root:
root.remove(child)
head = root.makeelement(XHTML('head')) head = root.makeelement(XHTML('head'))
body = root.makeelement(XHTML('body')) body = root.makeelement(XHTML('body'))
for h in heads: for h in heads:
@ -88,7 +90,7 @@ def html5_parse(data, max_nesting_depth=100):
# Check that the asinine HTML 5 algorithm did not result in a tree with # Check that the asinine HTML 5 algorithm did not result in a tree with
# insane nesting depths # insane nesting depths
for x in data.iterdescendants(): for x in data.iterdescendants():
if isinstance(x.tag, basestring) and len(x) is 0: # Leaf node if isinstance(x.tag, basestring) and len(x) is 0: # Leaf node
depth = node_depth(x) depth = node_depth(x)
if depth > max_nesting_depth: if depth > max_nesting_depth:
raise ValueError('html5lib resulted in a tree with nesting' raise ValueError('html5lib resulted in a tree with nesting'
@ -228,7 +230,7 @@ def parse_html(data, log=None, decoder=None, preprocessor=None,
if idx > -1: if idx > -1:
pre = data[:idx] pre = data[:idx]
data = data[idx:] data = data[idx:]
if '<!DOCTYPE' in pre: # Handle user defined entities if '<!DOCTYPE' in pre: # Handle user defined entities
user_entities = {} user_entities = {}
for match in re.finditer(r'<!ENTITY\s+(\S+)\s+([^>]+)', pre): for match in re.finditer(r'<!ENTITY\s+(\S+)\s+([^>]+)', pre):
val = match.group(2) val = match.group(2)
@ -368,8 +370,7 @@ def parse_html(data, log=None, decoder=None, preprocessor=None,
meta.getparent().remove(meta) meta.getparent().remove(meta)
meta = etree.SubElement(head, XHTML('meta'), meta = etree.SubElement(head, XHTML('meta'),
attrib={'http-equiv': 'Content-Type'}) attrib={'http-equiv': 'Content-Type'})
meta.set('content', 'text/html; charset=utf-8') # Ensure content is second meta.set('content', 'text/html; charset=utf-8') # Ensure content is second attribute
# attribute
# Ensure has a <body/> # Ensure has a <body/>
if not xpath(data, '/h:html/h:body'): if not xpath(data, '/h:html/h:body'):

View File

@ -9,7 +9,7 @@ __docformat__ = 'restructuredtext en'
import re import re
from urlparse import urlparse from urlparse import urlparse
from collections import deque from collections import deque, Counter
from functools import partial from functools import partial
from lxml import etree from lxml import etree
@ -29,7 +29,8 @@ class TOC(object):
def __init__(self, title=None, dest=None, frag=None): def __init__(self, title=None, dest=None, frag=None):
self.title, self.dest, self.frag = title, dest, frag self.title, self.dest, self.frag = title, dest, frag
self.dest_exists = self.dest_error = None self.dest_exists = self.dest_error = None
if self.title: self.title = self.title.strip() if self.title:
self.title = self.title.strip()
self.parent = None self.parent = None
self.children = [] self.children = []
@ -326,11 +327,13 @@ def create_ncx(toc, to_href, btitle, lang, uid):
navmap = etree.SubElement(ncx, NCX('navMap')) navmap = etree.SubElement(ncx, NCX('navMap'))
spat = re.compile(r'\s+') spat = re.compile(r'\s+')
def process_node(xml_parent, toc_parent, play_order=0): play_order = Counter()
def process_node(xml_parent, toc_parent):
for child in toc_parent: for child in toc_parent:
play_order += 1 play_order['c'] += 1
point = etree.SubElement(xml_parent, NCX('navPoint'), id=uuid_id(), point = etree.SubElement(xml_parent, NCX('navPoint'), id=uuid_id(),
playOrder=str(play_order)) playOrder=str(play_order['c']))
label = etree.SubElement(point, NCX('navLabel')) label = etree.SubElement(point, NCX('navLabel'))
title = child.title title = child.title
if title: if title:
@ -341,7 +344,7 @@ def create_ncx(toc, to_href, btitle, lang, uid):
if child.frag: if child.frag:
href += '#'+child.frag href += '#'+child.frag
etree.SubElement(point, NCX('content'), src=href) etree.SubElement(point, NCX('content'), src=href)
process_node(point, child, play_order) process_node(point, child)
process_node(navmap, toc) process_node(navmap, toc)
return ncx return ncx

View File

@ -32,7 +32,8 @@ def dynamic_rescale_factor(node):
classes = node.get('class', '').split(' ') classes = node.get('class', '').split(' ')
classes = [x.replace('calibre_rescale_', '') for x in classes if classes = [x.replace('calibre_rescale_', '') for x in classes if
x.startswith('calibre_rescale_')] x.startswith('calibre_rescale_')]
if not classes: return None if not classes:
return None
factor = 1.0 factor = 1.0
for x in classes: for x in classes:
try: try:
@ -54,7 +55,8 @@ class KeyMapper(object):
return base return base
size = float(size) size = float(size)
base = float(base) base = float(base)
if abs(size - base) < 0.1: return 0 if abs(size - base) < 0.1:
return 0
sign = -1 if size < base else 1 sign = -1 if size < base else 1
endp = 0 if size < base else 36 endp = 0 if size < base else 36
diff = (abs(base - size) * 3) + ((36 - size) / 100) diff = (abs(base - size) * 3) + ((36 - size) / 100)
@ -110,7 +112,8 @@ class EmbedFontsCSSRules(object):
self.href = None self.href = None
def __call__(self, oeb): def __call__(self, oeb):
if not self.body_font_family: return None if not self.body_font_family:
return None
if not self.href: if not self.href:
iid, href = oeb.manifest.generate(u'page_styles', u'page_styles.css') iid, href = oeb.manifest.generate(u'page_styles', u'page_styles.css')
rules = [x.cssText for x in self.rules] rules = [x.cssText for x in self.rules]
@ -228,10 +231,10 @@ class CSSFlattener(object):
bs.append('margin-top: 0pt') bs.append('margin-top: 0pt')
bs.append('margin-bottom: 0pt') bs.append('margin-bottom: 0pt')
if float(self.context.margin_left) >= 0: if float(self.context.margin_left) >= 0:
bs.append('margin-left : %gpt'%\ bs.append('margin-left : %gpt'%
float(self.context.margin_left)) float(self.context.margin_left))
if float(self.context.margin_right) >= 0: if float(self.context.margin_right) >= 0:
bs.append('margin-right : %gpt'%\ bs.append('margin-right : %gpt'%
float(self.context.margin_right)) float(self.context.margin_right))
bs.extend(['padding-left: 0pt', 'padding-right: 0pt']) bs.extend(['padding-left: 0pt', 'padding-right: 0pt'])
if self.page_break_on_body: if self.page_break_on_body:
@ -277,8 +280,10 @@ class CSSFlattener(object):
for kind in ('margin', 'padding'): for kind in ('margin', 'padding'):
for edge in ('bottom', 'top'): for edge in ('bottom', 'top'):
property = "%s-%s" % (kind, edge) property = "%s-%s" % (kind, edge)
if property not in cssdict: continue if property not in cssdict:
if '%' in cssdict[property]: continue continue
if '%' in cssdict[property]:
continue
value = style[property] value = style[property]
if value == 0: if value == 0:
continue continue
@ -296,7 +301,7 @@ class CSSFlattener(object):
def flatten_node(self, node, stylizer, names, styles, pseudo_styles, psize, item_id): def flatten_node(self, node, stylizer, names, styles, pseudo_styles, psize, item_id):
if not isinstance(node.tag, basestring) \ if not isinstance(node.tag, basestring) \
or namespace(node.tag) != XHTML_NS: or namespace(node.tag) != XHTML_NS:
return return
tag = barename(node.tag) tag = barename(node.tag)
style = stylizer.style(node) style = stylizer.style(node)
cssdict = style.cssdict() cssdict = style.cssdict()
@ -360,12 +365,17 @@ class CSSFlattener(object):
pass pass
del node.attrib['bgcolor'] del node.attrib['bgcolor']
if cssdict.get('font-weight', '').lower() == 'medium': if cssdict.get('font-weight', '').lower() == 'medium':
cssdict['font-weight'] = 'normal' # ADE chokes on font-weight medium cssdict['font-weight'] = 'normal' # ADE chokes on font-weight medium
fsize = font_size fsize = font_size
is_drop_cap = (cssdict.get('float', None) == 'left' and 'font-size' in is_drop_cap = (cssdict.get('float', None) == 'left' and 'font-size' in
cssdict and len(node) == 0 and node.text and cssdict and len(node) == 0 and node.text and
len(node.text) == 1) len(node.text) == 1)
is_drop_cap = is_drop_cap or (
# The docx input plugin generates drop caps that look like this
len(node) == 1 and not node.text and len(node[0]) == 0 and
node[0].text and not node[0].tail and len(node[0].text) == 1 and
'line-height' in cssdict and 'font-size' in cssdict)
if not self.context.disable_font_rescaling and not is_drop_cap: if not self.context.disable_font_rescaling and not is_drop_cap:
_sbase = self.sbase if self.sbase is not None else \ _sbase = self.sbase if self.sbase is not None else \
self.context.source.fbase self.context.source.fbase
@ -436,8 +446,7 @@ class CSSFlattener(object):
keep_classes = set() keep_classes = set()
if cssdict: if cssdict:
items = cssdict.items() items = sorted(cssdict.items())
items.sort()
css = u';\n'.join(u'%s: %s' % (key, val) for key, val in items) css = u';\n'.join(u'%s: %s' % (key, val) for key, val in items)
classes = node.get('class', '').strip() or 'calibre' classes = node.get('class', '').strip() or 'calibre'
klass = ascii_text(STRIPNUM.sub('', classes.split()[0].replace('_', ''))) klass = ascii_text(STRIPNUM.sub('', classes.split()[0].replace('_', '')))
@ -519,8 +528,7 @@ class CSSFlattener(object):
if float(self.context.margin_bottom) >= 0: if float(self.context.margin_bottom) >= 0:
stylizer.page_rule['margin-bottom'] = '%gpt'%\ stylizer.page_rule['margin-bottom'] = '%gpt'%\
float(self.context.margin_bottom) float(self.context.margin_bottom)
items = stylizer.page_rule.items() items = sorted(stylizer.page_rule.items())
items.sort()
css = ';\n'.join("%s: %s" % (key, val) for key, val in items) css = ';\n'.join("%s: %s" % (key, val) for key, val in items)
css = ('@page {\n%s\n}\n'%css) if items else '' css = ('@page {\n%s\n}\n'%css) if items else ''
rules = [r.cssText for r in stylizer.font_face_rules + rules = [r.cssText for r in stylizer.font_face_rules +
@ -556,14 +564,14 @@ class CSSFlattener(object):
body = html.find(XHTML('body')) body = html.find(XHTML('body'))
fsize = self.context.dest.fbase fsize = self.context.dest.fbase
self.flatten_node(body, stylizer, names, styles, pseudo_styles, fsize, item.id) self.flatten_node(body, stylizer, names, styles, pseudo_styles, fsize, item.id)
items = [(key, val) for (val, key) in styles.items()] items = sorted([(key, val) for (val, key) in styles.items()])
items.sort()
# :hover must come after link and :active must come after :hover # :hover must come after link and :active must come after :hover
psels = sorted(pseudo_styles.iterkeys(), key=lambda x : psels = sorted(pseudo_styles.iterkeys(), key=lambda x :
{'hover':1, 'active':2}.get(x, 0)) {'hover':1, 'active':2}.get(x, 0))
for psel in psels: for psel in psels:
styles = pseudo_styles[psel] styles = pseudo_styles[psel]
if not styles: continue if not styles:
continue
x = sorted(((k+':'+psel, v) for v, k in styles.iteritems())) x = sorted(((k+':'+psel, v) for v, k in styles.iteritems()))
items.extend(x) items.extend(x)

View File

@ -113,7 +113,7 @@ class Split(object):
for i, elem in enumerate(item.data.iter()): for i, elem in enumerate(item.data.iter()):
try: try:
elem.set('pb_order', str(i)) elem.set('pb_order', str(i))
except TypeError: # Can't set attributes on comment nodes etc. except TypeError: # Can't set attributes on comment nodes etc.
continue continue
page_breaks = list(page_breaks) page_breaks = list(page_breaks)
@ -159,7 +159,11 @@ class Split(object):
except ValueError: except ValueError:
# Unparseable URL # Unparseable URL
return url return url
href = urlnormalize(href) try:
href = urlnormalize(href)
except ValueError:
# href has non utf-8 quoting
return url
if href in self.map: if href in self.map:
anchor_map = self.map[href] anchor_map = self.map[href]
nhref = anchor_map[frag if frag else None] nhref = anchor_map[frag if frag else None]
@ -171,7 +175,6 @@ class Split(object):
return url return url
class FlowSplitter(object): class FlowSplitter(object):
'The actual splitting logic' 'The actual splitting logic'
@ -313,7 +316,6 @@ class FlowSplitter(object):
split_point = root.xpath(path)[0] split_point = root.xpath(path)[0]
split_point2 = root2.xpath(path)[0] split_point2 = root2.xpath(path)[0]
def nix_element(elem, top=True): def nix_element(elem, top=True):
# Remove elem unless top is False in which case replace elem by its # Remove elem unless top is False in which case replace elem by its
# children # children
@ -373,6 +375,8 @@ class FlowSplitter(object):
for img in root.xpath('//h:img', namespaces=NAMESPACES): for img in root.xpath('//h:img', namespaces=NAMESPACES):
if img.get('style', '') != 'display:none': if img.get('style', '') != 'display:none':
return False return False
if root.xpath('//*[local-name() = "svg"]'):
return False
return True return True
def split_text(self, text, root, size): def split_text(self, text, root, size):
@ -393,7 +397,6 @@ class FlowSplitter(object):
buf = part buf = part
return ans return ans
def split_to_size(self, tree): def split_to_size(self, tree):
self.log.debug('\t\tSplitting...') self.log.debug('\t\tSplitting...')
root = tree.getroot() root = tree.getroot()
@ -440,7 +443,7 @@ class FlowSplitter(object):
len(self.split_trees), size/1024.)) len(self.split_trees), size/1024.))
else: else:
self.log.debug( self.log.debug(
'\t\t\tSplit tree still too large: %d KB' % \ '\t\t\tSplit tree still too large: %d KB' %
(size/1024.)) (size/1024.))
self.split_to_size(t) self.split_to_size(t)
@ -546,7 +549,6 @@ class FlowSplitter(object):
for x in toc: for x in toc:
fix_toc_entry(x) fix_toc_entry(x)
if self.oeb.toc: if self.oeb.toc:
fix_toc_entry(self.oeb.toc) fix_toc_entry(self.oeb.toc)

View File

@ -45,11 +45,15 @@ class Links(object):
href, page, rect = link href, page, rect = link
p, frag = href.partition('#')[0::2] p, frag = href.partition('#')[0::2]
try: try:
link = ((path, p, frag or None), self.pdf.get_pageref(page).obj, Array(rect)) pref = self.pdf.get_pageref(page).obj
except IndexError: except IndexError:
self.log.warn('Unable to find page for link: %r, ignoring it' % link) try:
continue pref = self.pdf.get_pageref(page-1).obj
self.links.append(link) except IndexError:
self.pdf.debug('Unable to find page for link: %r, ignoring it' % link)
continue
self.pdf.debug('The link %s points to non-existent page, moving it one page back' % href)
self.links.append(((path, p, frag or None), pref, Array(rect)))
def add_links(self): def add_links(self):
for link in self.links: for link in self.links:

View File

@ -22,7 +22,7 @@ from calibre.gui2 import (gprefs, warning_dialog, Dispatcher, error_dialog,
from calibre.library.database2 import LibraryDatabase2 from calibre.library.database2 import LibraryDatabase2
from calibre.gui2.actions import InterfaceAction from calibre.gui2.actions import InterfaceAction
class LibraryUsageStats(object): # {{{ class LibraryUsageStats(object): # {{{
def __init__(self): def __init__(self):
self.stats = {} self.stats = {}
@ -92,7 +92,7 @@ class LibraryUsageStats(object): # {{{
self.write_stats() self.write_stats()
# }}} # }}}
class MovedDialog(QDialog): # {{{ class MovedDialog(QDialog): # {{{
def __init__(self, stats, location, parent=None): def __init__(self, stats, location, parent=None):
QDialog.__init__(self, parent) QDialog.__init__(self, parent)
@ -161,13 +161,15 @@ class ChooseLibraryAction(InterfaceAction):
def genesis(self): def genesis(self):
self.base_text = _('%d books') self.base_text = _('%d books')
self.count_changed(0) self.count_changed(0)
self.qaction.triggered.connect(self.choose_library,
type=Qt.QueuedConnection)
self.action_choose = self.menuless_qaction self.action_choose = self.menuless_qaction
self.stats = LibraryUsageStats() self.stats = LibraryUsageStats()
self.popup_type = (QToolButton.InstantPopup if len(self.stats.stats) > 1 else self.popup_type = (QToolButton.InstantPopup if len(self.stats.stats) > 1 else
QToolButton.MenuButtonPopup) QToolButton.MenuButtonPopup)
if len(self.stats.stats) > 1:
self.action_choose.triggered.connect(self.choose_library)
else:
self.qaction.triggered.connect(self.choose_library)
self.choose_menu = self.qaction.menu() self.choose_menu = self.qaction.menu()
@ -200,7 +202,6 @@ class ChooseLibraryAction(InterfaceAction):
type=Qt.QueuedConnection) type=Qt.QueuedConnection)
self.choose_menu.addAction(ac) self.choose_menu.addAction(ac)
self.rename_separator = self.choose_menu.addSeparator() self.rename_separator = self.choose_menu.addSeparator()
self.maintenance_menu = QMenu(_('Library Maintenance')) self.maintenance_menu = QMenu(_('Library Maintenance'))
@ -477,19 +478,20 @@ class ChooseLibraryAction(InterfaceAction):
else: else:
return return
#from calibre.utils.mem import memory # from calibre.utils.mem import memory
#import weakref # import weakref
#from PyQt4.Qt import QTimer # from PyQt4.Qt import QTimer
#self.dbref = weakref.ref(self.gui.library_view.model().db) # self.dbref = weakref.ref(self.gui.library_view.model().db)
#self.before_mem = memory()/1024**2 # self.before_mem = memory()/1024**2
self.gui.library_moved(loc, allow_rebuild=True) self.gui.library_moved(loc, allow_rebuild=True)
#QTimer.singleShot(5000, self.debug_leak) # QTimer.singleShot(5000, self.debug_leak)
def debug_leak(self): def debug_leak(self):
import gc import gc
from calibre.utils.mem import memory from calibre.utils.mem import memory
ref = self.dbref ref = self.dbref
for i in xrange(3): gc.collect() for i in xrange(3):
gc.collect()
if ref() is not None: if ref() is not None:
print 'DB object alive:', ref() print 'DB object alive:', ref()
for r in gc.get_referrers(ref())[:10]: for r in gc.get_referrers(ref())[:10]:
@ -500,7 +502,6 @@ class ChooseLibraryAction(InterfaceAction):
print print
self.dbref = self.before_mem = None self.dbref = self.before_mem = None
def qs_requested(self, idx, *args): def qs_requested(self, idx, *args):
self.switch_requested(self.qs_locations[idx]) self.switch_requested(self.qs_locations[idx])
@ -546,3 +547,4 @@ class ChooseLibraryAction(InterfaceAction):
return False return False
return True return True

View File

@ -38,6 +38,13 @@ class ShowQuickviewAction(InterfaceAction):
Quickview(self.gui, self.gui.library_view, index) Quickview(self.gui, self.gui.library_view, index)
self.current_instance.show() self.current_instance.show()
def change_quickview_column(self, idx):
self.show_quickview()
if self.current_instance:
if self.current_instance.is_closed:
return
self.current_instance.change_quickview_column.emit(idx)
def library_changed(self, db): def library_changed(self, db):
if self.current_instance and not self.current_instance.is_closed: if self.current_instance and not self.current_instance.is_closed:
self.current_instance.set_database(db) self.current_instance.set_database(db)

View File

@ -7,10 +7,10 @@ __docformat__ = 'restructuredtext en'
from functools import partial from functools import partial
from PyQt4.Qt import QComboBox, QLabel, QSpinBox, QDoubleSpinBox, QDateTimeEdit, \ from PyQt4.Qt import (QComboBox, QLabel, QSpinBox, QDoubleSpinBox, QDateTimeEdit,
QDateTime, QGroupBox, QVBoxLayout, QSizePolicy, QGridLayout, \ QDateTime, QGroupBox, QVBoxLayout, QSizePolicy, QGridLayout,
QSpacerItem, QIcon, QCheckBox, QWidget, QHBoxLayout, SIGNAL, \ QSpacerItem, QIcon, QCheckBox, QWidget, QHBoxLayout, SIGNAL,
QPushButton, QMessageBox, QToolButton QPushButton, QMessageBox, QToolButton, Qt)
from calibre.utils.date import qt_to_dt, now from calibre.utils.date import qt_to_dt, now
from calibre.gui2.complete2 import EditWithComplete from calibre.gui2.complete2 import EditWithComplete
@ -39,7 +39,6 @@ class Base(object):
def gui_val(self): def gui_val(self):
return self.getter() return self.getter()
def commit(self, book_id, notify=False): def commit(self, book_id, notify=False):
val = self.gui_val val = self.gui_val
val = self.normalize_ui_val(val) val = self.normalize_ui_val(val)
@ -159,6 +158,17 @@ class DateTimeEdit(QDateTimeEdit):
def set_to_clear(self): def set_to_clear(self):
self.setDateTime(UNDEFINED_QDATETIME) self.setDateTime(UNDEFINED_QDATETIME)
def keyPressEvent(self, ev):
if ev.key() == Qt.Key_Minus:
ev.accept()
self.setDateTime(self.minimumDateTime())
elif ev.key() == Qt.Key_Equal:
ev.accept()
self.setDateTime(QDateTime.currentDateTime())
else:
return QDateTimeEdit.keyPressEvent(self, ev)
class DateTime(Base): class DateTime(Base):
def setup_ui(self, parent): def setup_ui(self, parent):
@ -211,7 +221,7 @@ class Comments(Base):
self._layout = QVBoxLayout() self._layout = QVBoxLayout()
self._tb = CommentsEditor(self._box) self._tb = CommentsEditor(self._box)
self._tb.setSizePolicy(QSizePolicy.Expanding, QSizePolicy.Minimum) self._tb.setSizePolicy(QSizePolicy.Expanding, QSizePolicy.Minimum)
#self._tb.setTabChangesFocus(True) # self._tb.setTabChangesFocus(True)
self._layout.addWidget(self._tb) self._layout.addWidget(self._tb)
self._box.setLayout(self._layout) self._box.setLayout(self._layout)
self.widgets = [self._box] self.widgets = [self._box]
@ -534,7 +544,7 @@ def populate_metadata_page(layout, db, book_id, bulk=False, two_column=False, pa
column = row = base_row = max_row = 0 column = row = base_row = max_row = 0
for key in cols: for key in cols:
if not fm[key]['is_editable']: if not fm[key]['is_editable']:
continue # this almost never happens continue # this almost never happens
dt = fm[key]['datatype'] dt = fm[key]['datatype']
if dt == 'composite' or (bulk and dt == 'comments'): if dt == 'composite' or (bulk and dt == 'comments'):
continue continue
@ -595,7 +605,6 @@ class BulkBase(Base):
self._cached_gui_val_ = self.getter() self._cached_gui_val_ = self.getter()
return self._cached_gui_val_ return self._cached_gui_val_
def get_initial_value(self, book_ids): def get_initial_value(self, book_ids):
values = set([]) values = set([])
for book_id in book_ids: for book_id in book_ids:
@ -633,7 +642,7 @@ class BulkBase(Base):
self.main_widget = main_widget_class(w) self.main_widget = main_widget_class(w)
l.addWidget(self.main_widget) l.addWidget(self.main_widget)
l.setStretchFactor(self.main_widget, 10) l.setStretchFactor(self.main_widget, 10)
self.a_c_checkbox = QCheckBox( _('Apply changes'), w) self.a_c_checkbox = QCheckBox(_('Apply changes'), w)
l.addWidget(self.a_c_checkbox) l.addWidget(self.a_c_checkbox)
self.ignore_change_signals = True self.ignore_change_signals = True
@ -1054,3 +1063,5 @@ bulk_widgets = {
'series': BulkSeries, 'series': BulkSeries,
'enumeration': BulkEnumeration, 'enumeration': BulkEnumeration,
} }

View File

@ -122,7 +122,8 @@ def device_name_for_plugboards(device_class):
class DeviceManager(Thread): # {{{ class DeviceManager(Thread): # {{{
def __init__(self, connected_slot, job_manager, open_feedback_slot, def __init__(self, connected_slot, job_manager, open_feedback_slot,
open_feedback_msg, allow_connect_slot, sleep_time=2): open_feedback_msg, allow_connect_slot,
after_callback_feedback_slot, sleep_time=2):
''' '''
:sleep_time: Time to sleep between device probes in secs :sleep_time: Time to sleep between device probes in secs
''' '''
@ -150,6 +151,7 @@ class DeviceManager(Thread): # {{{
self.ejected_devices = set([]) self.ejected_devices = set([])
self.mount_connection_requests = Queue.Queue(0) self.mount_connection_requests = Queue.Queue(0)
self.open_feedback_slot = open_feedback_slot self.open_feedback_slot = open_feedback_slot
self.after_callback_feedback_slot = after_callback_feedback_slot
self.open_feedback_msg = open_feedback_msg self.open_feedback_msg = open_feedback_msg
self._device_information = None self._device_information = None
self.current_library_uuid = None self.current_library_uuid = None
@ -392,6 +394,10 @@ class DeviceManager(Thread): # {{{
self.device.set_progress_reporter(job.report_progress) self.device.set_progress_reporter(job.report_progress)
self.current_job.run() self.current_job.run()
self.current_job = None self.current_job = None
feedback = getattr(self.device, 'user_feedback_after_callback', None)
if feedback is not None:
self.device.user_feedback_after_callback = None
self.after_callback_feedback_slot(feedback)
else: else:
break break
if do_sleep: if do_sleep:
@ -850,7 +856,7 @@ class DeviceMixin(object): # {{{
self.device_manager = DeviceManager(FunctionDispatcher(self.device_detected), self.device_manager = DeviceManager(FunctionDispatcher(self.device_detected),
self.job_manager, Dispatcher(self.status_bar.show_message), self.job_manager, Dispatcher(self.status_bar.show_message),
Dispatcher(self.show_open_feedback), Dispatcher(self.show_open_feedback),
FunctionDispatcher(self.allow_connect)) FunctionDispatcher(self.allow_connect), Dispatcher(self.after_callback_feedback))
self.device_manager.start() self.device_manager.start()
self.device_manager.devices_initialized.wait() self.device_manager.devices_initialized.wait()
if tweaks['auto_connect_to_folder']: if tweaks['auto_connect_to_folder']:
@ -862,6 +868,10 @@ class DeviceMixin(object): # {{{
name, show_copy_button=False, name, show_copy_button=False,
override_icon=QIcon(icon)) override_icon=QIcon(icon))
def after_callback_feedback(self, feedback):
info_dialog(self, feedback['title'], feedback['msg'], det_msg=feedback['det_msg']).show()
def debug_detection(self, done): def debug_detection(self, done):
self.debug_detection_callback = weakref.ref(done) self.debug_detection_callback = weakref.ref(done)
self.device_manager.debug_detection(FunctionDispatcher(self.debug_detection_done)) self.device_manager.debug_detection(FunctionDispatcher(self.debug_detection_done))
@ -1116,7 +1126,7 @@ class DeviceMixin(object): # {{{
return return
dm = self.iactions['Remove Books'].delete_memory dm = self.iactions['Remove Books'].delete_memory
if dm.has_key(job): if job in dm:
paths, model = dm.pop(job) paths, model = dm.pop(job)
self.device_manager.remove_books_from_metadata(paths, self.device_manager.remove_books_from_metadata(paths,
self.booklists()) self.booklists())
@ -1141,7 +1151,7 @@ class DeviceMixin(object): # {{{
def dispatch_sync_event(self, dest, delete, specific): def dispatch_sync_event(self, dest, delete, specific):
rows = self.library_view.selectionModel().selectedRows() rows = self.library_view.selectionModel().selectedRows()
if not rows or len(rows) == 0: if not rows or len(rows) == 0:
error_dialog(self, _('No books'), _('No books')+' '+\ error_dialog(self, _('No books'), _('No books')+' '+
_('selected to send')).exec_() _('selected to send')).exec_()
return return
@ -1160,7 +1170,7 @@ class DeviceMixin(object): # {{{
if fmts: if fmts:
for f in fmts.split(','): for f in fmts.split(','):
f = f.lower() f = f.lower()
if format_count.has_key(f): if f in format_count:
format_count[f] += 1 format_count[f] += 1
else: else:
format_count[f] = 1 format_count[f] = 1

View File

@ -28,7 +28,10 @@ class ConfigWidget(QWidget, Ui_ConfigWidget):
all_formats = set(all_formats) all_formats = set(all_formats)
self.calibre_known_formats = device.FORMATS self.calibre_known_formats = device.FORMATS
self.device_name = device.get_gui_name() try:
self.device_name = device.get_gui_name()
except TypeError:
self.device_name = getattr(device, 'gui_name', None) or _('Device')
if device.USER_CAN_ADD_NEW_FORMATS: if device.USER_CAN_ADD_NEW_FORMATS:
all_formats = set(all_formats) | set(BOOK_EXTENSIONS) all_formats = set(all_formats) | set(BOOK_EXTENSIONS)

View File

@ -6,7 +6,7 @@ __docformat__ = 'restructuredtext en'
from PyQt4.Qt import (Qt, QDialog, QAbstractItemView, QTableWidgetItem, from PyQt4.Qt import (Qt, QDialog, QAbstractItemView, QTableWidgetItem,
QListWidgetItem, QByteArray, QCoreApplication, QListWidgetItem, QByteArray, QCoreApplication,
QApplication) QApplication, pyqtSignal)
from calibre.customize.ui import find_plugin from calibre.customize.ui import find_plugin
from calibre.gui2 import gprefs from calibre.gui2 import gprefs
@ -44,6 +44,8 @@ class TableItem(QTableWidgetItem):
class Quickview(QDialog, Ui_Quickview): class Quickview(QDialog, Ui_Quickview):
change_quickview_column = pyqtSignal(object)
def __init__(self, gui, view, row): def __init__(self, gui, view, row):
QDialog.__init__(self, gui, flags=Qt.Window) QDialog.__init__(self, gui, flags=Qt.Window)
Ui_Quickview.__init__(self) Ui_Quickview.__init__(self)
@ -105,6 +107,7 @@ class Quickview(QDialog, Ui_Quickview):
self.refresh(row) self.refresh(row)
self.view.clicked.connect(self.slave) self.view.clicked.connect(self.slave)
self.change_quickview_column.connect(self.slave)
QCoreApplication.instance().aboutToQuit.connect(self.save_state) QCoreApplication.instance().aboutToQuit.connect(self.save_state)
self.search_button.clicked.connect(self.do_search) self.search_button.clicked.connect(self.do_search)
view.model().new_bookdisplay_data.connect(self.book_was_changed) view.model().new_bookdisplay_data.connect(self.book_was_changed)
@ -146,6 +149,9 @@ class Quickview(QDialog, Ui_Quickview):
key = self.view.model().column_map[self.current_column] key = self.view.model().column_map[self.current_column]
book_id = self.view.model().id(bv_row) book_id = self.view.model().id(bv_row)
if self.current_book_id == book_id and self.current_key == key:
return
# Only show items for categories # Only show items for categories
if not self.db.field_metadata[key]['is_category']: if not self.db.field_metadata[key]['is_category']:
if self.current_key is None: if self.current_key is None:
@ -164,6 +170,8 @@ class Quickview(QDialog, Ui_Quickview):
if vals: if vals:
self.no_valid_items = False self.no_valid_items = False
if self.db.field_metadata[key]['datatype'] == 'rating':
vals = unicode(vals/2)
if not isinstance(vals, list): if not isinstance(vals, list):
vals = [vals] vals = [vals]
vals.sort(key=sort_key) vals.sort(key=sort_key)
@ -198,8 +206,7 @@ class Quickview(QDialog, Ui_Quickview):
sv = selected_item sv = selected_item
sv = sv.replace('"', r'\"') sv = sv.replace('"', r'\"')
self.last_search = self.current_key+':"=' + sv + '"' self.last_search = self.current_key+':"=' + sv + '"'
books = self.db.search_getting_ids(self.last_search, books = self.db.search(self.last_search, return_matches=True)
self.db.data.search_restriction)
self.books_table.setRowCount(len(books)) self.books_table.setRowCount(len(books))
self.books_label.setText(_('Books with selected item "{0}": {1}'). self.books_label.setText(_('Books with selected item "{0}": {1}').

View File

@ -3,17 +3,21 @@ __copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'
__license__ = 'GPL v3' __license__ = 'GPL v3'
import json import json, os, traceback
from PyQt4.Qt import (Qt, QDialog, QDialogButtonBox, QSyntaxHighlighter, QFont, from PyQt4.Qt import (Qt, QDialog, QDialogButtonBox, QSyntaxHighlighter, QFont,
QRegExp, QApplication, QTextCharFormat, QColor, QCursor) QRegExp, QApplication, QTextCharFormat, QColor, QCursor,
QIcon, QSize)
from calibre.gui2 import error_dialog from calibre import sanitize_file_name_unicode
from calibre.constants import config_dir
from calibre.gui2.dialogs.template_dialog_ui import Ui_TemplateDialog from calibre.gui2.dialogs.template_dialog_ui import Ui_TemplateDialog
from calibre.utils.formatter_functions import formatter_functions from calibre.utils.formatter_functions import formatter_functions
from calibre.utils.icu import sort_key
from calibre.ebooks.metadata.book.base import Metadata from calibre.ebooks.metadata.book.base import Metadata
from calibre.ebooks.metadata.book.formatter import SafeFormat from calibre.ebooks.metadata.book.formatter import SafeFormat
from calibre.library.coloring import (displayable_columns) from calibre.library.coloring import (displayable_columns, color_row_key)
from calibre.gui2 import error_dialog, choose_files, pixmap_to_data
class ParenPosition: class ParenPosition:
@ -198,25 +202,56 @@ class TemplateHighlighter(QSyntaxHighlighter):
class TemplateDialog(QDialog, Ui_TemplateDialog): class TemplateDialog(QDialog, Ui_TemplateDialog):
def __init__(self, parent, text, mi=None, fm=None, color_field=None): def __init__(self, parent, text, mi=None, fm=None, color_field=None,
icon_field_key=None, icon_rule_kind=None):
QDialog.__init__(self, parent) QDialog.__init__(self, parent)
Ui_TemplateDialog.__init__(self) Ui_TemplateDialog.__init__(self)
self.setupUi(self) self.setupUi(self)
self.coloring = color_field is not None self.coloring = color_field is not None
self.iconing = icon_field_key is not None
cols = []
if fm is not None:
for key in sorted(displayable_columns(fm),
key=lambda(k): sort_key(fm[k]['name']) if k != color_row_key else 0):
if key == color_row_key and not self.coloring:
continue
from calibre.gui2.preferences.coloring import all_columns_string
name = all_columns_string if key == color_row_key else fm[key]['name']
if name:
cols.append((name, key))
self.color_layout.setVisible(False)
self.icon_layout.setVisible(False)
if self.coloring: if self.coloring:
cols = sorted([k for k in displayable_columns(fm)]) self.color_layout.setVisible(True)
self.colored_field.addItems(cols) for n1, k1 in cols:
self.colored_field.setCurrentIndex(self.colored_field.findText(color_field)) self.colored_field.addItem(n1, k1)
self.colored_field.setCurrentIndex(self.colored_field.findData(color_field))
colors = QColor.colorNames() colors = QColor.colorNames()
colors.sort() colors.sort()
self.color_name.addItems(colors) self.color_name.addItems(colors)
else: elif self.iconing:
self.colored_field.setVisible(False) self.icon_layout.setVisible(True)
self.colored_field_label.setVisible(False) for n1, k1 in cols:
self.color_chooser_label.setVisible(False) self.icon_field.addItem(n1, k1)
self.color_name.setVisible(False) self.icon_file_names = []
self.color_copy_button.setVisible(False) d = os.path.join(config_dir, 'cc_icons')
if os.path.exists(d):
for icon_file in os.listdir(d):
icon_file = icu_lower(icon_file)
if os.path.exists(os.path.join(d, icon_file)):
if icon_file.endswith('.png'):
self.icon_file_names.append(icon_file)
self.icon_file_names.sort(key=sort_key)
self.update_filename_box()
self.icon_with_text.setChecked(True)
if icon_rule_kind == 'icon_only':
self.icon_without_text.setChecked(True)
self.icon_field.setCurrentIndex(self.icon_field.findData(icon_field_key))
if mi: if mi:
self.mi = mi self.mi = mi
else: else:
@ -248,6 +283,8 @@ class TemplateDialog(QDialog, Ui_TemplateDialog):
self.buttonBox.button(QDialogButtonBox.Ok).setText(_('&OK')) self.buttonBox.button(QDialogButtonBox.Ok).setText(_('&OK'))
self.buttonBox.button(QDialogButtonBox.Cancel).setText(_('&Cancel')) self.buttonBox.button(QDialogButtonBox.Cancel).setText(_('&Cancel'))
self.color_copy_button.clicked.connect(self.color_to_clipboard) self.color_copy_button.clicked.connect(self.color_to_clipboard)
self.filename_button.clicked.connect(self.filename_button_clicked)
self.icon_copy_button.clicked.connect(self.icon_to_clipboard)
try: try:
with open(P('template-functions.json'), 'rb') as f: with open(P('template-functions.json'), 'rb') as f:
@ -276,11 +313,55 @@ class TemplateDialog(QDialog, Ui_TemplateDialog):
'<a href="http://manual.calibre-ebook.com/template_ref.html">' '<a href="http://manual.calibre-ebook.com/template_ref.html">'
'%s</a>'%tt) '%s</a>'%tt)
def filename_button_clicked(self):
try:
path = choose_files(self, 'choose_category_icon',
_('Select Icon'), filters=[
('Images', ['png', 'gif', 'jpg', 'jpeg'])],
all_files=False, select_only_single_file=True)
if path:
icon_path = path[0]
icon_name = sanitize_file_name_unicode(
os.path.splitext(
os.path.basename(icon_path))[0]+'.png')
if icon_name not in self.icon_file_names:
self.icon_file_names.append(icon_name)
self.update_filename_box()
try:
p = QIcon(icon_path).pixmap(QSize(128, 128))
d = os.path.join(config_dir, 'cc_icons')
if not os.path.exists(os.path.join(d, icon_name)):
if not os.path.exists(d):
os.makedirs(d)
with open(os.path.join(d, icon_name), 'wb') as f:
f.write(pixmap_to_data(p, format='PNG'))
except:
traceback.print_exc()
self.icon_files.setCurrentIndex(self.icon_files.findText(icon_name))
self.icon_files.adjustSize()
except:
traceback.print_exc()
return
def update_filename_box(self):
self.icon_files.clear()
self.icon_file_names.sort(key=sort_key)
self.icon_files.addItem('')
self.icon_files.addItems(self.icon_file_names)
for i,filename in enumerate(self.icon_file_names):
icon = QIcon(os.path.join(config_dir, 'cc_icons', filename))
self.icon_files.setItemIcon(i+1, icon)
def color_to_clipboard(self): def color_to_clipboard(self):
app = QApplication.instance() app = QApplication.instance()
c = app.clipboard() c = app.clipboard()
c.setText(unicode(self.color_name.currentText())) c.setText(unicode(self.color_name.currentText()))
def icon_to_clipboard(self):
app = QApplication.instance()
c = app.clipboard()
c.setText(unicode(self.icon_files.currentText()))
def textbox_changed(self): def textbox_changed(self):
cur_text = unicode(self.textbox.toPlainText()) cur_text = unicode(self.textbox.toPlainText())
if self.last_text != cur_text: if self.last_text != cur_text:
@ -324,5 +405,14 @@ class TemplateDialog(QDialog, Ui_TemplateDialog):
_('The template box cannot be empty'), show=True) _('The template box cannot be empty'), show=True)
return return
self.rule = (unicode(self.colored_field.currentText()), txt) self.rule = (unicode(self.colored_field.itemData(
self.colored_field.currentIndex()).toString()), txt)
elif self.iconing:
rt = 'icon' if self.icon_with_text.isChecked() else 'icon_only'
self.rule = (rt,
unicode(self.icon_field.itemData(
self.icon_field.currentIndex()).toString()),
txt)
else:
self.rule = ('', txt)
QDialog.accept(self) QDialog.accept(self)

@ -21,47 +21,139 @@
</property> </property>
<layout class="QVBoxLayout" name="verticalLayout"> <layout class="QVBoxLayout" name="verticalLayout">
<item> <item>
<layout class="QGridLayout"> <widget class="QWidget" name="color_layout">
<item row="0" column="0"> <layout class="QGridLayout">
<widget class="QLabel" name="colored_field_label"> <item row="0" column="0">
<property name="text"> <widget class="QLabel" name="colored_field_label">
<string>Set the color of the column:</string> <property name="text">
</property> <string>Set the color of the column:</string>
<property name="buddy"> </property>
<cstring>colored_field</cstring> <property name="buddy">
</property> <cstring>colored_field</cstring>
</widget> </property>
</item> </widget>
<item row="0" column="1"> </item>
<widget class="QComboBox" name="colored_field"> <item row="0" column="1">
</widget> <widget class="QComboBox" name="colored_field">
</item> </widget>
<item row="1" column="0"> </item>
<widget class="QLabel" name="color_chooser_label"> <item row="1" column="0">
<property name="text"> <widget class="QLabel" name="color_chooser_label">
<string>Copy a color name to the clipboard:</string> <property name="text">
</property> <string>Copy a color name to the clipboard:</string>
<property name="buddy"> </property>
<cstring>color_name</cstring> <property name="buddy">
</property> <cstring>color_name</cstring>
</widget> </property>
</item> </widget>
<item row="1" column="1"> </item>
<widget class="QComboBox" name="color_name"> <item row="1" column="1">
</widget> <widget class="QComboBox" name="color_name">
</item> </widget>
<item row="1" column="2"> </item>
<widget class="QToolButton" name="color_copy_button"> <item row="1" column="2">
<property name="icon"> <widget class="QToolButton" name="color_copy_button">
<iconset resource="../../../../resources/images.qrc"> <property name="icon">
<normaloff>:/images/edit-copy.png</normaloff>:/images/edit-copy.png</iconset> <iconset resource="../../../../resources/images.qrc">
</property> <normaloff>:/images/edit-copy.png</normaloff>:/images/edit-copy.png</iconset>
<property name="toolTip"> </property>
<string>Copy the selected color name to the clipboard</string> <property name="toolTip">
</property> <string>Copy the selected color name to the clipboard</string>
</widget> </property>
</item> </widget>
</layout> </item>
</layout>
</widget>
</item>
<item>
<widget class="QWidget" name="icon_layout">
<layout class="QGridLayout">
<item row="0" column="0" colspan="2">
<widget class="QGroupBox">
<property name="title">
<string>Kind</string>
</property>
<layout class="QHBoxLayout">
<item>
<widget class="QRadioButton" name="icon_without_text">
<property name="text">
<string>icon with no text</string>
</property>
</widget>
</item>
<item>
<widget class="QRadioButton" name="icon_with_text">
<property name="text">
<string>icon with text</string>
</property>
</widget>
</item>
</layout>
<property name="sizePolicy">
<sizepolicy hsizetype="Expanding" vsizetype="Fixed">
<horstretch>100</horstretch>
<verstretch>0</verstretch>
</sizepolicy>
</property>
</widget>
</item>
<item row="1" column="0">
<widget class="QLabel" name="icon_chooser_label">
<property name="text">
<string>Apply the icon to column:</string>
</property>
<property name="buddy">
<cstring>icon_field</cstring>
</property>
</widget>
</item>
<item row="1" column="1">
<widget class="QComboBox" name="icon_field">
</widget>
</item>
<item row="2" column="0">
<widget class="QLabel" name="image_chooser_label">
<property name="text">
<string>Copy an icon file name to the clipboard:</string>
</property>
<property name="buddy">
<cstring>icon_files</cstring>
</property>
</widget>
</item>
<item row="2" column="1">
<widget class="QWidget">
<layout class="QHBoxLayout">
<item>
<widget class="QComboBox" name="icon_files">
</widget>
</item>
<item>
<widget class="QToolButton" name="icon_copy_button">
<property name="icon">
<iconset resource="../../../../resources/images.qrc">
<normaloff>:/images/edit-copy.png</normaloff>:/images/edit-copy.png</iconset>
</property>
<property name="toolTip">
<string>Copy the selected icon file name to the clipboard</string>
</property>
</widget>
</item>
<item>
<widget class="QPushButton" name="filename_button">
<property name="text">
<string>Add icon</string>
</property>
<property name="toolTip">
<string>Add an icon file to the set of choices</string>
</property>
</widget>
</item>
</layout>
</widget>
</item>
</layout>
</widget>
</item> </item>
<item> <item>
<widget class="QPlainTextEdit" name="textbox"/> <widget class="QPlainTextEdit" name="textbox"/>

@ -27,7 +27,7 @@ def partial(*args, **kwargs):
_keep_refs.append(ans) _keep_refs.append(ans)
return ans return ans
class LibraryViewMixin(object): # {{{ class LibraryViewMixin(object): # {{{
def __init__(self, db): def __init__(self, db):
self.library_view.files_dropped.connect(self.iactions['Add Books'].files_dropped, type=Qt.QueuedConnection) self.library_view.files_dropped.connect(self.iactions['Add Books'].files_dropped, type=Qt.QueuedConnection)
@ -100,7 +100,7 @@ class LibraryViewMixin(object): # {{{
# }}} # }}}
class LibraryWidget(Splitter): # {{{ class LibraryWidget(Splitter): # {{{
def __init__(self, parent): def __init__(self, parent):
orientation = Qt.Vertical orientation = Qt.Vertical
@ -119,7 +119,7 @@ class LibraryWidget(Splitter): # {{{
self.addWidget(parent.library_view) self.addWidget(parent.library_view)
# }}} # }}}
class Stack(QStackedWidget): # {{{ class Stack(QStackedWidget): # {{{
def __init__(self, parent): def __init__(self, parent):
QStackedWidget.__init__(self, parent) QStackedWidget.__init__(self, parent)
@ -147,7 +147,7 @@ class Stack(QStackedWidget): # {{{
# }}} # }}}
class UpdateLabel(QLabel): # {{{ class UpdateLabel(QLabel): # {{{
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
QLabel.__init__(self, *args, **kwargs) QLabel.__init__(self, *args, **kwargs)
@ -157,22 +157,22 @@ class UpdateLabel(QLabel): # {{{
pass pass
# }}} # }}}
class StatusBar(QStatusBar): # {{{ class StatusBar(QStatusBar): # {{{
def __init__(self, parent=None): def __init__(self, parent=None):
QStatusBar.__init__(self, parent) QStatusBar.__init__(self, parent)
self.default_message = __appname__ + ' ' + _('version') + ' ' + \
self.get_version() + ' ' + _('created by Kovid Goyal')
self.device_string = '' self.device_string = ''
self.update_label = UpdateLabel('') self.update_label = UpdateLabel('')
self.total = self.current = self.selected = 0
self.addPermanentWidget(self.update_label) self.addPermanentWidget(self.update_label)
self.update_label.setVisible(False) self.update_label.setVisible(False)
self._font = QFont() self._font = QFont()
self._font.setBold(True) self._font.setBold(True)
self.setFont(self._font) self.setFont(self._font)
self.defmsg = QLabel(self.default_message) self.defmsg = QLabel('')
self.defmsg.setFont(self._font) self.defmsg.setFont(self._font)
self.addWidget(self.defmsg) self.addWidget(self.defmsg)
self.set_label()
def initialize(self, systray=None): def initialize(self, systray=None):
self.systray = systray self.systray = systray
@ -180,17 +180,39 @@ class StatusBar(QStatusBar): # {{{
def device_connected(self, devname): def device_connected(self, devname):
self.device_string = _('Connected ') + devname self.device_string = _('Connected ') + devname
self.defmsg.setText(self.default_message + ' ..::.. ' + self.set_label()
self.device_string)
def update_state(self, total, current, selected):
self.total, self.current, self.selected = total, current, selected
self.set_label()
def set_label(self):
try:
self._set_label()
except:
import traceback
traceback.print_exc()
def _set_label(self):
msg = '%s %s %s' % (__appname__, _('version'), get_version())
if self.device_string:
msg += ' ..::.. ' + self.device_string
else:
msg += _(' %(created)s %(name)s') % dict(created=_('created by'), name='Kovid Goyal')
if self.total != self.current:
base = _('%(num)d of %(total)d books') % dict(num=self.current, total=self.total)
else:
base = _('%d books') % self.total
if self.selected > 0:
base = _('%(num)s, %(sel)d selected') % dict(num=base, sel=self.selected)
self.defmsg.setText('%s [%s]' % (msg, base))
self.clearMessage() self.clearMessage()
def device_disconnected(self): def device_disconnected(self):
self.device_string = '' self.device_string = ''
self.defmsg.setText(self.default_message) self.set_label()
self.clearMessage()
def get_version(self):
return get_version()
def show_message(self, msg, timeout=0): def show_message(self, msg, timeout=0):
self.showMessage(msg, timeout) self.showMessage(msg, timeout)
@ -207,11 +229,11 @@ class StatusBar(QStatusBar): # {{{
# }}} # }}}
class LayoutMixin(object): # {{{ class LayoutMixin(object): # {{{
def __init__(self): def __init__(self):
if config['gui_layout'] == 'narrow': # narrow {{{ if config['gui_layout'] == 'narrow': # narrow {{{
self.book_details = BookDetails(False, self) self.book_details = BookDetails(False, self)
self.stack = Stack(self) self.stack = Stack(self)
self.bd_splitter = Splitter('book_details_splitter', self.bd_splitter = Splitter('book_details_splitter',
@ -224,7 +246,7 @@ class LayoutMixin(object): # {{{
self.centralwidget.layout().addWidget(self.bd_splitter) self.centralwidget.layout().addWidget(self.bd_splitter)
button_order = ('tb', 'bd', 'cb') button_order = ('tb', 'bd', 'cb')
# }}} # }}}
else: # wide {{{ else: # wide {{{
self.bd_splitter = Splitter('book_details_splitter', self.bd_splitter = Splitter('book_details_splitter',
_('Book Details'), I('book.png'), initial_side_size=200, _('Book Details'), I('book.png'), initial_side_size=200,
orientation=Qt.Horizontal, parent=self, side_index=1, orientation=Qt.Horizontal, parent=self, side_index=1,
@ -312,9 +334,15 @@ class LayoutMixin(object): # {{{
def read_layout_settings(self): def read_layout_settings(self):
# View states are restored automatically when set_database is called # View states are restored automatically when set_database is called
for x in ('cb', 'tb', 'bd'): for x in ('cb', 'tb', 'bd'):
getattr(self, x+'_splitter').restore_state() getattr(self, x+'_splitter').restore_state()
def update_status_bar(self, *args):
v = self.current_view()
selected = len(v.selectionModel().selectedRows())
total, current = v.model().counts()
self.status_bar.update_state(total, current, selected)
# }}} # }}}

@ -9,7 +9,7 @@ import sys
from PyQt4.Qt import (Qt, QApplication, QStyle, QIcon, QDoubleSpinBox, from PyQt4.Qt import (Qt, QApplication, QStyle, QIcon, QDoubleSpinBox,
QVariant, QSpinBox, QStyledItemDelegate, QComboBox, QTextDocument, QVariant, QSpinBox, QStyledItemDelegate, QComboBox, QTextDocument,
QAbstractTextDocumentLayout, QFont, QFontInfo, QDate) QAbstractTextDocumentLayout, QFont, QFontInfo, QDate, QDateTimeEdit, QDateTime)
from calibre.gui2 import UNDEFINED_QDATETIME, error_dialog, rating_font from calibre.gui2 import UNDEFINED_QDATETIME, error_dialog, rating_font
from calibre.constants import iswindows from calibre.constants import iswindows
@ -23,8 +23,28 @@ from calibre.gui2.dialogs.comments_dialog import CommentsDialog
from calibre.gui2.dialogs.template_dialog import TemplateDialog from calibre.gui2.dialogs.template_dialog import TemplateDialog
from calibre.gui2.languages import LanguagesEdit from calibre.gui2.languages import LanguagesEdit
class DateTimeEdit(QDateTimeEdit): # {{{
class RatingDelegate(QStyledItemDelegate): # {{{ def __init__(self, parent, format):
QDateTimeEdit.__init__(self, parent)
self.setFrame(False)
self.setMinimumDateTime(UNDEFINED_QDATETIME)
self.setSpecialValueText(_('Undefined'))
self.setCalendarPopup(True)
self.setDisplayFormat(format)
def keyPressEvent(self, ev):
if ev.key() == Qt.Key_Minus:
ev.accept()
self.setDateTime(self.minimumDateTime())
elif ev.key() == Qt.Key_Equal:
ev.accept()
self.setDateTime(QDateTime.currentDateTime())
else:
return QDateTimeEdit.keyPressEvent(self, ev)
# }}}
class RatingDelegate(QStyledItemDelegate): # {{{
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
QStyledItemDelegate.__init__(self, *args, **kwargs) QStyledItemDelegate.__init__(self, *args, **kwargs)
@ -60,7 +80,7 @@ class RatingDelegate(QStyledItemDelegate): # {{{
# }}} # }}}
class DateDelegate(QStyledItemDelegate): # {{{ class DateDelegate(QStyledItemDelegate): # {{{
def __init__(self, parent, tweak_name='gui_timestamp_display_format', def __init__(self, parent, tweak_name='gui_timestamp_display_format',
default_format='dd MMM yyyy'): default_format='dd MMM yyyy'):
@ -77,16 +97,11 @@ class DateDelegate(QStyledItemDelegate): # {{{
return format_date(qt_to_dt(d, as_utc=False), self.format) return format_date(qt_to_dt(d, as_utc=False), self.format)
def createEditor(self, parent, option, index): def createEditor(self, parent, option, index):
qde = QStyledItemDelegate.createEditor(self, parent, option, index) return DateTimeEdit(parent, self.format)
qde.setDisplayFormat(self.format)
qde.setMinimumDateTime(UNDEFINED_QDATETIME)
qde.setSpecialValueText(_('Undefined'))
qde.setCalendarPopup(True)
return qde
# }}} # }}}
class PubDateDelegate(QStyledItemDelegate): # {{{ class PubDateDelegate(QStyledItemDelegate): # {{{
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
QStyledItemDelegate.__init__(self, *args, **kwargs) QStyledItemDelegate.__init__(self, *args, **kwargs)
@ -101,12 +116,7 @@ class PubDateDelegate(QStyledItemDelegate): # {{{
return format_date(qt_to_dt(d, as_utc=False), self.format) return format_date(qt_to_dt(d, as_utc=False), self.format)
def createEditor(self, parent, option, index): def createEditor(self, parent, option, index):
qde = QStyledItemDelegate.createEditor(self, parent, option, index) return DateTimeEdit(parent, self.format)
qde.setDisplayFormat(self.format)
qde.setMinimumDateTime(UNDEFINED_QDATETIME)
qde.setSpecialValueText(_('Undefined'))
qde.setCalendarPopup(True)
return qde
def setEditorData(self, editor, index): def setEditorData(self, editor, index):
val = index.data(Qt.EditRole).toDate() val = index.data(Qt.EditRole).toDate()
@ -116,7 +126,7 @@ class PubDateDelegate(QStyledItemDelegate): # {{{
# }}} # }}}
class TextDelegate(QStyledItemDelegate): # {{{ class TextDelegate(QStyledItemDelegate): # {{{
def __init__(self, parent): def __init__(self, parent):
''' '''
Delegate for text data. If auto_complete_function needs to return a list Delegate for text data. If auto_complete_function needs to return a list
@ -153,7 +163,7 @@ class TextDelegate(QStyledItemDelegate): # {{{
#}}} #}}}
class CompleteDelegate(QStyledItemDelegate): # {{{ class CompleteDelegate(QStyledItemDelegate): # {{{
def __init__(self, parent, sep, items_func_name, space_before_sep=False): def __init__(self, parent, sep, items_func_name, space_before_sep=False):
QStyledItemDelegate.__init__(self, parent) QStyledItemDelegate.__init__(self, parent)
self.sep = sep self.sep = sep
@ -194,7 +204,7 @@ class CompleteDelegate(QStyledItemDelegate): # {{{
QStyledItemDelegate.setModelData(self, editor, model, index) QStyledItemDelegate.setModelData(self, editor, model, index)
# }}} # }}}
class LanguagesDelegate(QStyledItemDelegate): # {{{ class LanguagesDelegate(QStyledItemDelegate): # {{{
def createEditor(self, parent, option, index): def createEditor(self, parent, option, index):
editor = LanguagesEdit(parent=parent) editor = LanguagesEdit(parent=parent)
@ -210,7 +220,7 @@ class LanguagesDelegate(QStyledItemDelegate): # {{{
model.setData(index, QVariant(val), Qt.EditRole) model.setData(index, QVariant(val), Qt.EditRole)
# }}} # }}}
class CcDateDelegate(QStyledItemDelegate): # {{{ class CcDateDelegate(QStyledItemDelegate): # {{{
''' '''
Delegate for custom columns dates. Because this delegate stores the Delegate for custom columns dates. Because this delegate stores the
format as an instance variable, a new instance must be created for each format as an instance variable, a new instance must be created for each
@ -230,12 +240,7 @@ class CcDateDelegate(QStyledItemDelegate): # {{{
return format_date(qt_to_dt(d, as_utc=False), self.format) return format_date(qt_to_dt(d, as_utc=False), self.format)
def createEditor(self, parent, option, index): def createEditor(self, parent, option, index):
qde = QStyledItemDelegate.createEditor(self, parent, option, index) return DateTimeEdit(parent, self.format)
qde.setDisplayFormat(self.format)
qde.setMinimumDateTime(UNDEFINED_QDATETIME)
qde.setSpecialValueText(_('Undefined'))
qde.setCalendarPopup(True)
return qde
def setEditorData(self, editor, index): def setEditorData(self, editor, index):
m = index.model() m = index.model()
@ -254,7 +259,7 @@ class CcDateDelegate(QStyledItemDelegate): # {{{
# }}} # }}}
class CcTextDelegate(QStyledItemDelegate): # {{{ class CcTextDelegate(QStyledItemDelegate): # {{{
''' '''
Delegate for text data. Delegate for text data.
''' '''
@ -279,7 +284,7 @@ class CcTextDelegate(QStyledItemDelegate): # {{{
model.setData(index, QVariant(val), Qt.EditRole) model.setData(index, QVariant(val), Qt.EditRole)
# }}} # }}}
class CcNumberDelegate(QStyledItemDelegate): # {{{ class CcNumberDelegate(QStyledItemDelegate): # {{{
''' '''
Delegate for text/int/float data. Delegate for text/int/float data.
''' '''
@ -314,7 +319,7 @@ class CcNumberDelegate(QStyledItemDelegate): # {{{
# }}} # }}}
class CcEnumDelegate(QStyledItemDelegate): # {{{ class CcEnumDelegate(QStyledItemDelegate): # {{{
''' '''
Delegate for enumeration data. Delegate for enumeration data.
''' '''
@ -346,7 +351,7 @@ class CcEnumDelegate(QStyledItemDelegate): # {{{
editor.setCurrentIndex(idx) editor.setCurrentIndex(idx)
# }}} # }}}
class CcCommentsDelegate(QStyledItemDelegate): # {{{ class CcCommentsDelegate(QStyledItemDelegate): # {{{
''' '''
Delegate for comments data. Delegate for comments data.
''' '''
@ -364,7 +369,7 @@ class CcCommentsDelegate(QStyledItemDelegate): # {{{
if hasattr(QStyle, 'CE_ItemViewItem'): if hasattr(QStyle, 'CE_ItemViewItem'):
style.drawControl(QStyle.CE_ItemViewItem, option, painter) style.drawControl(QStyle.CE_ItemViewItem, option, painter)
ctx = QAbstractTextDocumentLayout.PaintContext() ctx = QAbstractTextDocumentLayout.PaintContext()
ctx.palette = option.palette #.setColor(QPalette.Text, QColor("red")); ctx.palette = option.palette # .setColor(QPalette.Text, QColor("red"));
if hasattr(QStyle, 'SE_ItemViewItemText'): if hasattr(QStyle, 'SE_ItemViewItemText'):
textRect = style.subElementRect(QStyle.SE_ItemViewItemText, option) textRect = style.subElementRect(QStyle.SE_ItemViewItemText, option)
painter.save() painter.save()
@ -387,7 +392,7 @@ class CcCommentsDelegate(QStyledItemDelegate): # {{{
model.setData(index, QVariant(editor.textbox.html), Qt.EditRole) model.setData(index, QVariant(editor.textbox.html), Qt.EditRole)
# }}} # }}}
class DelegateCB(QComboBox): # {{{ class DelegateCB(QComboBox): # {{{
def __init__(self, parent): def __init__(self, parent):
QComboBox.__init__(self, parent) QComboBox.__init__(self, parent)
@ -398,7 +403,7 @@ class DelegateCB(QComboBox): # {{{
return QComboBox.event(self, e) return QComboBox.event(self, e)
# }}} # }}}
class CcBoolDelegate(QStyledItemDelegate): # {{{ class CcBoolDelegate(QStyledItemDelegate): # {{{
def __init__(self, parent): def __init__(self, parent):
''' '''
Delegate for custom_column bool data. Delegate for custom_column bool data.
@ -431,7 +436,7 @@ class CcBoolDelegate(QStyledItemDelegate): # {{{
# }}} # }}}
class CcTemplateDelegate(QStyledItemDelegate): # {{{ class CcTemplateDelegate(QStyledItemDelegate): # {{{
def __init__(self, parent): def __init__(self, parent):
''' '''
Delegate for custom_column composite (template) data. Delegate for custom_column composite (template) data.
@ -457,7 +462,7 @@ class CcTemplateDelegate(QStyledItemDelegate): # {{{
validation_formatter.validate(val) validation_formatter.validate(val)
except Exception as err: except Exception as err:
error_dialog(self.parent(), _('Invalid template'), error_dialog(self.parent(), _('Invalid template'),
'<p>'+_('The template %s is invalid:')%val + \ '<p>'+_('The template %s is invalid:')%val +
'<br>'+str(err), show=True) '<br>'+str(err), show=True)
model.setData(index, QVariant(val), Qt.EditRole) model.setData(index, QVariant(val), Qt.EditRole)
@ -469,3 +474,4 @@ class CcTemplateDelegate(QStyledItemDelegate): # {{{
# }}} # }}}

@ -6,7 +6,7 @@ __copyright__ = '2010, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'
import functools, re, os, traceback, errno, time import functools, re, os, traceback, errno, time
from collections import defaultdict from collections import defaultdict, namedtuple
from PyQt4.Qt import (QAbstractTableModel, Qt, pyqtSignal, QIcon, QImage, from PyQt4.Qt import (QAbstractTableModel, Qt, pyqtSignal, QIcon, QImage,
QModelIndex, QVariant, QDateTime, QColor, QPixmap) QModelIndex, QVariant, QDateTime, QColor, QPixmap)
@ -29,6 +29,8 @@ from calibre.gui2.library import DEFAULT_SORT
from calibre.utils.localization import calibre_langcode_to_name from calibre.utils.localization import calibre_langcode_to_name
from calibre.library.coloring import color_row_key from calibre.library.coloring import color_row_key
Counts = namedtuple('Counts', 'total current')
def human_readable(size, precision=1): def human_readable(size, precision=1):
""" Convert a size in bytes into megabytes """ """ Convert a size in bytes into megabytes """
return ('%.'+str(precision)+'f') % ((size/(1024.*1024.)),) return ('%.'+str(precision)+'f') % ((size/(1024.*1024.)),)
@ -46,7 +48,7 @@ def default_image():
_default_image = QImage(I('default_cover.png')) _default_image = QImage(I('default_cover.png'))
return _default_image return _default_image
class ColumnColor(object): class ColumnColor(object): # {{{
def __init__(self, formatter, colors): def __init__(self, formatter, colors):
self.mi = None self.mi = None
@ -70,9 +72,9 @@ class ColumnColor(object):
return color return color
except: except:
pass pass
# }}}
class ColumnIcon(object): # {{{
class ColumnIcon(object):
def __init__(self, formatter): def __init__(self, formatter):
self.mi = None self.mi = None
@ -108,8 +110,9 @@ class ColumnIcon(object):
return icon_bitmap return icon_bitmap
except: except:
pass pass
# }}}
class BooksModel(QAbstractTableModel): # {{{ class BooksModel(QAbstractTableModel): # {{{
about_to_be_sorted = pyqtSignal(object, name='aboutToBeSorted') about_to_be_sorted = pyqtSignal(object, name='aboutToBeSorted')
sorting_done = pyqtSignal(object, name='sortingDone') sorting_done = pyqtSignal(object, name='sortingDone')
@ -150,7 +153,7 @@ class BooksModel(QAbstractTableModel): # {{{
self.default_image = default_image() self.default_image = default_image()
self.sorted_on = DEFAULT_SORT self.sorted_on = DEFAULT_SORT
self.sort_history = [self.sorted_on] self.sort_history = [self.sorted_on]
self.last_search = '' # The last search performed on this model self.last_search = '' # The last search performed on this model
self.column_map = [] self.column_map = []
self.headers = {} self.headers = {}
self.alignment_map = {} self.alignment_map = {}
@ -240,7 +243,6 @@ class BooksModel(QAbstractTableModel): # {{{
# Would like to do a join here, but the thread might be waiting to # Would like to do a join here, but the thread might be waiting to
# do something on the GUI thread. Deadlock. # do something on the GUI thread. Deadlock.
def refresh_ids(self, ids, current_row=-1): def refresh_ids(self, ids, current_row=-1):
self._clear_caches() self._clear_caches()
rows = self.db.refresh_ids(ids) rows = self.db.refresh_ids(ids)
@ -282,9 +284,16 @@ class BooksModel(QAbstractTableModel): # {{{
self._clear_caches() self._clear_caches()
self.count_changed_signal.emit(self.db.count()) self.count_changed_signal.emit(self.db.count())
def counts(self):
if self.db.data.search_restriction_applied():
total = self.db.data.get_search_restriction_book_count()
else:
total = self.db.count()
return Counts(total, self.count())
def row_indices(self, index): def row_indices(self, index):
''' Return list indices of all cells in index.row()''' ''' Return list indices of all cells in index.row()'''
return [ self.index(index.row(), c) for c in range(self.columnCount(None))] return [self.index(index.row(), c) for c in range(self.columnCount(None))]
@property @property
def by_author(self): def by_author(self):
@ -332,7 +341,7 @@ class BooksModel(QAbstractTableModel): # {{{
while True: while True:
row_ += 1 if forward else -1 row_ += 1 if forward else -1
if row_ < 0: if row_ < 0:
row_ = self.count() - 1; row_ = self.count() - 1
elif row_ >= self.count(): elif row_ >= self.count():
row_ = 0 row_ = 0
if self.id(row_) in self.ids_to_highlight_set: if self.id(row_) in self.ids_to_highlight_set:
@ -611,7 +620,7 @@ class BooksModel(QAbstractTableModel): # {{{
data = None data = None
try: try:
data = self.db.cover(row_number) data = self.db.cover(row_number)
except IndexError: # Happens if database has not yet been refreshed except IndexError: # Happens if database has not yet been refreshed
pass pass
if not data: if not data:
@ -673,7 +682,7 @@ class BooksModel(QAbstractTableModel): # {{{
return QVariant(UNDEFINED_QDATETIME) return QVariant(UNDEFINED_QDATETIME)
def bool_type(r, idx=-1): def bool_type(r, idx=-1):
return None # displayed using a decorator return None # displayed using a decorator
def bool_type_decorator(r, idx=-1, bool_cols_are_tristate=True): def bool_type_decorator(r, idx=-1, bool_cols_are_tristate=True):
val = force_to_bool(self.db.data[r][idx]) val = force_to_bool(self.db.data[r][idx])
@ -884,20 +893,24 @@ class BooksModel(QAbstractTableModel): # {{{
ans = Qt.AlignVCenter | ALIGNMENT_MAP[self.alignment_map.get(cname, ans = Qt.AlignVCenter | ALIGNMENT_MAP[self.alignment_map.get(cname,
'left')] 'left')]
return QVariant(ans) return QVariant(ans)
#elif role == Qt.ToolTipRole and index.isValid(): # elif role == Qt.ToolTipRole and index.isValid():
# if self.column_map[index.column()] in self.editable_cols: # if self.column_map[index.column()] in self.editable_cols:
# return QVariant(_("Double click to <b>edit</b> me<br><br>")) # return QVariant(_("Double click to <b>edit</b> me<br><br>"))
return NONE return NONE
def headerData(self, section, orientation, role): def headerData(self, section, orientation, role):
if orientation == Qt.Horizontal: if orientation == Qt.Horizontal:
if section >= len(self.column_map): # same problem as in data, the column_map can be wrong if section >= len(self.column_map): # same problem as in data, the column_map can be wrong
return None return None
if role == Qt.ToolTipRole: if role == Qt.ToolTipRole:
ht = self.column_map[section] ht = self.column_map[section]
if ht == 'timestamp': # change help text because users know this field as 'date' if ht == 'timestamp': # change help text because users know this field as 'date'
ht = 'date' ht = 'date'
return QVariant(_('The lookup/search name is "{0}"').format(ht)) if self.db.field_metadata[self.column_map[section]]['is_category']:
is_cat = '.\n\n' + _('Click in this column and press Q to Quickview books with the same %s') % ht
else:
is_cat = ''
return QVariant(_('The lookup/search name is "{0}"{1}').format(ht, is_cat))
if role == Qt.DisplayRole: if role == Qt.DisplayRole:
return QVariant(self.headers[self.column_map[section]]) return QVariant(self.headers[self.column_map[section]])
return NONE return NONE
@ -905,11 +918,10 @@ class BooksModel(QAbstractTableModel): # {{{
col = self.db.field_metadata['uuid']['rec_index'] col = self.db.field_metadata['uuid']['rec_index']
return QVariant(_('This book\'s UUID is "{0}"').format(self.db.data[section][col])) return QVariant(_('This book\'s UUID is "{0}"').format(self.db.data[section][col]))
if role == Qt.DisplayRole: # orientation is vertical if role == Qt.DisplayRole: # orientation is vertical
return QVariant(section+1) return QVariant(section+1)
return NONE return NONE
def flags(self, index): def flags(self, index):
flags = QAbstractTableModel.flags(self, index) flags = QAbstractTableModel.flags(self, index)
if index.isValid(): if index.isValid():
@ -969,7 +981,7 @@ class BooksModel(QAbstractTableModel): # {{{
tmpl = unicode(value.toString()).strip() tmpl = unicode(value.toString()).strip()
disp = cc['display'] disp = cc['display']
disp['composite_template'] = tmpl disp['composite_template'] = tmpl
self.db.set_custom_column_metadata(cc['colnum'], display = disp) self.db.set_custom_column_metadata(cc['colnum'], display=disp)
self.refresh(reset=True) self.refresh(reset=True)
return True return True
@ -987,7 +999,7 @@ class BooksModel(QAbstractTableModel): # {{{
return self._set_data(index, value) return self._set_data(index, value)
except (IOError, OSError) as err: except (IOError, OSError) as err:
import traceback import traceback
if getattr(err, 'errno', None) == errno.EACCES: # Permission denied if getattr(err, 'errno', None) == errno.EACCES: # Permission denied
fname = getattr(err, 'filename', None) fname = getattr(err, 'filename', None)
p = 'Locked file: %s\n\n'%fname if fname else '' p = 'Locked file: %s\n\n'%fname if fname else ''
error_dialog(get_gui(), _('Permission denied'), error_dialog(get_gui(), _('Permission denied'),
@ -1017,7 +1029,7 @@ class BooksModel(QAbstractTableModel): # {{{
return False return False
val = (int(value.toInt()[0]) if column == 'rating' else val = (int(value.toInt()[0]) if column == 'rating' else
value.toDateTime() if column in ('timestamp', 'pubdate') value.toDateTime() if column in ('timestamp', 'pubdate')
else unicode(value.toString()).strip()) else re.sub(ur'\s', u' ', unicode(value.toString()).strip()))
id = self.db.id(row) id = self.db.id(row)
books_to_refresh = set([id]) books_to_refresh = set([id])
if column == 'rating': if column == 'rating':
@ -1065,7 +1077,7 @@ class BooksModel(QAbstractTableModel): # {{{
# }}} # }}}
class OnDeviceSearch(SearchQueryParser): # {{{ class OnDeviceSearch(SearchQueryParser): # {{{
USABLE_LOCATIONS = [ USABLE_LOCATIONS = [
'all', 'all',
@ -1078,7 +1090,6 @@ class OnDeviceSearch(SearchQueryParser): # {{{
'inlibrary' 'inlibrary'
] ]
def __init__(self, model): def __init__(self, model):
SearchQueryParser.__init__(self, locations=self.USABLE_LOCATIONS) SearchQueryParser.__init__(self, locations=self.USABLE_LOCATIONS)
self.model = model self.model = model
@ -1101,7 +1112,7 @@ class OnDeviceSearch(SearchQueryParser): # {{{
elif query.startswith('~'): elif query.startswith('~'):
matchkind = REGEXP_MATCH matchkind = REGEXP_MATCH
query = query[1:] query = query[1:]
if matchkind != REGEXP_MATCH: ### leave case in regexps because it can be significant e.g. \S \W \D if matchkind != REGEXP_MATCH: # leave case in regexps because it can be significant e.g. \S \W \D
query = query.lower() query = query.lower()
if location not in self.USABLE_LOCATIONS: if location not in self.USABLE_LOCATIONS:
@ -1133,9 +1144,9 @@ class OnDeviceSearch(SearchQueryParser): # {{{
if locvalue == 'inlibrary': if locvalue == 'inlibrary':
continue # this is bool, so can't match below continue # this is bool, so can't match below
try: try:
### Can't separate authors because comma is used for name sep and author sep # Can't separate authors because comma is used for name sep and author sep
### Exact match might not get what you want. For that reason, turn author # Exact match might not get what you want. For that reason, turn author
### exactmatch searches into contains searches. # exactmatch searches into contains searches.
if locvalue == 'author' and matchkind == EQUALS_MATCH: if locvalue == 'author' and matchkind == EQUALS_MATCH:
m = CONTAINS_MATCH m = CONTAINS_MATCH
else: else:
@ -1148,13 +1159,13 @@ class OnDeviceSearch(SearchQueryParser): # {{{
if _match(query, vals, m, use_primary_find_in_search=upf): if _match(query, vals, m, use_primary_find_in_search=upf):
matches.add(index) matches.add(index)
break break
except ValueError: # Unicode errors except ValueError: # Unicode errors
traceback.print_exc() traceback.print_exc()
return matches return matches
# }}} # }}}
class DeviceDBSortKeyGen(object): # {{{ class DeviceDBSortKeyGen(object): # {{{
def __init__(self, attr, keyfunc, db): def __init__(self, attr, keyfunc, db):
self.attr = attr self.attr = attr
@ -1169,7 +1180,7 @@ class DeviceDBSortKeyGen(object): # {{{
return ans return ans
# }}} # }}}
class DeviceBooksModel(BooksModel): # {{{ class DeviceBooksModel(BooksModel): # {{{
booklist_dirtied = pyqtSignal() booklist_dirtied = pyqtSignal()
upload_collections = pyqtSignal(object) upload_collections = pyqtSignal(object)
@ -1198,6 +1209,12 @@ class DeviceBooksModel(BooksModel): # {{{
self.editable = ['title', 'authors', 'collections'] self.editable = ['title', 'authors', 'collections']
self.book_in_library = None self.book_in_library = None
def counts(self):
return Counts(len(self.db), len(self.map))
def count_changed(self, *args):
self.count_changed_signal.emit(len(self.db))
def mark_for_deletion(self, job, rows, rows_are_ids=False): def mark_for_deletion(self, job, rows, rows_are_ids=False):
db_indices = rows if rows_are_ids else self.indices(rows) db_indices = rows if rows_are_ids else self.indices(rows)
db_items = [self.db[i] for i in db_indices if -1 < i < len(self.db)] db_items = [self.db[i] for i in db_indices if -1 < i < len(self.db)]
@ -1237,11 +1254,13 @@ class DeviceBooksModel(BooksModel): # {{{
if not succeeded: if not succeeded:
indices = self.row_indices(self.index(row, 0)) indices = self.row_indices(self.index(row, 0))
self.dataChanged.emit(indices[0], indices[-1]) self.dataChanged.emit(indices[0], indices[-1])
self.count_changed()
def paths_deleted(self, paths): def paths_deleted(self, paths):
self.map = list(range(0, len(self.db))) self.map = list(range(0, len(self.db)))
self.resort(False) self.resort(False)
self.research(True) self.research(True)
self.count_changed()
def is_row_marked_for_deletion(self, row): def is_row_marked_for_deletion(self, row):
try: try:
@ -1272,9 +1291,9 @@ class DeviceBooksModel(BooksModel): # {{{
if index.isValid(): if index.isValid():
cname = self.column_map[index.column()] cname = self.column_map[index.column()]
if cname in self.editable and \ if cname in self.editable and \
(cname != 'collections' or \ (cname != 'collections' or
(callable(getattr(self.db, 'supports_collections', None)) and \ (callable(getattr(self.db, 'supports_collections', None)) and
self.db.supports_collections() and \ self.db.supports_collections() and
device_prefs['manage_device_metadata']=='manual')): device_prefs['manage_device_metadata']=='manual')):
flags |= Qt.ItemIsEditable flags |= Qt.ItemIsEditable
return flags return flags
@ -1304,6 +1323,7 @@ class DeviceBooksModel(BooksModel): # {{{
self.last_search = text self.last_search = text
if self.last_search: if self.last_search:
self.searched.emit(True) self.searched.emit(True)
self.count_changed()
def research(self, reset=True): def research(self, reset=True):
self.search(self.last_search, reset) self.search(self.last_search, reset)
@ -1373,6 +1393,7 @@ class DeviceBooksModel(BooksModel): # {{{
self.map = list(range(0, len(db))) self.map = list(range(0, len(db)))
self.research(reset=False) self.research(reset=False)
self.resort() self.resort()
self.count_changed()
def cover(self, row): def cover(self, row):
item = self.db[self.map[row]] item = self.db[self.map[row]]
@ -1432,7 +1453,7 @@ class DeviceBooksModel(BooksModel): # {{{
return data return data
def paths(self, rows): def paths(self, rows):
return [self.db[self.map[r.row()]].path for r in rows ] return [self.db[self.map[r.row()]].path for r in rows]
def paths_for_db_ids(self, db_ids, as_map=False): def paths_for_db_ids(self, db_ids, as_map=False):
res = defaultdict(list) if as_map else [] res = defaultdict(list) if as_map else []
@ -1517,7 +1538,7 @@ class DeviceBooksModel(BooksModel): # {{{
elif role == Qt.ToolTipRole and index.isValid(): elif role == Qt.ToolTipRole and index.isValid():
if self.is_row_marked_for_deletion(row): if self.is_row_marked_for_deletion(row):
return QVariant(_('Marked for deletion')) return QVariant(_('Marked for deletion'))
if cname in ['title', 'authors'] or (cname == 'collections' and \ if cname in ['title', 'authors'] or (cname == 'collections' and
self.db.supports_collections()): self.db.supports_collections()):
return QVariant(_("Double click to <b>edit</b> me<br><br>")) return QVariant(_("Double click to <b>edit</b> me<br><br>"))
elif role == Qt.DecorationRole and cname == 'inlibrary': elif role == Qt.DecorationRole and cname == 'inlibrary':
@ -1586,3 +1607,4 @@ class DeviceBooksModel(BooksModel): # {{{
# }}} # }}}

@ -10,9 +10,9 @@ from functools import partial
from future_builtins import map from future_builtins import map
from collections import OrderedDict from collections import OrderedDict
from PyQt4.Qt import (QTableView, Qt, QAbstractItemView, QMenu, pyqtSignal, from PyQt4.Qt import (QTableView, Qt, QAbstractItemView, QMenu, pyqtSignal, QFont,
QModelIndex, QIcon, QItemSelection, QMimeData, QDrag, QApplication, QModelIndex, QIcon, QItemSelection, QMimeData, QDrag, QApplication, QStyle,
QPoint, QPixmap, QUrl, QImage, QPainter, QColor, QRect) QPoint, QPixmap, QUrl, QImage, QPainter, QColor, QRect, QHeaderView, QStyleOptionHeader)
from calibre.gui2.library.delegates import (RatingDelegate, PubDateDelegate, from calibre.gui2.library.delegates import (RatingDelegate, PubDateDelegate,
TextDelegate, DateDelegate, CompleteDelegate, CcTextDelegate, TextDelegate, DateDelegate, CompleteDelegate, CcTextDelegate,
@ -25,7 +25,55 @@ from calibre.gui2.library import DEFAULT_SORT
from calibre.constants import filesystem_encoding from calibre.constants import filesystem_encoding
from calibre import force_unicode from calibre import force_unicode
class PreserveViewState(object): # {{{ class HeaderView(QHeaderView): # {{{
def __init__(self, *args):
QHeaderView.__init__(self, *args)
self.hover = -1
self.current_font = QFont(self.font())
self.current_font.setBold(True)
self.current_font.setItalic(True)
def event(self, e):
if e.type() in (e.HoverMove, e.HoverEnter):
self.hover = self.logicalIndexAt(e.pos())
elif e.type() in (e.Leave, e.HoverLeave):
self.hover = -1
return QHeaderView.event(self, e)
def paintSection(self, painter, rect, logical_index):
opt = QStyleOptionHeader()
self.initStyleOption(opt)
opt.rect = rect
opt.section = logical_index
opt.orientation = self.orientation()
opt.textAlignment = Qt.AlignHCenter | Qt.AlignVCenter
model = self.parent().model()
opt.text = model.headerData(logical_index, opt.orientation, Qt.DisplayRole).toString()
if self.isSortIndicatorShown() and self.sortIndicatorSection() == logical_index:
opt.sortIndicator = QStyleOptionHeader.SortDown if self.sortIndicatorOrder() == Qt.AscendingOrder else QStyleOptionHeader.SortUp
opt.text = opt.fontMetrics.elidedText(opt.text, Qt.ElideRight, rect.width() - 4)
if self.isEnabled():
opt.state |= QStyle.State_Enabled
if self.window().isActiveWindow():
opt.state |= QStyle.State_Active
if self.hover == logical_index:
opt.state |= QStyle.State_MouseOver
sm = self.selectionModel()
if opt.orientation == Qt.Vertical:
if sm.isRowSelected(logical_index, QModelIndex()):
opt.state |= QStyle.State_Sunken
painter.save()
if (
(opt.orientation == Qt.Horizontal and sm.currentIndex().column() == logical_index) or
(opt.orientation == Qt.Vertical and sm.currentIndex().row() == logical_index)):
painter.setFont(self.current_font)
self.style().drawControl(QStyle.CE_Header, opt, painter, self)
painter.restore()
# }}}
class PreserveViewState(object): # {{{
''' '''
Save the set of selected books at enter time. If at exit time there are no Save the set of selected books at enter time. If at exit time there are no
@ -72,13 +120,14 @@ class PreserveViewState(object): # {{{
return {x:getattr(self, x) for x in ('selected_ids', 'current_id', return {x:getattr(self, x) for x in ('selected_ids', 'current_id',
'vscroll', 'hscroll')} 'vscroll', 'hscroll')}
def fset(self, state): def fset(self, state):
for k, v in state.iteritems(): setattr(self, k, v) for k, v in state.iteritems():
setattr(self, k, v)
self.__exit__() self.__exit__()
return property(fget=fget, fset=fset) return property(fget=fget, fset=fset)
# }}} # }}}
class BooksView(QTableView): # {{{ class BooksView(QTableView): # {{{
files_dropped = pyqtSignal(object) files_dropped = pyqtSignal(object)
add_column_signal = pyqtSignal() add_column_signal = pyqtSignal()
@ -90,6 +139,7 @@ class BooksView(QTableView): # {{{
def __init__(self, parent, modelcls=BooksModel, use_edit_metadata_dialog=True): def __init__(self, parent, modelcls=BooksModel, use_edit_metadata_dialog=True):
QTableView.__init__(self, parent) QTableView.__init__(self, parent)
self.setProperty('highlight_current_item', 150)
self.row_sizing_done = False self.row_sizing_done = False
if not tweaks['horizontal_scrolling_per_column']: if not tweaks['horizontal_scrolling_per_column']:
@ -152,12 +202,16 @@ class BooksView(QTableView): # {{{
# {{{ Column Header setup # {{{ Column Header setup
self.can_add_columns = True self.can_add_columns = True
self.was_restored = False self.was_restored = False
self.column_header = self.horizontalHeader() self.column_header = HeaderView(Qt.Horizontal, self)
self.setHorizontalHeader(self.column_header)
self.column_header.setMovable(True) self.column_header.setMovable(True)
self.column_header.setClickable(True)
self.column_header.sectionMoved.connect(self.save_state) self.column_header.sectionMoved.connect(self.save_state)
self.column_header.setContextMenuPolicy(Qt.CustomContextMenu) self.column_header.setContextMenuPolicy(Qt.CustomContextMenu)
self.column_header.customContextMenuRequested.connect(self.show_column_header_context_menu) self.column_header.customContextMenuRequested.connect(self.show_column_header_context_menu)
self.column_header.sectionResized.connect(self.column_resized, Qt.QueuedConnection) self.column_header.sectionResized.connect(self.column_resized, Qt.QueuedConnection)
self.row_header = HeaderView(Qt.Vertical, self)
self.setVerticalHeader(self.row_header)
# }}} # }}}
self._model.database_changed.connect(self.database_changed) self._model.database_changed.connect(self.database_changed)
@ -197,6 +251,16 @@ class BooksView(QTableView): # {{{
elif action.startswith('align_'): elif action.startswith('align_'):
alignment = action.partition('_')[-1] alignment = action.partition('_')[-1]
self._model.change_alignment(column, alignment) self._model.change_alignment(column, alignment)
elif action == 'quickview':
from calibre.customize.ui import find_plugin
qv = find_plugin('Show Quickview')
if qv:
rows = self.selectionModel().selectedRows()
if len(rows) > 0:
current_row = rows[0].row()
current_col = self.column_map.index(column)
index = self.model().index(current_row, current_col)
qv.actual_plugin_.change_quickview_column(index)
self.save_state() self.save_state()
@ -225,7 +289,7 @@ class BooksView(QTableView): # {{{
ac.setCheckable(True) ac.setCheckable(True)
ac.setChecked(True) ac.setChecked(True)
if col not in ('ondevice', 'inlibrary') and \ if col not in ('ondevice', 'inlibrary') and \
(not self.model().is_custom_column(col) or \ (not self.model().is_custom_column(col) or
self.model().custom_columns[col]['datatype'] not in ('bool', self.model().custom_columns[col]['datatype'] not in ('bool',
)): )):
m = self.column_header_context_menu.addMenu( m = self.column_header_context_menu.addMenu(
@ -240,7 +304,14 @@ class BooksView(QTableView): # {{{
a.setCheckable(True) a.setCheckable(True)
a.setChecked(True) a.setChecked(True)
if self._model.db.field_metadata[col]['is_category']:
act = self.column_header_context_menu.addAction(_('Quickview column %s') %
name,
partial(self.column_header_context_handler, action='quickview',
column=col))
rows = self.selectionModel().selectedRows()
if len(rows) > 1:
act.setEnabled(False)
hidden_cols = [self.column_map[i] for i in hidden_cols = [self.column_map[i] for i in
range(self.column_header.count()) if range(self.column_header.count()) if
@ -260,7 +331,6 @@ class BooksView(QTableView): # {{{
partial(self.column_header_context_handler, partial(self.column_header_context_handler,
action='show', column=col)) action='show', column=col))
self.column_header_context_menu.addSeparator() self.column_header_context_menu.addSeparator()
self.column_header_context_menu.addAction( self.column_header_context_menu.addAction(
_('Shrink column if it is too wide to fit'), _('Shrink column if it is too wide to fit'),
@ -349,7 +419,7 @@ class BooksView(QTableView): # {{{
h = self.column_header h = self.column_header
cm = self.column_map cm = self.column_map
state = {} state = {}
state['hidden_columns'] = [cm[i] for i in range(h.count()) state['hidden_columns'] = [cm[i] for i in range(h.count())
if h.isSectionHidden(i) and cm[i] != 'ondevice'] if h.isSectionHidden(i) and cm[i] != 'ondevice']
state['last_modified_injected'] = True state['last_modified_injected'] = True
state['languages_injected'] = True state['languages_injected'] = True
@ -497,7 +567,6 @@ class BooksView(QTableView): # {{{
db.prefs[name] = ans db.prefs[name] = ans
return ans return ans
def restore_state(self): def restore_state(self):
old_state = self.get_old_state() old_state = self.get_old_state()
if old_state is None: if old_state is None:
@ -820,7 +889,8 @@ class BooksView(QTableView): # {{{
ids = frozenset(ids) ids = frozenset(ids)
m = self.model() m = self.model()
for row in xrange(m.rowCount(QModelIndex())): for row in xrange(m.rowCount(QModelIndex())):
if len(row_map) >= len(ids): break if len(row_map) >= len(ids):
break
c = m.id(row) c = m.id(row)
if c in ids: if c in ids:
row_map[c] = row row_map[c] = row
@ -880,7 +950,8 @@ class BooksView(QTableView): # {{{
pass pass
return None return None
def fset(self, val): def fset(self, val):
if val is None: return if val is None:
return
m = self.model() m = self.model()
for row in xrange(m.rowCount(QModelIndex())): for row in xrange(m.rowCount(QModelIndex())):
if m.id(row) == val: if m.id(row) == val:
@ -902,7 +973,8 @@ class BooksView(QTableView): # {{{
column = ci.column() column = ci.column()
for i in xrange(ci.row()+1, self.row_count()): for i in xrange(ci.row()+1, self.row_count()):
if i in selected_rows: continue if i in selected_rows:
continue
try: try:
return self.model().id(self.model().index(i, column)) return self.model().id(self.model().index(i, column))
except: except:
@ -910,7 +982,8 @@ class BooksView(QTableView): # {{{
# No unselected rows after the current row, look before # No unselected rows after the current row, look before
for i in xrange(ci.row()-1, -1, -1): for i in xrange(ci.row()-1, -1, -1):
if i in selected_rows: continue if i in selected_rows:
continue
try: try:
return self.model().id(self.model().index(i, column)) return self.model().id(self.model().index(i, column))
except: except:
@ -958,7 +1031,7 @@ class BooksView(QTableView): # {{{
# }}} # }}}
class DeviceBooksView(BooksView): # {{{ class DeviceBooksView(BooksView): # {{{
def __init__(self, parent): def __init__(self, parent):
BooksView.__init__(self, parent, DeviceBooksModel, BooksView.__init__(self, parent, DeviceBooksModel,

@ -13,7 +13,7 @@ from PyQt4.Qt import (Qt, QDateTimeEdit, pyqtSignal, QMessageBox, QIcon,
QToolButton, QWidget, QLabel, QGridLayout, QApplication, QToolButton, QWidget, QLabel, QGridLayout, QApplication,
QDoubleSpinBox, QListWidgetItem, QSize, QPixmap, QDialog, QMenu, QDoubleSpinBox, QListWidgetItem, QSize, QPixmap, QDialog, QMenu,
QPushButton, QSpinBox, QLineEdit, QSizePolicy, QDialogButtonBox, QPushButton, QSpinBox, QLineEdit, QSizePolicy, QDialogButtonBox,
QAction, QCalendarWidget, QDate) QAction, QCalendarWidget, QDate, QDateTime)
from calibre.gui2.widgets import EnLineEdit, FormatList as _FormatList, ImageView from calibre.gui2.widgets import EnLineEdit, FormatList as _FormatList, ImageView
from calibre.utils.icu import sort_key from calibre.utils.icu import sort_key
@ -45,6 +45,9 @@ def save_dialog(parent, title, msg, det_msg=''):
d.setStandardButtons(QMessageBox.Yes | QMessageBox.No | QMessageBox.Cancel) d.setStandardButtons(QMessageBox.Yes | QMessageBox.No | QMessageBox.Cancel)
return d.exec_() return d.exec_()
def clean_text(x):
return re.sub(r'\s', ' ', x.strip())
''' '''
The interface common to all widgets used to set basic metadata The interface common to all widgets used to set basic metadata
class BasicMetadataWidget(object): class BasicMetadataWidget(object):
@ -117,7 +120,7 @@ class TitleEdit(EnLineEdit):
def current_val(self): def current_val(self):
def fget(self): def fget(self):
title = unicode(self.text()).strip() title = clean_text(unicode(self.text()))
if not title: if not title:
title = self.get_default() title = self.get_default()
return title return title
@ -289,7 +292,7 @@ class AuthorsEdit(EditWithComplete):
def current_val(self): def current_val(self):
def fget(self): def fget(self):
au = unicode(self.text()).strip() au = clean_text(unicode(self.text()))
if not au: if not au:
au = self.get_default() au = self.get_default()
return string_to_authors(au) return string_to_authors(au)
@ -352,7 +355,7 @@ class AuthorSortEdit(EnLineEdit):
def current_val(self): def current_val(self):
def fget(self): def fget(self):
return unicode(self.text()).strip() return clean_text(unicode(self.text()))
def fset(self, val): def fset(self, val):
if not val: if not val:
@ -472,7 +475,7 @@ class SeriesEdit(EditWithComplete):
def current_val(self): def current_val(self):
def fget(self): def fget(self):
return unicode(self.currentText()).strip() return clean_text(unicode(self.currentText()))
def fset(self, val): def fset(self, val):
if not val: if not val:
@ -1135,7 +1138,7 @@ class TagsEdit(EditWithComplete): # {{{
@dynamic_property @dynamic_property
def current_val(self): def current_val(self):
def fget(self): def fget(self):
return [x.strip() for x in unicode(self.text()).split(',')] return [clean_text(x) for x in unicode(self.text()).split(',')]
def fset(self, val): def fset(self, val):
if not val: if not val:
val = [] val = []
@ -1237,7 +1240,7 @@ class IdentifiersEdit(QLineEdit): # {{{
def current_val(self): def current_val(self):
def fget(self): def fget(self):
raw = unicode(self.text()).strip() raw = unicode(self.text()).strip()
parts = [x.strip() for x in raw.split(',')] parts = [clean_text(x) for x in raw.split(',')]
ans = {} ans = {}
for x in parts: for x in parts:
c = x.split(':') c = x.split(':')
@ -1376,7 +1379,7 @@ class PublisherEdit(EditWithComplete): # {{{
def current_val(self): def current_val(self):
def fget(self): def fget(self):
return unicode(self.currentText()).strip() return clean_text(unicode(self.currentText()))
def fset(self, val): def fset(self, val):
if not val: if not val:
@ -1472,6 +1475,16 @@ class DateEdit(QDateTimeEdit):
o, c = self.original_val, self.current_val o, c = self.original_val, self.current_val
return o != c return o != c
def keyPressEvent(self, ev):
if ev.key() == Qt.Key_Minus:
ev.accept()
self.setDateTime(self.minimumDateTime())
elif ev.key() == Qt.Key_Equal:
ev.accept()
self.setDateTime(QDateTime.currentDateTime())
else:
return QDateTimeEdit.keyPressEvent(self, ev)
class PubdateEdit(DateEdit): class PubdateEdit(DateEdit):
LABEL = _('Publishe&d:') LABEL = _('Publishe&d:')
FMT = 'MMM yyyy' FMT = 'MMM yyyy'
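
The whitespace handling that `clean_text` introduces in this file can be sketched on its own. This is a minimal Python 3 sketch (the diffed code is Python 2), assuming only the two-line helper shown in the hunk above; the sample strings are made up for illustration. Because the pattern is `\s` without a `+`, each whitespace character maps to one space, so runs are not collapsed.

```python
import re


def clean_text(x):
    # Strip the ends, then turn every remaining whitespace character
    # (tab, newline, ...) into a plain space. Runs are not collapsed.
    return re.sub(r'\s', ' ', x.strip())


# Hypothetical inputs, not taken from the commit.
print(repr(clean_text('  War\tand\nPeace  ')))
print(repr(clean_text('a \t b')))
```

The point of the change appears to be that a pasted tab or newline no longer survives inside a title, author, series, tag, identifier or publisher value.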

@ -636,10 +636,20 @@ class RulesModel(QAbstractListModel): # {{{
def rule_to_html(self, kind, col, rule): def rule_to_html(self, kind, col, rule):
if not isinstance(rule, Rule): if not isinstance(rule, Rule):
return _(''' if kind == 'color':
<p>Advanced Rule for column <b>%(col)s</b>: return _('''
<pre>%(rule)s</pre> <p>Advanced Rule for column <b>%(col)s</b>:
''')%dict(col=col, rule=prepare_string_for_xml(rule)) <pre>%(rule)s</pre>
''')%dict(col=col, rule=prepare_string_for_xml(rule))
else:
return _('''
<p>Advanced Rule: set <b>%(typ)s</b> for column <b>%(col)s</b>:
<pre>%(rule)s</pre>
''')%dict(col=col,
typ=icon_rule_kinds[0][0]
if kind == icon_rule_kinds[0][1] else icon_rule_kinds[1][0],
rule=prepare_string_for_xml(rule))
conditions = [self.condition_to_html(c) for c in rule.conditions] conditions = [self.condition_to_html(c) for c in rule.conditions]
trans_kind = 'not found' trans_kind = 'not found'
@ -761,7 +771,7 @@ class EditRules(QWidget): # {{{
' what icon to use. Click the Add Rule button below' ' what icon to use. Click the Add Rule button below'
' to get started.<p>You can <b>change an existing rule</b> by' ' to get started.<p>You can <b>change an existing rule</b> by'
' double clicking it.')) ' double clicking it.'))
self.add_advanced_button.setVisible(False) # self.add_advanced_button.setVisible(False)
def add_rule(self): def add_rule(self):
d = RuleEditor(self.model.fm, self.pref_name) d = RuleEditor(self.model.fm, self.pref_name)
@ -774,13 +784,23 @@ class EditRules(QWidget): # {{{
self.changed.emit() self.changed.emit()
def add_advanced(self): def add_advanced(self):
td = TemplateDialog(self, '', mi=self.mi, fm=self.fm, color_field='') if self.pref_name == 'column_color_rules':
if td.exec_() == td.Accepted: td = TemplateDialog(self, '', mi=self.mi, fm=self.fm, color_field='')
col, r = td.rule if td.exec_() == td.Accepted:
if r and col: col, r = td.rule
idx = self.model.add_rule('color', col, r) if r and col:
self.rules_view.scrollTo(idx) idx = self.model.add_rule('color', col, r)
self.changed.emit() self.rules_view.scrollTo(idx)
self.changed.emit()
else:
td = TemplateDialog(self, '', mi=self.mi, fm=self.fm, icon_field_key='')
if td.exec_() == td.Accepted:
print(td.rule)
typ, col, r = td.rule
if typ and r and col:
idx = self.model.add_rule(typ, col, r)
self.rules_view.scrollTo(idx)
self.changed.emit()
def edit_rule(self, index): def edit_rule(self, index):
try: try:
@ -790,8 +810,12 @@ class EditRules(QWidget): # {{{
if isinstance(rule, Rule): if isinstance(rule, Rule):
d = RuleEditor(self.model.fm, self.pref_name) d = RuleEditor(self.model.fm, self.pref_name)
d.apply_rule(kind, col, rule) d.apply_rule(kind, col, rule)
else: elif self.pref_name == 'column_color_rules':
d = TemplateDialog(self, rule, mi=self.mi, fm=self.fm, color_field=col) d = TemplateDialog(self, rule, mi=self.mi, fm=self.fm, color_field=col)
else:
d = TemplateDialog(self, rule, mi=self.mi, fm=self.fm, icon_field_key=col,
icon_rule_kind=kind)
if d.exec_() == d.Accepted: if d.exec_() == d.Accepted:
if len(d.rule) == 2: # Convert template dialog rules to a triple if len(d.rule) == 2: # Convert template dialog rules to a triple
d.rule = ('color', d.rule[0], d.rule[1]) d.rule = ('color', d.rule[0], d.rule[1])

@ -172,7 +172,10 @@ class Tweaks(QAbstractListModel, SearchQueryParser): # {{{
doc.append(line[1:].strip()) doc.append(line[1:].strip())
doc = '\n'.join(doc) doc = '\n'.join(doc)
while True: while True:
line = lines[pos] try:
line = lines[pos]
except IndexError:
break
if not line.strip(): if not line.strip():
break break
spidx1 = line.find(' ') spidx1 = line.find(' ')
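
The `try`/`except IndexError` guard added above stops the scan when `pos` runs past the last line instead of raising. A minimal sketch of that pattern follows, assuming a hypothetical helper `read_block` and made-up input lines (neither is part of calibre):

```python
def read_block(lines, pos):
    """Collect lines from pos until a blank line or the end of the list."""
    block = []
    while True:
        try:
            line = lines[pos]
        except IndexError:
            # Ran off the end of the file: treat it like a blank line.
            break
        if not line.strip():
            break
        block.append(line)
        pos += 1
    return block, pos


# A block that ends at end-of-input, with no trailing blank line.
print(read_block(['a = 1', 'b = 2'], 0))
```

Without the guard, a tweaks file whose last setting is not followed by a blank line would raise `IndexError` in a loop of this shape.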

@ -146,8 +146,12 @@ class CreateVirtualLibrary(QDialog): # {{{
<p>For example you can use a Virtual Library to only show you books with the Tag <i>"Unread"</i> <p>For example you can use a Virtual Library to only show you books with the Tag <i>"Unread"</i>
or only books by <i>"My Favorite Author"</i> or only books in a particular series.</p> or only books by <i>"My Favorite Author"</i> or only books in a particular series.</p>
<p>More information and examples are available in the
<a href="http://manual.calibre-ebook.com/virtual_libraries.html">User Manual</a>.</p>
''')) '''))
hl.setWordWrap(True) hl.setWordWrap(True)
hl.setOpenExternalLinks(True)
hl.setFrameStyle(hl.StyledPanel) hl.setFrameStyle(hl.StyledPanel)
gl.addWidget(hl, 0, 3, 4, 1) gl.addWidget(hl, 0, 3, 4, 1)

@ -1,7 +1,7 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import, print_function) from __future__ import (unicode_literals, division, absolute_import, print_function)
store_version = 1 # Needed for dynamic plugin loading store_version = 2 # Needed for dynamic plugin loading
__license__ = 'GPL 3' __license__ = 'GPL 3'
__copyright__ = '2011, John Schember <john@nachtimwald.com>' __copyright__ = '2011, John Schember <john@nachtimwald.com>'
@ -54,14 +54,13 @@ class FoylesUKStore(BasicStoreConfig, StorePlugin):
id_ = ''.join(data.xpath('.//p[@class="doc-cover"]/a/@href')).strip() id_ = ''.join(data.xpath('.//p[@class="doc-cover"]/a/@href')).strip()
if not id_: if not id_:
continue continue
id_ = 'http://ebooks.foyles.co.uk' + id_
cover_url = ''.join(data.xpath('.//p[@class="doc-cover"]/a/img/@src')) cover_url = ''.join(data.xpath('.//p[@class="doc-cover"]/a/img/@src'))
title = ''.join(data.xpath('.//span[@class="title"]/a/text()')) title = ''.join(data.xpath('.//span[@class="title"]/a/text()'))
author = ', '.join(data.xpath('.//span[@class="author"]/span[@class="author"]/text()')) author = ', '.join(data.xpath('.//span[@class="author"]/span[@class="author"]/text()'))
price = ''.join(data.xpath('.//span[@itemprop="price"]/text()')) price = ''.join(data.xpath('.//span[@itemprop="price"]/text()')).strip()
format_ = ''.join(data.xpath('.//p[@class="doc-meta-format"]/span[last()]/text()')) format_ = ''.join(data.xpath('.//p[@class="doc-meta-format"]/span[last()]/text()'))
format_, ign, drm = format_.partition(' ')
drm = SearchResult.DRM_LOCKED if 'DRM' in drm else SearchResult.DRM_UNLOCKED
counter -= 1 counter -= 1
@ -71,7 +70,7 @@ class FoylesUKStore(BasicStoreConfig, StorePlugin):
s.author = author.strip() s.author = author.strip()
s.price = price s.price = price
s.detail_item = id_ s.detail_item = id_
s.drm = drm s.drm = SearchResult.DRM_LOCKED
s.formats = format_ s.formats = format_
yield s yield s

@ -1,7 +1,7 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
from __future__ import (division, absolute_import, print_function) from __future__ import (division, absolute_import, print_function)
store_version = 2 # Needed for dynamic plugin loading store_version = 3 # Needed for dynamic plugin loading
__license__ = 'GPL 3' __license__ = 'GPL 3'
__copyright__ = '2013, Tomasz Długosz <tomek3d@gmail.com>' __copyright__ = '2013, Tomasz Długosz <tomek3d@gmail.com>'
@ -25,21 +25,20 @@ from calibre.gui2.store.web_store_dialog import WebStoreDialog
class KoobeStore(BasicStoreConfig, StorePlugin): class KoobeStore(BasicStoreConfig, StorePlugin):
def open(self, parent=None, detail_item=None, external=False): def open(self, parent=None, detail_item=None, external=False):
#aff_root = 'https://www.a4b-tracking.com/pl/stat-click-text-link/15/58/' aff_root = 'https://www.a4b-tracking.com/pl/stat-click-text-link/15/58/'
url = 'http://www.koobe.pl/' url = 'http://www.koobe.pl/'
#aff_url = aff_root + str(b64encode(url)) aff_url = aff_root + str(b64encode(url))
detail_url = None detail_url = None
if detail_item: if detail_item:
detail_url = detail_item #aff_root + str(b64encode(detail_item)) detail_url = aff_root + str(b64encode(detail_item))
if external or self.config.get('open_external', False): if external or self.config.get('open_external', False):
#open_url(QUrl(url_slash_cleaner(detail_url if detail_url else aff_url))) open_url(QUrl(url_slash_cleaner(detail_url if detail_url else aff_url)))
open_url(QUrl(url_slash_cleaner(detail_url if detail_url else url)))
else: else:
#d = WebStoreDialog(self.gui, url, parent, detail_url if detail_url else aff_url) d = WebStoreDialog(self.gui, url, parent, detail_url if detail_url else aff_url)
d = WebStoreDialog(self.gui, url, parent, detail_url if detail_url else url)
d.setWindowTitle(self.name) d.setWindowTitle(self.name)
d.set_tags(self.config.get('tags', '')) d.set_tags(self.config.get('tags', ''))
d.exec_() d.exec_()
@ -64,7 +63,7 @@ class KoobeStore(BasicStoreConfig, StorePlugin):
cover_url = ''.join(data.xpath('.//div[@class="cover"]/a/img/@src')) cover_url = ''.join(data.xpath('.//div[@class="cover"]/a/img/@src'))
price = ''.join(data.xpath('.//span[@class="current_price"]/text()')) price = ''.join(data.xpath('.//span[@class="current_price"]/text()'))
title = ''.join(data.xpath('.//h2[@class="title"]/a/text()')) title = ''.join(data.xpath('.//h2[@class="title"]/a/text()'))
author = ''.join(data.xpath('.//h3[@class="book_author"]/a/text()')) author = ', '.join(data.xpath('.//h3[@class="book_author"]/a/text()'))
formats = ', '.join(data.xpath('.//div[@class="formats"]/div/div/@title')) formats = ', '.join(data.xpath('.//div[@class="formats"]/div/div/@title'))
counter -= 1 counter -= 1
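
The re-enabled affiliate links in `open()` are built by appending the Base64-encoded target URL to `aff_root`. A rough Python 3 sketch of that construction follows; `affiliate_url` is a hypothetical helper name, and the explicit `encode`/`decode` calls are an adaptation, since the original Python 2 code calls `str(b64encode(url))` directly.

```python
from base64 import b64decode, b64encode

# Affiliate prefix as it appears in the diff.
AFF_ROOT = 'https://www.a4b-tracking.com/pl/stat-click-text-link/15/58/'


def affiliate_url(target, aff_root=AFF_ROOT):
    # Hypothetical helper: Base64-encode the target URL and append it.
    token = b64encode(target.encode('utf-8')).decode('ascii')
    return aff_root + token


link = affiliate_url('http://www.koobe.pl/')
# Decoding the suffix recovers the original store URL.
print(b64decode(link[len(AFF_ROOT):]).decode('utf-8'))
```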
