[Merge] Merge from lp:calibre trunk, revision 6524

Li Fanxi 2010-10-09 23:12:53 +08:00
commit 0d5782e3bf
225 changed files with 91152 additions and 51005 deletions


@@ -4,6 +4,243 @@
# for important features/bug fixes.
# Also, each release can have new and improved recipes.
- version: 0.7.23
date: 2010-10-08
new features:
- title: "Drag and drop to Tag Browser. You can use this to conveniently add tags, set series/publisher etc for a group of books"
- title: "Allow switching of library even when a device is connected"
- title: "Support for the PD Novel running Kobo"
- title: "Move check library integrity from preferences to drop down menu accessed by clicking arrow next to calibre icon"
- title: "Nicer, non-blocking update available notification"
- title: "E-book viewer: If you choose to remeber last used window size, the state of the Table of Contents view is also remembered"
tickets: [7082]
- title: "Allow moving as well as copying of books to another library"
- title: "Apple devices: Add support for plugboards"
- title: "Allow DJVU to be sent to the DR1000"
bug fixes:
- title: "Searching: Fix search expression parser to allow use of escaped double quotes in the search expression"
- title: "When saving cover images don't re-encode the image data unless absolutely neccessary. This prevents information loss due to JPEG re-compression"
- title: "Fix regression that broke setting of metadata for some MOBI/AZW/PRC files"
- title: "Fix regression in last release that could cause download of metadata for multiple files to only download the metadata for a few of them"
tickets: [7071]
- title: "MOBI Output: More tweaking of the margin handling to yield results closer to the input document."
- title: "Device drivers: Fix regression that could cause geenration of invalid metadata.calibre cache files"
- title: "Fix saving to disk with ISBN in filename"
tickets: [7090]
- title: "Fix another regression in the ISBNdb.com metadata download plugin"
- title: "Fix dragging to not interfere with multi-selection. Also dont allow drag and drop from the library to itself"
- title: "CHM input: handle another class of broken CHM files"
tickets: [7058]
new recipes:
- title: "Communications of the Association for Computing Machinery"
author: jonmisurda
- title: "Anand Tech"
author: "Oliver Niesner"
- title: "gsp.ro"
author: "bucsie"
- title: "Il Fatto Quotidiano"
author: "egilh"
- title: "Serbian Literature blog and Rusia Hoy"
author: "Darko Miletic"
- title: "Medscape"
author: "Tony Stegall"
improved recipes:
- The Age
- Australian
- Wiki news
- Times Online
- New Yorker
- Guardian
- Sueddeutsche
- HNA
- Revista Muy Interesante
- version: 0.7.22
date: 2010-10-03
new features:
- title: "Drag and drop books from your calibre library"
type: major
description: >
"You can now drag and drop books from your calibre library. You can drag them to the desktop or to a file explorer, to copy them to your computer. You can drag them to the
device icon in calibre to send them to the device. You can also drag and drop books from the device view in calibre to the calibre library icon or the operating
system to copy them from the device."
- title: "There were many minor bug fixes for various bugs caused by the major changes in 0.7.21. So if you have updated to 0.7.21, it is highly recommended you update to 0.7.22"
- title: "Driver for the VelocityMicro ebook reader device"
- title: "Add a tweak to control how articles in titles are processed during sorting"
- title: "Add a new format type 'device_db' to plugboards to control the metadata displayed in book lists on SONY devices."
bug fixes:
- title: "Fix ISBN not being read from filenames in 0.7.21"
tickets: [7054]
- title: "Fix instant Search for text not found causes unhandled exception when conversion jobs are running"
tickets: [7043]
- title: "Fix removing a publisher causes an error in 0.7.21"
tickets: [7046]
- title: "MOBI Output: Fix some images being distorted in 0.7.21"
tickets: [7049]
- title: "Fix regression that broke bulk conversion of books without covers in 0.7.21"
- title: "Fix regression that broke add and set_metadata commands in calibredb in 0.7.21"
- title: "Workaround for Qt bug in file open dialogs in linux that causes multiple file selection to ignore files with two or more spaces in the file name"
- title: "Conversion pipeline: Fix regression in 0.7.21 that broke conversion of LIT/EPUB documents that specified no title in their OPF files"
- title: "Fix regression that broke iPad driver in 0.7.21"
improved recipes:
- Washington Post
- version: 0.7.21
date: 2010-10-01
new features:
- title: "Automatic backup of the calibre metadata database"
type: major
description: >
"calibre now automatically backups up the metadata for each book in the library into an individual OPF file in that books' folder. This means that if the calibre metadata database is corrupted, for example by a hard disk failure, you can reconstruct it from these OPF files, without losing any metadata. For the moment, only the backup is implemented, restore will be implemented in the future. The backup happens automatically in the background while calibre is running. The first time you start calibre, all the books will need to be backed up, so you may notice calibre running a little slower than usual."
- title: "Virtual columns"
type: major
description: >
"You can now add virtual columns to the calibre book list. These are built fro other columns using templates and can be used to, for example, create columns to show the books isbn and avaialbale formats. You can do this by right clicking on a column header and select 'Add your own columns'"
- title: "calibre templates now much more powerful"
type: major
description: >
"The templates used in calibre in send to device and save to disk have now beome much ore powerful. They can use conditinal text and functions to transforms the replacement text. Also they now have access t metadata in user defined columns. For details see the tutorials section of the User Manual."
- title: "Metadata plugboards: Allow you to perform sophisticated transformations on the metadata of a book when exporting it from the calibre library."
type: major
description: >
"For example, you can add the series informtion to the title when sendig books to a device. This functionality is accessed from Preferences->Import/Export->Metadata plugboards"
- title: "User defined columns are now fully integrated into calibre"
type: major
description: >
"User defined columns can nw be used everywhere. In the content server, Search and Replace, to create ondevice collections, and in the save to disk and send to device templates for creating filenames. In addition, user defined metadata is saved to an read back from EPUB/OPF files."
- title: "Driver for the jetBook Mini"
- title: "Add tweaks to control which custom columns the content server displays."
- title: "Bulk downloading of metadata/covers now shows progress and can be canceled"
- title: "New plugin to download covers from douban.com. It is disabled by default and must be enabled via Preferences->Advanced->Plugins->Cover download plugins"
- title: "Add option to change titles to title case in the Bulk metadata edit dialog"
- title: "Add option to bulk metadata edit dialog to force series renumbering to start with a specified value"
bug fixes:
- title: "Fix various bugs that could lead to stale files being left in the calbre library when editing title/author metadata on windows"
- title: "Fix various regression in the preprocess and de-hyphenation code that broke conversion of some files, especially PDF ones."
- title: "Alex driver: Fix books not being placed in sub directories. Send covers. And allow sending of FB2"
tickets: [6956]
- title: "MOBI Output: Fix bug that could caused left margins in the MOBI file to have twice the size of the left margins in the input document, when viewed on the pathetic Kindle MOBI renderer"
- title: "MOBI Input: Interpret blockquotes as having a left margin of 2em not 1em to reflect recent Amazon practice"
- title: "MOBI Output: Remove transparencies from images. Pathetic Kindle MOBI renderer strikes again"
- title: "Revert removal of inline toc from news downloaded in MOBI format as this makes it unusable with the pathetic Kindle For PC application"
- title: "Content server: Remove special characters from filenames in download links to accomodate broken browsers like the one in the Kindle"
- title: "Conversion pipeline: When rescaling images, dont replace gif image data with jpeg data"
- title: "EPUB Input: Ignore OPF files in the EPUB whose names start with a period"
- title: "RTF Output: Handle a larger set of broken images in the input document"
tickets: [7003]
- title: "epub-fix: Handle dates before 1900"
tickets: [7002]
- title: "Welcome wizard: Prevent the user from choosing a non empty folder as her calibre library"
- title: "Automatically enable the Douban metadata download plugins if the user choose chinese as the interface language in the welcome wizard"
- title: "Linux DBUS notifier: Fix causing freezes on some DBUS implementations"
tickets: [6969]
- title: "Workaround for windows limitation when reading from network sockets. Should fix issues with large files in calibre libraries on network shares."
tickets: [3248]
new recipes:
- title: "BBC Sport"
author: "limawhiskey"
- title: "Revista Muy Interesante "
author: "Jefferson Frantz"
- title: "El Universo - Ecuador and Frederik Pohl's Blog"
author: "Darko Miletic"
- title: "Science News"
author: "Starson17"
- title: "Various Belgian news sources"
author: "Lionel Bergeret"
- title: "Oriental Daily"
author: "Larry Chan"
- title: "Rmf24 - Opinie"
author: "Tomasz Dlugosz"
- title: "Jerusalem Post - French and Howto Geek"
author: "Tony Stegall"
improved recipes:
- Peter Schiff
- Telegraph UK
- AJC
- Boortz
- Scientific American
- version: 0.7.20
date: 2010-09-24
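
To illustrate the 0.7.21 template changes described above (the snippet is an illustrative assumption based on the template syntax documented in the calibre User Manual, not text from this commit): a save or send-to-device template can now emit conditional text, e.g.

    {series:|| - }{title}

which produces "SeriesName - Title" when a book has a series and just "Title" otherwise; template functions such as {title:uppercase()} can transform the replacement text.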

imgsrc/console.svg (new file, 4339 lines)

File diff suppressed because it is too large (rendered size: 113 KiB)

Another image's diff suppressed because it is too large (rendered size grew from 14 KiB to 53 KiB)

imgsrc/plugboard.svg (new file, 300 lines)

@@ -0,0 +1,300 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
width="128.00171"
height="128.00175"
id="svg2"
sodipodi:version="0.32"
inkscape:version="0.48.0 r9654"
sodipodi:docname="plugboard.svg"
inkscape:output_extension="org.inkscape.output.svg.inkscape"
inkscape:export-filename="C:\Dokumente und Einstellungen\Appel\Desktop\PlugboardIcon\plugboard2.png"
inkscape:export-xdpi="72.0466"
inkscape:export-ydpi="72.0466"
version="1.1">
<defs
id="defs4">
<linearGradient
id="linearGradient3176">
<stop
style="stop-color:#3a78be;stop-opacity:1;"
offset="0"
id="stop3178" />
<stop
style="stop-color:#6f9afa;stop-opacity:1;"
offset="1"
id="stop3180" />
</linearGradient>
<linearGradient
id="linearGradient3168">
<stop
style="stop-color:#3a78be;stop-opacity:1;"
offset="0"
id="stop3170" />
<stop
style="stop-color:#6f9afa;stop-opacity:1;"
offset="1"
id="stop3172" />
</linearGradient>
<linearGradient
inkscape:collect="always"
xlink:href="#linearGradient3168"
id="linearGradient3174"
x1="386.89221"
y1="703.53375"
x2="386.89221"
y2="252.50571"
gradientUnits="userSpaceOnUse" />
<linearGradient
inkscape:collect="always"
xlink:href="#linearGradient3176"
id="linearGradient3182"
x1="387.41043"
y1="501.67398"
x2="387.41043"
y2="252.02386"
gradientUnits="userSpaceOnUse" />
<linearGradient
inkscape:collect="always"
xlink:href="#linearGradient3176"
id="linearGradient3035"
gradientUnits="userSpaceOnUse"
x1="387.41043"
y1="501.67398"
x2="387.41043"
y2="252.02386" />
<linearGradient
inkscape:collect="always"
xlink:href="#linearGradient3168"
id="linearGradient3037"
gradientUnits="userSpaceOnUse"
x1="386.89221"
y1="703.53375"
x2="386.89221"
y2="252.50571" />
</defs>
<sodipodi:namedview
id="base"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageopacity="0.0"
inkscape:pageshadow="2"
inkscape:zoom="1.0508882"
inkscape:cx="-90.42008"
inkscape:cy="71.977333"
inkscape:document-units="px"
inkscape:current-layer="layer1"
showguides="true"
inkscape:guide-bbox="true"
inkscape:window-width="1280"
inkscape:window-height="979"
inkscape:window-x="0"
inkscape:window-y="33"
showgrid="false"
fit-margin-top="0"
fit-margin-left="0"
fit-margin-right="0"
fit-margin-bottom="0"
inkscape:window-maximized="0">
<sodipodi:guide
orientation="vertical"
position="-153.50258,-506.94648"
id="guide3191" />
<sodipodi:guide
orientation="vertical"
position="274.4401,-506.94648"
id="guide3193" />
<sodipodi:guide
orientation="horizontal"
position="-323.06477,409.4968"
id="guide3195" />
<sodipodi:guide
orientation="horizontal"
position="-323.06477,211.67424"
id="guide3197" />
<sodipodi:guide
orientation="horizontal"
position="-323.06477,9.814489"
id="guide3199" />
</sodipodi:namedview>
<metadata
id="metadata7">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
</cc:Work>
</rdf:RDF>
</metadata>
<g
inkscape:label="Ebene 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-323.06477,-417.41394)">
<g
id="g3015"
transform="matrix(0.20679483,0,0,0.21391708,307.0229,378.43143)">
<g
transform="translate(3.581054,-461.3231)"
id="g5213">
<path
sodipodi:type="arc"
style="fill:none;stroke:#ffffff;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
id="path4201"
sodipodi:cx="129.19025"
sodipodi:cy="125.15305"
sodipodi:rx="82.08963"
sodipodi:ry="82.08963"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
transform="translate(41.899029,623.49247)" />
<path
sodipodi:type="arc"
style="fill:none;stroke:#ffffff;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
id="path4203"
sodipodi:cx="129.19025"
sodipodi:cy="125.15305"
sodipodi:rx="82.08963"
sodipodi:ry="82.08963"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
transform="translate(41.899029,821.58422)" />
<path
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
sodipodi:ry="82.08963"
sodipodi:rx="82.08963"
sodipodi:cy="125.15305"
sodipodi:cx="129.19025"
id="path4205"
style="fill:none;stroke:#ffffff;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
sodipodi:type="arc"
transform="translate(41.899029,1019.6759)" />
<path
sodipodi:type="arc"
style="fill:none;stroke:#ffffff;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
id="path4209"
sodipodi:cx="129.19025"
sodipodi:cy="125.15305"
sodipodi:rx="82.08963"
sodipodi:ry="82.08963"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
transform="translate(466.69074,623.49247)" />
<path
sodipodi:type="arc"
style="fill:none;stroke:#ffffff;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
id="path4211"
sodipodi:cx="129.19025"
sodipodi:cy="125.15305"
sodipodi:rx="82.08963"
sodipodi:ry="82.08963"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
transform="translate(466.69074,821.58422)" />
<path
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
sodipodi:ry="82.08963"
sodipodi:rx="82.08963"
sodipodi:cy="125.15305"
sodipodi:cx="129.19025"
id="path4213"
style="fill:none;stroke:#ffffff;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
sodipodi:type="arc"
transform="translate(466.69074,1019.6759)" />
<path
style="fill:none;stroke:#ffffff;stroke-width:50;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none"
d="m 168.66696,746.8288 152.06768,0 2.85473,123.09483 c 56.19521,-0.39952 52.33668,51.2956 53.28826,76.46761 1.35196,23.3996 12.75809,72.52216 -57.09457,73.61286 l 0.95158,127.8527 277.22073,0"
id="path4215"
sodipodi:nodetypes="ccccccc"
inkscape:connector-curvature="0" />
<path
style="fill:none;stroke:#ffffff;stroke-width:50;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none"
d="m 168.66696,945.99709 278.56646,0 0,-199.65012 151.75839,0"
id="path4217"
inkscape:connector-curvature="0" />
</g>
<g
id="g3012">
<path
sodipodi:type="arc"
style="fill:none;stroke:#3a78be;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
id="path2160"
sodipodi:cx="129.19025"
sodipodi:cy="125.15305"
sodipodi:rx="82.08963"
sodipodi:ry="82.08963"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
transform="translate(45.480079,154.16937)" />
</g>
<path
transform="translate(45.480079,352.26112)"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
sodipodi:ry="82.08963"
sodipodi:rx="82.08963"
sodipodi:cy="125.15305"
sodipodi:cx="129.19025"
id="path3140"
style="fill:none;stroke:#3a78be;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
sodipodi:type="arc" />
<path
transform="translate(45.480079,550.35281)"
sodipodi:type="arc"
style="fill:none;stroke:#3a78be;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
id="path3142"
sodipodi:cx="129.19025"
sodipodi:cy="125.15305"
sodipodi:rx="82.08963"
sodipodi:ry="82.08963"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z" />
<path
transform="translate(470.27179,154.16937)"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
sodipodi:ry="82.08963"
sodipodi:rx="82.08963"
sodipodi:cy="125.15305"
sodipodi:cx="129.19025"
id="path3151"
style="fill:none;stroke:#3a78be;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
sodipodi:type="arc" />
<path
transform="translate(470.27179,352.26112)"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z"
sodipodi:ry="82.08963"
sodipodi:rx="82.08963"
sodipodi:cy="125.15305"
sodipodi:cx="129.19025"
id="path3153"
style="fill:none;stroke:#3a78be;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
sodipodi:type="arc" />
<path
transform="translate(470.27179,550.35281)"
sodipodi:type="arc"
style="fill:none;stroke:#3a78be;stroke-width:30;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none;stroke-dashoffset:0"
id="path3155"
sodipodi:cx="129.19025"
sodipodi:cy="125.15305"
sodipodi:rx="82.08963"
sodipodi:ry="82.08963"
d="m 211.27988,125.15305 c 0,45.33685 -36.75278,82.08963 -82.08963,82.08963 -45.336854,0 -82.089634,-36.75278 -82.089634,-82.08963 0,-45.336848 36.75278,-82.089627 82.089634,-82.089627 45.33685,0 82.08963,36.752779 82.08963,82.089627 z" />
<path
id="path3203"
d="m 172.24801,476.67399 278.56646,0 0,-199.65012 151.75839,0"
style="fill:none;stroke:url(#linearGradient3035);stroke-width:50;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none"
inkscape:connector-curvature="0" />
<path
sodipodi:nodetypes="ccccccc"
id="path3201"
d="m 172.24801,277.5057 152.06768,0 2.85473,123.09483 c 45.72787,0.55206 48.53038,48.44087 53.28826,76.46761 -1.50277,23.3996 -4.37028,72.52219 -57.09457,73.61289 l 0.95158,127.85271 277.22073,0"
style="fill:none;stroke:url(#linearGradient3037);stroke-width:50;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none"
inkscape:connector-curvature="0" />
</g>
</g>
</svg>

(Rendered size of imgsrc/plugboard.svg: 14 KiB)


@@ -59,14 +59,48 @@ function render_book(book) {
title = title.slice(0, title.length-2);
title += '&nbsp;({0}&nbsp;MB)&nbsp;'.format(size);
}
if (tags) title += 'Tags=[{0}] '.format(tags);
title += '<span class="tagdata_short" style="display:all">'
if (tags) {
t = tags.split(':&:', 2);
m = parseInt(t[0]);
tall = t[1].split(',');
t = t[1].split(',', m);
if (tall.length > m) t[m] = '...'
title += 'Tags=[{0}] '.format(t.join(','));
}
custcols = book.attr("custcols").split(',')
for ( i = 0; i < custcols.length; i++) {
if (custcols[i].length > 0) {
vals = book.attr(custcols[i]).split(':#:', 2);
if (vals[0].indexOf('#T#') == 0) { //startswith
vals[0] = vals[0].substr(3, vals[0].length)
t = vals[1].split(':&:', 2);
m = parseInt(t[0]);
t = t[1].split(',', m);
if (t.length == m) t[m] = '...';
vals[1] = t.join(',');
}
title += '{0}=[{1}] '.format(vals[0], vals[1]);
}
}
title += '</span>'
title += '<span class="tagdata_long" style="display:none">'
if (tags) {
t = tags.split(':&:', 2);
title += 'Tags=[{0}] '.format(t[1]);
}
custcols = book.attr("custcols").split(',')
for ( i = 0; i < custcols.length; i++) {
if (custcols[i].length > 0) {
vals = book.attr(custcols[i]).split(':#:', 2);
if (vals[0].indexOf('#T#') == 0) { //startswith
vals[0] = vals[0].substr(3, vals[0].length)
vals[1] = (vals[1].split(':&:', 2))[1];
}
title += '{0}=[{1}] '.format(vals[0], vals[1]);
}
}
title += '</span>'
title += '<img style="display:none" alt="" src="get/cover/{0}" /></span>'.format(id);
title += '<div class="comments">{0}</div>'.format(comments)
// Render authors cell
@@ -170,11 +204,15 @@ function fetch_library_books(start, num, timeout, sort, order, search) {
var cover = row.find('img').attr('src');
var collapsed = row.find('.comments').css('display') == 'none';
$("#book_list tbody tr * .comments").css('display', 'none');
$("#book_list tbody tr * .tagdata_short").css('display', 'inherit');
$("#book_list tbody tr * .tagdata_long").css('display', 'none');
$('#cover_pane').css('visibility', 'hidden');
if (collapsed) {
row.find('.comments').css('display', 'inherit');
$('#cover_pane img').attr('src', cover);
$('#cover_pane').css('visibility', 'visible');
row.find(".tagdata_short").css('display', 'none');
row.find(".tagdata_long").css('display', 'inherit');
}
});


@@ -13,6 +13,8 @@
font-size: 1.25em;
border: 1px solid black;
text-color: black;
text-decoration: none;
margin-right: 0.5em;
background-color: #ddd;
border-top: 1px solid ThreeDLightShadow;
border-right: 1px solid ButtonShadow;
@@ -70,6 +72,7 @@ div.navigation {
padding-right: 0em;
overflow: hidden;
text-align: center;
text-decoration: none;
}
#logo {


@@ -83,6 +83,16 @@ title_series_sorting = 'library_order'
# strictly_alphabetic, it would remain "The Client".
save_template_title_series_sorting = 'library_order'
# Set the list of words that are to be considered 'articles' when computing the
# title sort strings. The list is a regular expression, with the articles
# separated by 'or' bars. Comparisons are case insensitive, and that cannot be
# changed. Changes to this tweak won't have an effect until the book is modified
# in some way. If you enter an invalid pattern, it is silently ignored.
# To disable use the expression: '^$'
# Default: '^(A|The|An)\s+'
title_sort_articles=r'^(A|The|An)\s+'
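# Illustration (an editorial assumption, not part of the shipped defaults):
# to also treat the German articles 'Der', 'Die' and 'Das' as articles, a
# pattern like the following could be used:
# title_sort_articles=r'^(A|The|An|Der|Die|Das)\s+'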
# Specify a folder that calibre should connect to at startup using
# connect_to_folder. This must be a full path to the folder. If the folder does
# not exist when calibre starts, it is ignored. If there are '\' characters in
@@ -93,6 +103,37 @@ save_template_title_series_sorting = 'library_order'
auto_connect_to_folder = ''
# Specify renaming rules for sony collections. Collections on Sonys are named
# depending upon whether the field is standard or custom. A collection derived
# from a standard field is named for the value in that field. For example, if
# the standard 'series' column contains the name 'Darkover', then the series
# will be named 'Darkover'. A collection derived from a custom field will have
# the name of the field added to the value. For example, if a custom series
# column named 'My Series' contains the name 'Darkover', then the collection
# will be named 'Darkover (My Series)'. If two books have fields that generate
# the same collection name, then both books will be in that collection. This
# tweak lets you specify for a standard or custom field the value to be put
# inside the parentheses. You can use it to add a parenthetical description to a
# standard field, for example 'Foo (Tag)' instead of the 'Foo'. You can also use
# it to force multiple fields to end up in the same collection. For example, you
# could force the values in 'series', '#my_series_1', and '#my_series_2' to
# appear in collections named 'some_value (Series)', thereby merging all of the
# fields into one set of collections. The syntax of this tweak is
# {'field_lookup_name':'name_to_use', 'lookup_name':'name', ...}
# Example 1: I want three series columns to be merged into one set of
# collections. If the column lookup names are 'series', '#series_1' and
# '#series_2', and if I want nothing in the parenthesis, then the value to use
# in the tweak value would be:
# sony_collection_renaming_rules={'series':'', '#series_1':'', '#series_2':''}
# Example 2: I want the word '(Series)' to appear on collections made from
# series, and the word '(Tag)' to appear on collections made from tags. Use:
# sony_collection_renaming_rules={'series':'Series', 'tags':'Tag'}
# Example 3: I want 'series' and '#myseries' to be merged, and for the
# collection name to have '(Series)' appended. The renaming rule is:
# sony_collection_renaming_rules={'series':'Series', '#myseries':'Series'}
sony_collection_renaming_rules={}
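# A rough sketch of how a renaming rule resolves, per the description above
# (illustrative pseudo-code with hypothetical helpers, not the driver's code):
# def sony_collection_name(value, field, rules):
#     if field in rules:
#         label = rules[field]
#         return value if not label else '%s (%s)' % (value, label)
#     # default: standard fields use the bare value, custom fields append
#     # their column name in parentheses
#     return value if is_standard_field(field) else '%s (%s)' % (value, column_name(field))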
# Create search terms to apply a query across several built-in search terms.
# Syntax: {'new term':['existing term 1', 'term 2', ...], 'new':['old'...] ...}
# Example: create the term 'myseries' that when used as myseries:foo would
@@ -114,6 +155,24 @@ add_new_book_tags_when_importing_books = False
# Set the maximum number of tags to show per book in the content server
max_content_server_tags_shown=5
# Set custom metadata fields that the content server will or will not display.
# content_server_will_display is a list of custom fields to be displayed.
# content_server_wont_display is a list of custom fields not to be displayed.
# wont_display has priority over will_display.
# The special value '*' means all custom fields.
# Defaults:
# content_server_will_display = ['*']
# content_server_wont_display = ['']
# Examples:
# To display only the custom fields #mytags and #genre:
# content_server_will_display = ['#mytags', '#genre']
# content_server_wont_display = ['']
# To display all fields except #mycomments:
# content_server_will_display = ['*']
# content_server_wont_display = ['#mycomments']
content_server_will_display = ['*']
content_server_wont_display = ['']
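# Roughly how the two lists combine (an illustrative sketch with a
# hypothetical helper name, not calibre's actual code):
# def custom_field_is_displayed(field, will_display, wont_display):
#     shown  = '*' in will_display or field in will_display
#     hidden = '*' in wont_display or field in wont_display
#     return shown and not hidden   # wont_display has priority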
# Set the maximum number of sort 'levels' that calibre will use to resort the
# library after certain operations such as searches or device insertion. Each

Several binary image files changed (contents not shown): new images of 5.0 KiB, 664 B, 625 B, 492 B, 696 B and 13 KiB; one existing image grew from 2.5 KiB to 16 KiB.


@@ -16,6 +16,7 @@ class AdvancedUserRecipe1282101454(BasicNewsRecipe):
title = 'The AJC'
timefmt = ' [%a,%d %B %Y %I:%M %p]'
__author__ = 'TonytheBookworm'
language = 'en'
description = 'News from Atlanta and USA'
publisher = 'The Atlanta Journal'
category = 'news, politics, USA'


@@ -0,0 +1,32 @@
__license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal <kovid at kovidgoyal.net>'
'''
Fetch Anandtech.
'''
from calibre.web.feeds.news import BasicNewsRecipe
class anan(BasicNewsRecipe):
title = 'Anandtech'
description = 'comprehensive Hardware Tests'
__author__ = 'Oliver Niesner'
use_embedded_content = False
language = 'en'
timefmt = ' [%d %b %Y]'
max_articles_per_feed = 40
no_stylesheets = True
remove_javascript = True
encoding = 'utf-8'
remove_tags=[dict(name='a', attrs={'style':'width:110px; margin-top:0px;text-align:center;'}),
dict(name='a', attrs={'style':'width:110px; margin-top:0px; margin-right:20px;text-align:center;'})]
feeds = [ ('Anandtech', 'http://www.anandtech.com/rss/')]
def print_version(self,url):
return url.replace('/show/', '/print/')


@@ -0,0 +1,65 @@
__license__ = 'GPL v3'
__copyright__ = '2010, limawhiskey <limawhiskey at gmail.com>'
'''
news.bbc.co.uk/sport/
'''
import re
from calibre.web.feeds.recipes import BasicNewsRecipe
class BBC(BasicNewsRecipe):
title = 'BBC Sport'
__author__ = 'limawhiskey, Darko Miletic, Starson17'
description = 'Sports news from the UK. A fast version that does not download pictures'
oldest_article = 2
max_articles_per_feed = 100
no_stylesheets = True
use_embedded_content = False
encoding = 'utf8'
publisher = 'BBC'
category = 'sport, news, UK, world'
language = 'en_GB'
publication_type = 'newsportal'
extra_css = ' body{ font-family: Verdana,Helvetica,Arial,sans-serif } .introduction{font-weight: bold} .story-feature{display: block; padding: 0; border: 1px solid; width: 40%; font-size: small} .story-feature h2{text-align: center; text-transform: uppercase} '
preprocess_regexps = [(re.compile(r'<!--.*?-->', re.DOTALL), lambda m: '')]
conversion_options = {
'comments' : description
,'tags' : category
,'language' : language
,'publisher' : publisher
,'linearize_tables': True
}
keep_only_tags = [
dict(name='div', attrs={'class':['ds','mxb']}),
dict(attrs={'class':['story-body','storybody']})
]
remove_tags = [
dict(name='div', attrs={'class':['storyextra', 'share-help', 'embedded-hyper', \
'story-feature wide ', 'story-feature narrow', 'cap', 'caption', 'q1', 'sihf', \
'mva', 'videoInStoryC', 'sharesb', 'mvtb']}),
dict(name=['img']), dict(name=['br'])
]
remove_attributes = ['width','height']
feeds = [
('Sport Front Page', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/front_page/rss.xml'),
('Football', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/football/rss.xml'),
('Cricket', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/cricket/rss.xml'),
('Formula 1', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/motorsport/formula_one/rss.xml'),
('Commonwealth Games', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/commonwealth_games/delhi_2010/rss.xml'),
('Golf', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/golf/rss.xml'),
('Rugby Union', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/rugby_union/rss.xml'),
('Rugby League', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/rugby_league/rss.xml'),
('Tennis', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/tennis/rss.xml'),
('Motorsport', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/motorsport/rss.xml'),
('Boxing', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/boxing/rss.xml'),
('Athletics', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/athletics/rss.xml'),
('Snooker', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/other_sports/snooker/rss.xml'),
('Horse Racing', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/other_sports/horse_racing/rss.xml'),
('Cycling', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/other_sports/cycling/rss.xml'),
('Disability Sport', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/other_sports/disability_sport/rss.xml'),
('Other Sport', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/other_sports/rss.xml'),
('Olympics 2012', 'http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/olympics/london_2012/rss.xml'),
]


@@ -0,0 +1,37 @@
import datetime
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1286242553(BasicNewsRecipe):
title = u'CACM'
oldest_article = 7
max_articles_per_feed = 100
needs_subscription = True
feeds = [(u'CACM', u'http://cacm.acm.org/magazine.rss')]
language = 'en'
__author__ = 'jonmisurda'
no_stylesheets = True
remove_tags = [
dict(name='div', attrs={'class':['FeatureBox', 'ArticleComments', 'SideColumn', \
'LeftColumn', 'RightColumn', 'SiteSearch', 'MainNavBar','more', 'SubMenu', 'inner']})
]
cover_url_pattern = 'http://cacm.acm.org/magazines/%d/%d'
def get_browser(self):
br = BasicNewsRecipe.get_browser()
if self.username is not None and self.password is not None:
br.open('https://cacm.acm.org/login')
br.select_form(nr=1)
br['current_member[user]'] = self.username
br['current_member[passwd]'] = self.password
br.submit()
return br
def get_cover_url(self):
now = datetime.datetime.now()
cover_url = None
soup = self.index_to_soup(self.cover_url_pattern % (now.year, now.month))
cover_item = soup.find('img',attrs={'alt':'magazine cover image'})
if cover_item:
cover_url = cover_item['src']
return cover_url


@@ -26,7 +26,7 @@ class AdvancedUserRecipe1278162597(BasicNewsRecipe):
remove_javascript = True
use_embedded_content = False
no_stylesheets = True
language = 'zh-cn'
language = 'zh_CN'
encoding = 'gb2312'
conversion_options = {'linearize_tables':True}


@@ -0,0 +1,40 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008, Lionel Bergeret <lbergeret at gmail.com>'
'''
cinebel.be
'''
from calibre.web.feeds.news import BasicNewsRecipe
class Cinebel(BasicNewsRecipe):
title = u'Cinebel'
__author__ = u'Lionel Bergeret'
description = u'Cinema news from Belgium in French'
publisher = u'cinebel.be'
category = 'news, cinema, movie, Belgium'
oldest_article = 3
encoding = 'utf8'
language = 'fr_BE'
max_articles_per_feed = 20
no_stylesheets = True
use_embedded_content = False
timefmt = ' [%d %b %Y]'
keep_only_tags = [
dict(name = 'span', attrs = {'class': 'movieMainTitle'})
,dict(name = 'div', attrs = {'id': 'filmPoster'})
,dict(name = 'div', attrs = {'id': 'filmDefinition'})
,dict(name = 'div', attrs = {'id': 'synopsis'})
]
feeds = [
(u'Les sorties de la semaine' , u'http://www.cinebel.be/Servlets/RssServlet?languageCode=fr&rssType=0' )
,(u'Top 10' , u'http://www.cinebel.be/Servlets/RssServlet?languageCode=fr&rssType=2' )
]
def get_cover_url(self):
cover_url = 'http://www.cinebel.be/portal/resources/common/logo_index.gif'
return cover_url


@@ -0,0 +1,39 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008, Lionel Bergeret <lbergeret at gmail.com>'
'''
dhnet.be
'''
from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe
class DHNetBe(BasicNewsRecipe):
title = u'La Derniere Heure'
__author__ = u'Lionel Bergeret'
description = u'News from Belgium in French'
publisher = u'dhnet.be'
category = 'news, Belgium'
oldest_article = 3
language = 'fr_BE'
max_articles_per_feed = 20
no_stylesheets = True
use_embedded_content = False
timefmt = ' [%d %b %Y]'
keep_only_tags = [
dict(name = 'div', attrs = {'id': 'articleText'})
,dict(name = 'div', attrs = {'id': 'articlePicureAndLinks'})
]
feeds = [
(u'La Une' , u'http://www.dhnet.be/rss' )
,(u'La Une Sports' , u'http://www.dhnet.be/rss/dhsports/' )
,(u'La Une Info' , u'http://www.dhnet.be/rss/dhinfos/' )
]
def get_cover_url(self):
cover_url = strftime('http://pdf-online.dhnet.be/pdfonline/image/%Y%m%d/dh_%Y%m%d_nam_infoge_001.pdf.L.jpg')
return cover_url


@@ -0,0 +1,63 @@
__license__ = 'GPL v3'
__copyright__ = '2010, Darko Miletic <darko.miletic at gmail.com>'
'''
eluniverso.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
class ElUniverso_Ecuador(BasicNewsRecipe):
title = 'El Universo - Ecuador'
__author__ = 'Darko Miletic'
description = 'Noticias del Ecuador y el resto del mundo'
publisher = 'El Universo'
category = 'news, politics, Ecuador'
oldest_article = 2
max_articles_per_feed = 200
no_stylesheets = True
encoding = 'utf8'
use_embedded_content = False
language = 'es'
remove_empty_feeds = True
publication_type = 'newspaper'
masthead_url = 'http://servicios2.eluniverso.com/versiones/v1/img/Hd/lg_ElUniverso.gif'
extra_css = """
body{font-family: Verdana,Arial,Helvetica,sans-serif; color: #333333 }
h2{font-family: Georgia,"Times New Roman",Times,serif; color: #1B2D60}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
remove_tags = [
dict(attrs={'class':['flechs','multiBox','colRecursos']})
,dict(name=['meta','link','embed','object','iframe','base'])
]
keep_only_tags = [dict(attrs={'class':'Nota'})]
remove_tags_after = dict(attrs={'id':'TextoPrint'})
remove_tags_before = dict(attrs={'id':'FechaPrint'})
feeds = [
(u'Portada' , u'http://www.eluniverso.com/rss/portada.xml' )
,(u'Politica' , u'http://www.eluniverso.com/rss/politica.xml' )
,(u'Economia' , u'http://www.eluniverso.com/rss/economia.xml' )
,(u'Sucesos' , u'http://www.eluniverso.com/rss/sucesos.xml' )
,(u'Migracion' , u'http://www.eluniverso.com/rss/migrantes_tema.xml' )
,(u'El Pais' , u'http://www.eluniverso.com/rss/elpais.xml' )
,(u'Internacionales' , u'http://www.eluniverso.com/rss/internacionales.xml' )
,(u'Deportes' , u'http://www.eluniverso.com/rss/deportes.xml' )
,(u'Gran Guayaquill' , u'http://www.eluniverso.com/rss/gran_guayaquil.xml' )
,(u'Entretenimiento' , u'http://www.eluniverso.com/rss/arteyespectaculos.xml' )
,(u'Vida' , u'http://www.eluniverso.com/rss/tuvida.xml' )
,(u'Opinion' , u'http://www.eluniverso.com/rss/opinion.xml' )
]
def preprocess_html(self, soup):
for item in soup.findAll(style=True):
del item['style']
return soup


@@ -0,0 +1,20 @@
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1286351181(BasicNewsRecipe):
title = u'gsp.ro'
__author__ = 'bucsie'
oldest_article = 2
max_articles_per_feed = 100
language='ro'
cover_url ='http://www.gsp.ro/images/sigla_rosu.jpg'
remove_tags = [
dict(name='div', attrs={'class':['related_articles', 'articol_noteaza straight_line dotted_line_top', 'comentarii','mai_multe_articole']}),
dict(name='div', attrs={'id':'icons'})
]
remove_tags_after = dict(name='div', attrs={'id':'adoceanintactrovccmgpmnyt'})
feeds = [(u'toate stirile', u'http://www.gsp.ro/index.php?section=section&screen=rss')]
def print_version(self, url):
return 'http://www1.gsp.ro/print/' + url[(url.rindex('/')+1):]


@@ -8,10 +8,16 @@ www.guardian.co.uk
'''
from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe
from datetime import date
class Guardian(BasicNewsRecipe):
title = u'The Guardian'
title = u'The Guardian / The Observer'
if date.today().weekday() == 6:
base_url = "http://www.guardian.co.uk/theobserver"
else:
base_url = "http://www.guardian.co.uk/theguardian"
__author__ = 'Seabound and Sujata Raman'
language = 'en_GB'
@@ -19,6 +25,10 @@ class Guardian(BasicNewsRecipe):
max_articles_per_feed = 100
remove_javascript = True
# List of section titles to ignore
# For example: ['Sport']
ignore_sections = []
timefmt = ' [%a, %d %b %Y]'
keep_only_tags = [
dict(name='div', attrs={'id':["content","article_header","main-article-info",]}),
@@ -28,6 +38,7 @@ class Guardian(BasicNewsRecipe):
dict(name='div', attrs={'id':["article-toolbox","subscribe-feeds",]}),
dict(name='ul', attrs={'class':["pagination"]}),
dict(name='ul', attrs={'id':["content-actions"]}),
#dict(name='img'),
]
use_embedded_content = False
@@ -43,18 +54,6 @@ class Guardian(BasicNewsRecipe):
#match-stats-summary{font-size:small; font-family:Arial,Helvetica,sans-serif;font-weight:normal;}
'''
feeds = [
('Front Page', 'http://www.guardian.co.uk/rss'),
('Business', 'http://www.guardian.co.uk/business/rss'),
('Sport', 'http://www.guardian.co.uk/sport/rss'),
('Culture', 'http://www.guardian.co.uk/culture/rss'),
('Money', 'http://www.guardian.co.uk/money/rss'),
('Life & Style', 'http://www.guardian.co.uk/lifeandstyle/rss'),
('Travel', 'http://www.guardian.co.uk/travel/rss'),
('Environment', 'http://www.guardian.co.uk/environment/rss'),
('Comment','http://www.guardian.co.uk/commentisfree/rss'),
]
def get_article_url(self, article):
url = article.get('guid', None)
if '/video/' in url or '/flyer/' in url or '/quiz/' in url or \
@@ -76,7 +75,8 @@ class Guardian(BasicNewsRecipe):
return soup
def find_sections(self):
soup = self.index_to_soup('http://www.guardian.co.uk/theguardian')
# soup = self.index_to_soup("http://www.guardian.co.uk/theobserver")
soup = self.index_to_soup(self.base_url)
# find cover pic
img = soup.find( 'img',attrs ={'alt':'Guardian digital edition'})
if img is not None:
@@ -113,13 +113,10 @@ class Guardian(BasicNewsRecipe):
try:
feeds = []
for title, href in self.find_sections():
feeds.append((title, list(self.find_articles(href))))
if not title in self.ignore_sections:
feeds.append((title, list(self.find_articles(href))))
return feeds
except:
raise NotImplementedError
def postprocess_html(self,soup,first):
return soup.findAll('html')[0]


@@ -30,21 +30,33 @@ class hnaDe(BasicNewsRecipe):
dict(id='superbanner'),
dict(id='navigation'),
dict(id='skyscraper'),
dict(id='idNavigationWrap'),
dict(id='idHeaderSearchForm'),
dict(id='idLoginBarWrap'),
dict(id='idAccountButtons'),
dict(id='idHeadButtons'),
dict(id='idBoxesWrap'),
dict(id=''),
dict(name='span'),
dict(name='ul', attrs={'class':'linklist'}),
dict(name='a', attrs={'href':'#'}),
dict(name='div', attrs={'class':'hlist'}),
dict(name='li', attrs={'class':'idButton idIsLoginGroup idHeaderRegister '}),
dict(name='li', attrs={'class':'idVideoBar idFirst'}),
dict(name='li', attrs={'class':'idSetStartPageLink idLast'}),
dict(name='li', attrs={'class':'idKinderNetzBar idLast'}),
dict(name='li', attrs={'class':'idFotoBar '}),
dict(name='div', attrs={'class':'subc noprint'}),
dict(name='div', attrs={'class':'idBreadcrumb'}),
dict(name='div', attrs={'class':'idLay idAdvertising idClStandard '}),
dict(name='span', attrs={'class':'idHeadLineIntro'}),
dict(name='p', attrs={'class':'breadcrumb'}),
dict(name='a', attrs={'style':'cursor:hand'}),
dict(name='p', attrs={'class':'h5'})]
dict(name='p', attrs={'class':'h5'}),
dict(name='p', attrs={'class':'idMoreEnd'})]
#remove_tags_after = [dict(name='div', attrs={'class':'rahmenbreaking'})]
remove_tags_after = [dict(name='a', attrs={'href':'#'})]
remove_tags_after = [dict(name='p', attrs={'class':'idMoreEnd'})]
feeds = [ ('hna_soehre', 'http://feeds2.feedburner.com/hna/soehre'),
('hna_kassel', 'http://feeds2.feedburner.com/hna/kassel') ]


@@ -0,0 +1,30 @@
from calibre.web.feeds.news import BasicNewsRecipe
class AdvancedUserRecipe1286477122(BasicNewsRecipe):
title = u'Il Fatto Quotidiano'
oldest_article = 7
max_articles_per_feed = 25
language = 'it'
__author__ = 'egilh'
feeds = [
(u'Politica & Palazzo', u'http://www.ilfattoquotidiano.it/category/politica-palazzo/feed/'),
(u'Giustizia & impunit\xe0', u'http://www.ilfattoquotidiano.it/category/giustizia-impunita/feed/'),
(u'Media & regime', u'http://www.ilfattoquotidiano.it/category/media-regime/feed/'),
(u'Economia & Lobby', u'http://www.ilfattoquotidiano.it/category/economia-lobby/feed/'),
(u'Lavoro & precari', u'http://www.ilfattoquotidiano.it/category/lavoro-precari/feed/'),
(u'Ambiente & Veleni', u'http://www.ilfattoquotidiano.it/category/ambiente-veleni/feed/'),
(u'Sport & miliardi', u'http://www.ilfattoquotidiano.it/category/sport-miliardi/feed/'),
(u'Cronaca', u'http://www.ilfattoquotidiano.it/category/cronaca/feed/'),
(u'Mondo', u'http://www.ilfattoquotidiano.it/category/mondo/feed/'),
(u'Societ\xe0', u'http://www.ilfattoquotidiano.it/category/societa/feed/'),
(u'Scuola', u'http://www.ilfattoquotidiano.it/category/scuola/feed/'),
(u'Tecno', u'http://www.ilfattoquotidiano.it/category/tecno/feed/'),
(u'Terza pagina', u'http://www.ilfattoquotidiano.it/category/terza-pagina/feed/'),
(u'Piacere quotidiano', u'http://www.ilfattoquotidiano.it/category/piacere-quotidiano/feed/'),
(u'Cervelli in fuga', u'http://www.ilfattoquotidiano.it/category/cervelli-in-fuga/feed/'),
(u'Documentati!', u'http://www.ilfattoquotidiano.it/category/documentati/feed/'),
(u'Misfatto', u'http://www.ilfattoquotidiano.it/category/misfatto/feed/')
]


@@ -11,6 +11,7 @@ class AdvancedUserRecipe1283666183(BasicNewsRecipe):
title = u'Journal Gazette Ft. Wayne IN'
__author__ = 'cynvision'
oldest_article = 1
language = 'en'
max_articles_per_feed = 8
no_stylesheets = True
remove_javascript = True


@@ -0,0 +1,43 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008, Lionel Bergeret <lbergeret at gmail.com>'
'''
lalibre.be
'''
from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe
class LaLibre(BasicNewsRecipe):
title = u'La Libre Belgique'
__author__ = u'Lionel Bergeret'
description = u'News from Belgium in French'
publisher = u'lalibre.be'
category = 'news, Belgium'
oldest_article = 3
language = 'fr_BE'
max_articles_per_feed = 20
no_stylesheets = True
use_embedded_content = False
timefmt = ' [%d %b %Y]'
keep_only_tags = [
dict(name = 'div', attrs = {'id': 'articleHat'})
,dict(name = 'p', attrs = {'id': 'publicationDate'})
,dict(name = 'div', attrs = {'id': 'articleText'})
]
feeds = [
(u'L\'actu' , u'http://www.lalibre.be/rss/?section=10' )
,(u'Culture' , u'http://www.lalibre.be/rss/?section=5' )
,(u'Economie' , u'http://www.lalibre.be/rss/?section=3' )
,(u'Libre Entreprise' , u'http://www.lalibre.be/rss/?section=904' )
,(u'Sports' , u'http://www.lalibre.be/rss/?section=2' )
,(u'Societe' , u'http://www.lalibre.be/rss/?section=12' )
]
def get_cover_url(self):
cover_url = strftime('http://pdf-online.lalibre.be/pdfonline/image/%Y%m%d/llb_%Y%m%d_nam_libre_001.pdf.L.jpg')
return cover_url


@@ -0,0 +1,54 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008, Lionel Bergeret <lbergeret at gmail.com>'
'''
lameuse.be
'''
from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe
class LaMeuse(BasicNewsRecipe):
title = u'La Meuse'
__author__ = u'Lionel Bergeret'
description = u'News from Belgium in French'
publisher = u'lameuse.be'
category = 'news, Belgium'
oldest_article = 3
encoding = 'utf8'
language = 'fr_BE'
max_articles_per_feed = 20
no_stylesheets = True
use_embedded_content = False
timefmt = ' [%d %b %Y]'
keep_only_tags = [
dict(name = 'div', attrs = {'id': 'article'})
]
remove_tags = [
dict(name = 'div', attrs = {'class': 'sb-group'})
,dict(name = 'div', attrs = {'id': 'share'})
,dict(name = 'div', attrs = {'id': 'commentaires'})
]
feeds = [
(u'Actualite', u'http://www.lameuse.be/services/fils_rss/actualite/index.xml' )
,(u'Belgique', u'http://www.lameuse.be/services/fils_rss/actualite/belgique/index.xml' )
,(u'Monde', u'http://www.lameuse.be/services/fils_rss/actualite/monde/index.xml' )
,(u'Societe', u'http://www.lameuse.be/services/fils_rss/actualite/societe/index.xml' )
,(u'Faits Divers', u'http://www.lameuse.be/services/fils_rss/actualite/faits_divers/index.xml' )
,(u'Economie', u'http://www.lameuse.be/services/fils_rss/actualite/economie/index.xml' )
,(u'Science', u'http://www.lameuse.be/services/fils_rss/actualite/science/index.xml' )
,(u'Sante', u'http://www.lameuse.be/services/fils_rss/actualite/sante/index.xml' )
,(u'Insolite', u'http://www.lameuse.be/services/fils_rss/magazine/insolite/index.xml' )
,(u'Cinema', u'http://www.lameuse.be/services/fils_rss/culture/cinema/index.xml' )
,(u'Musique', u'http://www.lameuse.be/services/fils_rss/culture/musique/index.xml' )
,(u'Livres', u'http://www.lameuse.be/services/fils_rss/culture/livres/index.xml' )
]
def get_cover_url(self):
cover_url = strftime('http://pdf.lameuse.be/pdf/lameuse_%Y-%m-%d_LIEG_ACTUALITE_1.PDF')
return cover_url


@@ -0,0 +1,40 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008, Lionel Bergeret <lbergeret at gmail.com>'
'''
lavenir.net
'''
from calibre.web.feeds.news import BasicNewsRecipe
class LAvenir(BasicNewsRecipe):
title = u'L\'Avenir'
__author__ = u'Lionel Bergeret'
description = u'News from Belgium in French'
publisher = u'lavenir.net'
category = 'news, Belgium'
oldest_article = 3
encoding = 'utf8'
language = 'fr_BE'
max_articles_per_feed = 20
no_stylesheets = True
use_embedded_content = False
timefmt = ' [%d %b %Y]'
keep_only_tags = [
dict(name = 'div', attrs = {'class': 'photo'})
,dict(name = 'p', attrs = {'class': 'intro'})
,dict(name = 'div', attrs = {'class': 'article-body'})
]
feeds = [
(u'Belgique' , u'http://www.lavenir.net/rss.aspx?foto=1&intro=1&section=info&info=df156511-c24f-4f21-81c3-a5d439a9cf4b' )
,(u'Monde' , u'http://www.lavenir.net/rss.aspx?foto=1&intro=1&section=info&info=1642237c-66b9-4e8a-a8c1-288d61fefe7e' )
,(u'Societe' , u'http://www.lavenir.net/rss.aspx?foto=1&intro=1&section=info&info=12e1a2f4-7e03-4cf1-afec-016869072317' )
]
def get_cover_url(self):
cover_url = 'http://www.lavenir.net/extra/Static/journal/Pdf/1/UNE_Nationale.PDF'
return cover_url


@@ -0,0 +1,48 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008, Lionel Bergeret <lbergeret at gmail.com>'
'''
lesoir.be
'''
from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe
class LeSoirBe(BasicNewsRecipe):
title = u'Le Soir'
__author__ = u'Lionel Bergeret'
description = u'News from Belgium in French'
publisher = u'lesoir.be'
category = 'news, Belgium'
oldest_article = 3
language = 'fr_BE'
max_articles_per_feed = 20
no_stylesheets = True
use_embedded_content = False
timefmt = ' [%d %b %Y]'
keep_only_tags = [
dict(name = 'div', attrs = {'id': 'story_head'})
,dict(name = 'div', attrs = {'id': 'story_body'})
]
remove_tags = [
dict(name='form', attrs={'id':'story_actions'})
,dict(name='div', attrs={'id':'sb-share'})
,dict(name='div', attrs={'id':'sb-subscribe'})
]
feeds = [
(u'Belgique' , u'http://www.lesoir.be/actualite/belgique/rss.xml' )
,(u'France' , u'http://www.lesoir.be/actualite/france/rss.xml' )
,(u'Monde' , u'http://www.lesoir.be/actualite/monde/rss.xml' )
,(u'Regions' , u'http://www.lesoir.be/regions/rss.xml' )
,(u'Vie du Net' , u'http://www.lesoir.be/actualite/vie_du_net/rss.xml' )
,(u'Petite Gazette' , u'http://www.lesoir.be/actualite/sciences/rss.xml' )
]
def get_cover_url(self):
cover_url = strftime( 'http://pdf.lesoir.be/pdf/%Y-%m-%d_BRUX_UNE_1.PDF')
return cover_url


@@ -0,0 +1,64 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__author__ = 'Tony Stegall'
__copyright__ = '2010, Tony Stegall or Tonythebookworm on mobileread.com'
__version__ = '1'
__date__ = '01, October 2010'
__docformat__ = 'English'
from calibre.web.feeds.recipes import BasicNewsRecipe
class MedScrape(BasicNewsRecipe):
title = 'MedScape'
__author__ = 'Tony Stegall'
description = 'Nursing News'
language = 'en'
timefmt = ' [%a, %d %b, %Y]'
needs_subscription = True
masthead_url = 'http://images.medscape.com/pi/global/header/sp/bg-sp-medscape.gif'
no_stylesheets = True
remove_javascript = True
conversion_options = {'linearize_tables' : True}
extra_css = '''
h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
p.authors{text-align:right; font-size:small;margin-top:0px;margin-bottom: 0px;}
p.postingdate{text-align:right; font-size:small;margin-top:0px;margin-bottom: 0px;}
h2{text-align:right; font-size:small;margin-top:0px;margin-bottom: 0px;}
p{font-family:Helvetica,Arial,sans-serif;font-size:small;}
'''
remove_tags = [dict(name='div', attrs={'class':['closewindow2']}),
dict(name='div', attrs={'id': ['basicheaderlinks']})
]
def get_browser(self):
br = BasicNewsRecipe.get_browser()
if self.username is not None and self.password is not None:
br.open('https://profreg.medscape.com/px/getlogin.do')
br.select_form(name='LoginForm')
br['userId'] = self.username
br['password'] = self.password
br.submit()
return br
feeds = [
('MedInfo', 'http://www.medscape.com/cx/rssfeeds/2685.xml'),
]
def print_version(self,url):
#the original url is: http://www.medscape.com/viewarticle/728955?src=rss
#the print url is: http://www.medscape.com/viewarticle/728955_print
print_url = url.partition('?')[0] +'_print'
#print 'the printable version is: ',print_url
return print_url
def preprocess_html(self, soup):
for item in soup.findAll(attrs={'style':True}):
del item['style']
return soup


@@ -1,50 +1,57 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008-2009, Darko Miletic <darko.miletic at gmail.com>'
__copyright__ = '2008-2010, Darko Miletic <darko.miletic at gmail.com>'
'''
newyorker.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag
class NewYorker(BasicNewsRecipe):
title = 'The New Yorker'
__author__ = 'Darko Miletic'
description = 'The best of US journalism'
oldest_article = 15
language = 'en'
language = 'en'
max_articles_per_feed = 100
no_stylesheets = True
use_embedded_content = False
publisher = 'Conde Nast Publications'
category = 'news, politics, USA'
encoding = 'cp1252'
publication_type = 'magazine'
masthead_url = 'http://www.newyorker.com/css/i/hed/logo.gif'
extra_css = """
body {font-family: "Times New Roman",Times,serif}
.articleauthor{color: #9F9F9F; font-family: Arial, sans-serif; font-size: small; text-transform: uppercase}
.rubric{color: #CD0021; font-family: Arial, sans-serif; font-size: small; text-transform: uppercase}
"""
keep_only_tags = [dict(name='div', attrs={'id':'printbody'})]
remove_tags_after = dict(name='div',attrs={'id':'articlebody'})
remove_tags = [
dict(name='div', attrs={'class':['utils','articleRailLinks','icons'] })
,dict(name='link')
]
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
feeds = [(u'The New Yorker', u'http://feeds.newyorker.com/services/rss/feeds/everything.xml')]
keep_only_tags = [dict(name='div', attrs={'id':['articleheads','articleRail','articletext','photocredits']})]
remove_tags = [
dict(name=['meta','iframe','base','link','embed','object'])
,dict(name='div', attrs={'class':['utils','articleRailLinks','icons'] })
]
remove_attributes = ['lang']
feeds = [(u'The New Yorker', u'http://feeds.newyorker.com/services/rss/feeds/everything.xml')]
def print_version(self, url):
return url + '?printable=true'
def get_article_url(self, article):
return article.get('guid', None)
def image_url_processor(self, baseurl, url):
return url.strip()
def get_cover_url(self):
cover_url = None
soup = self.index_to_soup('http://www.newyorker.com/magazine/toc/')
cover_item = soup.find('img',attrs={'id':'inThisIssuePhoto'})
if cover_item:
cover_url = 'http://www.newyorker.com' + cover_item['src'].strip()
return cover_url
def postprocess_html(self, soup, x):
body = soup.find('body')
if body:
html = soup.find('html')
if html:
body.extract()
html.insert(2, body)
mcharset = Tag(soup,'meta',[("http-equiv","Content-Type"),("content","text/html; charset=utf-8")])
soup.head.insert(1,mcharset)
return soup

View File

@@ -0,0 +1,46 @@
__license__ = 'GPL v3'
__copyright__ = '2010, Darko Miletic <darko.miletic at gmail.com>'
'''
nightfliersbookspace.blogspot.com
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
class NightfliersBookspace(BasicNewsRecipe):
title = "Nightflier's Bookspace"
__author__ = 'Darko Miletic'
description = 'SF, Fantasy, Books, Knjige'
oldest_article = 35
max_articles_per_feed = 100
language = 'sr'
encoding = 'utf-8'
no_stylesheets = True
use_embedded_content = True
publication_type = 'blog'
cover_url = ''
extra_css = """
@font-face {font-family: "sans1";src:url(res:///opt/sony/ebook/FONT/tt0003m_.ttf)}
body{font-family: "Trebuchet MS",Trebuchet,Verdana,sans1,sans-serif}
.article_description{font-family: sans1, sans-serif}
img{margin-bottom: 0.8em; border: 1px solid #333333; padding: 4px }
"""
conversion_options = {
'comment' : description
, 'tags' : 'SF, fantasy, prevod, blog, Srbija'
, 'publisher': 'Ivan Jovanovic'
, 'language' : language
}
preprocess_regexps = [(re.compile(u'\u0110'), lambda match: u'\u00D0')]
feeds = [(u'Posts', u'http://nightfliersbookspace.blogspot.com/feeds/posts/default')]
def preprocess_html(self, soup):
for item in soup.findAll(style=True):
del item['style']
return self.adeify_images(soup)
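A quick sketch of what the preprocess_regexps rule above does; the font-substitution rationale is an assumption:

# Sketch: U+0110 (Dj, Serbian Latin) is replaced with U+00D0 (Eth), presumably
# a close visual stand-in that the targeted reader fonts can actually render.
import re
print(re.sub(u'\u0110', u'\u00D0', u'\u0110avo'))  # -> u'\xd0avo'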

View File

@@ -12,6 +12,7 @@ class BBC(BasicNewsRecipe):
title = u'The Onion AV Club'
__author__ = 'Stephen Williams'
description = 'Film, Television and Music Reviews'
language = 'en'
no_stylesheets = True
oldest_article = 2
max_articles_per_feed = 100

View File

@@ -0,0 +1,50 @@
__license__ = 'GPL v3'
__copyright__ = '2010, Larry Chan <larry1chan at gmail.com>'
'''
oriental daily
'''
from calibre.web.feeds.recipes import BasicNewsRecipe
class OrientalDaily(BasicNewsRecipe):
title = 'Oriental Daily'
__author__ = 'Larry Chan'
description = 'News from HK'
oldest_article = 2
max_articles_per_feed = 100
simultaneous_downloads = 5
no_stylesheets = True
#delay = 1
use_embedded_content = False
encoding = 'utf8'
publisher = 'Oriental Daily'
category = 'news, HK, world'
language = 'zh'
publication_type = 'newsportal'
extra_css = ' body{ font-family: Verdana,Helvetica,Arial,sans-serif } .introduction{font-weight: bold} .story-feature{display: block; padding: 0; border: 1px solid; width: 40%; font-size: small} .story-feature h2{text-align: center; text-transform: uppercase} '
conversion_options = {
'comments' : description
,'tags' : category
,'language' : language
,'publisher' : publisher
,'linearize_tables': True
}
remove_tags_after = dict(id='bottomNavCTN')
keep_only_tags = [
dict(name='div', attrs={'id':['leadin', 'contentCTN-right']})
]
remove_tags = [
dict(name='div', attrs={'class':['largeAdsCTN', 'contentCTN-left', 'textAdsCTN', 'footerAds clear']}),
dict(name='div', attrs={'id':['articleNav']})
]
remove_attributes = ['width','height','href']
feeds = [(u'Oriental Daily', u'http://orientaldaily.on.cc/rss/news.xml')]

View File

@@ -12,15 +12,18 @@ class PeterSchiff(BasicNewsRecipe):
description = 'Economic commentary'
publisher = 'Euro Pacific capital'
category = 'news, politics, economy, USA'
oldest_article = 15
oldest_article = 25
max_articles_per_feed = 200
no_stylesheets = True
encoding = 'cp1252'
encoding = 'utf8'
use_embedded_content = False
language = 'en'
country = 'US'
remove_empty_feeds = True
extra_css = ' body{font-family: Verdana,Times,serif } h1{text-align: left} img{margin-bottom: 0.4em} '
extra_css = """
body{font-family: Verdana,Times,serif }
.field-field-commentary-writer-name{font-weight: bold}
.field-items{display: inline}
"""
conversion_options = {
'comment' : description
@@ -30,7 +33,15 @@ class PeterSchiff(BasicNewsRecipe):
, 'linearize_tables' : True
}
keep_only_tags = [dict(name='tr',attrs={'style':'vertical-align: top;'})]
keep_only_tags = [
dict(name='h2',attrs={'id':'page-title'})
,dict(name='div',attrs={'class':'node'})
]
remove_tags = [
dict(name=['meta','link','base','iframe','embed'])
,dict(attrs={'id':'text-zoom'})
]
remove_attributes=['track','linktype','lang']
feeds = [(u'Articles', u'http://feeds.feedburner.com/PeterSchiffsEconomicCommentary')]

View File

@@ -31,7 +31,6 @@ class AdvancedUserRecipe1282101454(BasicNewsRecipe):
#The following will get rid of the Gallery: links when found
def preprocess_html(self, soup) :
print 'SOUP IS: ', soup
weblinks = soup.findAll(['head','h2'])
if weblinks is not None:
for link in weblinks:

View File

@@ -0,0 +1,110 @@
from calibre.web.feeds.news import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from BeautifulSoup import Tag
class RevistaMuyInteresante(BasicNewsRecipe):
title = 'Revista Muy Interesante'
__author__ = 'Jefferson Frantz'
description = 'Revista de divulgacion'
timefmt = ' [%d %b, %Y]'
language = 'es'
no_stylesheets = True
remove_javascript = True
extra_css = ' .txt_articulo{ font-family: sans-serif; font-size: medium; text-align: justify } .contentheading{font-family: serif; font-size: large; font-weight: bold; color: #000000; text-align: center}'
def preprocess_html(self, soup):
for item in soup.findAll(style=True):
del item['style']
for img_tag in soup.findAll('img'):
imagen = img_tag
new_tag = Tag(soup,'p')
img_tag.replaceWith(new_tag)
div = soup.find(attrs={'class':'article_category'})
div.insert(0,imagen)
break
return soup
preprocess_regexps = [
(re.compile(r'<td class="contentheading" width="100%">.*?</td>', re.DOTALL|re.IGNORECASE), lambda match: '<td class="contentheading">' + match.group().replace('<td class="contentheading" width="100%">','').strip().replace('</td>','').strip() + '</td>'),
]
keep_only_tags = [dict(name='div', attrs={'class':['article']}),dict(name='td', attrs={'class':['txt_articulo']})]
remove_tags = [
dict(name=['object','link','script','ul'])
,dict(name='div', attrs={'id':['comment']})
,dict(name='td', attrs={'class':['buttonheading']})
,dict(name='div', attrs={'class':['tags_articles']})
,dict(name='table', attrs={'class':['pagenav']})
]
remove_tags_after = dict(name='div', attrs={'class':'tags_articles'})
# To get the articles in a section
def nz_parse_section(self, url):
soup = self.index_to_soup(url)
div = soup.find(attrs={'class':'contenido'})
current_articles = []
for x in div.findAllNext(attrs={'class':['headline']}):
a = x.find('a', href=True)
if a is None:
continue
title = self.tag_to_string(a)
url = a.get('href', False)
if not url or not title:
continue
if url.startswith('/'):
url = 'http://www.muyinteresante.es'+url
# self.log('\t\tFound article:', title)
# self.log('\t\t\t', url)
current_articles.append({'title': title, 'url':url,
'description':'', 'date':''})
return current_articles
# To get the sections
def parse_index(self):
feeds = []
for title, url in [
('Historia',
'http://www.muyinteresante.es/historia-articulos'),
('Ciencia',
'http://www.muyinteresante.es/ciencia-articulos'),
('Naturaleza',
'http://www.muyinteresante.es/naturaleza-articulos'),
('Tecnología',
'http://www.muyinteresante.es/tecnologia-articulos'),
('Salud',
'http://www.muyinteresante.es/salud-articulos'),
('Más Muy',
'http://www.muyinteresante.es/muy'),
('Innova - Automoción',
'http://www.muyinteresante.es/articulos-innovacion-autos'),
('Innova - Salud',
'http://www.muyinteresante.es/articulos-innovacion-salud'),
('Innova - Medio Ambiente',
'http://www.muyinteresante.es/articulos-innovacion-medio-ambiente'),
('Innova - Alimentación',
'http://www.muyinteresante.es/articulos-innovacion-alimentacion'),
('Innova - Sociedad',
'http://www.muyinteresante.es/articulos-innovacion-sociedad'),
('Innova - Tecnología',
'http://www.muyinteresante.es/articulos-innovacion-tecnologia'),
('Innova - Ocio',
'http://www.muyinteresante.es/articulos-innovacion-ocio'),
]:
articles = self.nz_parse_section(url)
if articles:
feeds.append((title, articles))
return feeds

View File

@@ -0,0 +1,55 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = u'2010, Tomasz Dlugosz <tomek3d@gmail.com>'
'''
rmf24.pl
'''
import re
from calibre.web.feeds.news import BasicNewsRecipe
class RMF24_opinie(BasicNewsRecipe):
title = u'Rmf24.pl - Opinie'
description = u'Blogi, wywiady i komentarze ze strony rmf24.pl'
language = 'pl'
oldest_article = 7
max_articles_per_feed = 100
__author__ = u'Tomasz D\u0142ugosz'
no_stylesheets = True
remove_javascript = True
feeds = [(u'Blogi', u'http://www.rmf24.pl/opinie/blogi/feed'),
(u'Kontrwywiad', u'http://www.rmf24.pl/opinie/wywiady/kontrwywiad/feed'),
(u'Przes\u0142uchanie', u'http://www.rmf24.pl/opinie/wywiady/przesluchanie/feed'),
(u'Komentarze', u'http://www.rmf24.pl/opinie/komentarze/feed')]
keep_only_tags = [
dict(name='div', attrs={'class':'box articleSingle print'}),
dict(name='div', attrs={'class':'box articleSingle print singleCommentary'}),
dict(name='div', attrs={'class':'box articleSingle print blogSingleEntry'})]
remove_tags = [
dict(name='div', attrs={'class':'toTop'}),
dict(name='div', attrs={'class':'category'}),
dict(name='div', attrs={'class':'REMOVE'}),
dict(name='div', attrs={'class':'embed embedAd'})]
extra_css = '''
h1 { font-size: 1.2em; }
'''
# thanks to Kovid Goyal
def get_article_url(self, article):
link = article.get('link')
if '/audio,aId' not in link:
return link
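# ('/audio,aId' links fall through and return None, so calibre skips those items)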
preprocess_regexps = [
(re.compile(i[0], re.IGNORECASE | re.DOTALL), i[1]) for i in
[
(r'<h2>Zdj.cie</h2>', lambda match: ''),
(r'embed embed(Left|Right|Center) articleEmbed(Audio|Wideo articleEmbedVideo|ArticleFull|ArticleTitle|ArticleListTitle|AlbumHorizontal)">', lambda match: 'REMOVE">'),
(r'<a href="http://www.facebook.com/pages/RMF24pl/.*?>RMF24.pl</a> on Facebook</div>', lambda match: '</div>')
]
]

View File

@@ -0,0 +1,47 @@
__license__ = 'GPL v3'
__copyright__ = '2010, Darko Miletic <darko.miletic at gmail.com>'
'''
rusiahoy.com
'''
from calibre.web.feeds.news import BasicNewsRecipe
class RusiaHoy(BasicNewsRecipe):
title = 'Rusia Hoy'
__author__ = 'Darko Miletic'
description = 'Noticias de Rusia en castellano'
publisher = 'rusiahoy.com'
category = 'news, politics, Russia'
oldest_article = 7
max_articles_per_feed = 200
no_stylesheets = True
encoding = 'utf8'
use_embedded_content = False
language = 'es'
remove_empty_feeds = True
extra_css = """
body{font-family: Arial,sans-serif }
.article_article_title{font-size: xx-large; font-weight: bold}
.article_date{color: black; font-size: small}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
remove_tags = [dict(name=['meta','link','iframe','base','object','embed'])]
keep_only_tags=[ dict(attrs={'class':['article_rubric_title','article_date','article_article_title','article_article_lead']})
,dict(attrs={'class':'article_article_text'})
]
remove_attributes=['align','width','height']
feeds = [(u'Articulos', u'http://rusiahoy.com/xml/index.xml')]
def preprocess_html(self, soup):
for item in soup.findAll(style=True):
del item['style']
return soup

View File

@@ -0,0 +1,78 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
'''
sciencenews.org
'''
from calibre.web.feeds.news import BasicNewsRecipe
class ScienceNewsIssue(BasicNewsRecipe):
title = u'Science News Recent Issues'
__author__ = u'Darko Miletic, Sujata Raman and Starson17'
description = u'''Science News is an award-winning weekly
newsmagazine covering the most important research in all fields of science.
Its 16 pages each week are packed with short, accurate articles that appeal
to both general readers and scientists. Published since 1922, the magazine
now reaches about 150,000 subscribers and more than 1 million readers.
These are the latest News Items from Science News. This recipe downloads
the last 30 days worth of articles.'''
category = u'Science, Technology, News'
publisher = u'Society for Science & the Public'
oldest_article = 30
language = 'en'
max_articles_per_feed = 100
no_stylesheets = True
use_embedded_content = False
timefmt = ' [%A, %d %B, %Y]'
recursions = 1
remove_attributes = ['style']
conversion_options = {'linearize_tables' : True
, 'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
extra_css = '''
.content_description{font-family:georgia ;font-size:x-large; color:#646464 ; font-weight:bold;}
.content_summary{font-family:georgia ;font-size:small ;color:#585858 ; font-weight:bold;}
.content_authors{font-family:helvetica,arial ;font-size: xx-small ;color:#14487E ;}
.content_edition{font-family:helvetica,arial ;font-size: xx-small ;}
.exclusive{color:#FF0000 ;}
.anonymous{color:#14487E ;}
.content_content{font-family:helvetica,arial ;font-size: medium ; color:#000000;}
.description{color:#585858;font-family:helvetica,arial ;font-size: large ;}
.credit{color:#A6A6A6;font-family:helvetica,arial ;font-size: xx-small ;}
'''
keep_only_tags = [ dict(name='div', attrs={'id':'column_action'}) ]
remove_tags_after = dict(name='ul', attrs={'id':'content_functions_bottom'})
remove_tags = [
dict(name='ul', attrs={'id':'content_functions_bottom'})
,dict(name='div', attrs={'id':['content_functions_top','breadcrumb_content']})
,dict(name='img', attrs={'class':'icon'})
,dict(name='div', attrs={'class': 'embiggen'})
]
feeds = [(u"Science News Current Issues", u'http://www.sciencenews.org/view/feed/type/edition/name/issues.rss')]
match_regexps = [
r'www.sciencenews.org/view/feature/id/',
r'www.sciencenews.org/view/generic/id'
]
def get_cover_url(self):
cover_url = None
index = 'http://www.sciencenews.org/view/home'
soup = self.index_to_soup(index)
link_item = soup.find(name = 'img',alt = "issue")
if link_item:
cover_url = 'http://www.sciencenews.org' + link_item['src'] + '.jpg'
return cover_url
def preprocess_html(self, soup):
for tag in soup.findAll(name=['span']):
tag.name = 'div'
return soup

View File

@@ -28,7 +28,7 @@ class Sueddeutsche(BasicNewsRecipe):
"SKY_AD","NT1_AD","navbar1","sdesiteheader"]}),
dict(name='div', attrs={'class':["similar-article-box","artikelliste","nteaser301bg",
"pages closed","basebox right narrow"]}),
"pages closed","basebox right narrow","headslot galleried"]}),
dict(name='div', attrs={'class':["articleDistractor","listHeader","listHeader2","hr2",
"item","videoBigButton","articlefooter full-column",
@@ -38,10 +38,11 @@ class Sueddeutsche(BasicNewsRecipe):
dict(name='div', attrs={'style':["position:relative;"]}),
dict(name='span', attrs={'class':["nlinkheaderteaserschwarz","artikelLink","r10000000"]}),
dict(name='table', attrs={'class':["stoerBS","kommentare","footer","pageBoxBot","pageAktiv","bgcontent"]}),
dict(name='ul', attrs={'class':["breadcrumb","articles","activities","sitenav"]}),
dict(name='ul', attrs={'class':["breadcrumb","articles","activities","sitenav","actions"]}),
dict(name='td', attrs={'class':["artikelDruckenRight"]}),
dict(name='p', text = "ANZEIGE")
]
remove_tags_after = [dict(name='div', attrs={'class':["themenbox full-column"]})]
extra_css = '''
h2{font-family:Arial,Helvetica,sans-serif; font-size: x-small; color: #003399;}
@@ -70,9 +71,8 @@ class Sueddeutsche(BasicNewsRecipe):
(u'Reise', u'http://suche.sueddeutsche.de/query/reise/nav/%C2%A7ressort%3AReise/sort/-docdatetime?output=rss')
]
def print_version(self, url):
return url.replace('/text/', '/text/print.html')
main, sep, id = url.rpartition('/')
return main + '/2.220/' + id
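A short sketch of the URL mapping performed by the rewritten print_version (the article URL is illustrative):

url = 'http://www.sueddeutsche.de/politik/ein-artikel-1.101010'
main, sep, id = url.rpartition('/')
print(main + '/2.220/' + id)
# -> http://www.sueddeutsche.de/politik/2.220/ein-artikel-1.101010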

View File

@@ -7,6 +7,7 @@ class TechnologyReview(BasicNewsRecipe):
description = 'MIT Technology Magazine'
publisher = 'Technology Review Inc.'
category = 'Technology, Innovation, R&D'
language = 'en'
oldest_article = 14
max_articles_per_feed = 100
no_stylesheets = True

View File

@@ -9,15 +9,19 @@ theage.com.au
from calibre import strftime
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup
import re
class TheAge(BasicNewsRecipe):
title = 'The Age'
description = 'Business News, World News and Breaking News in Melbourne, Australia'
__author__ = 'Matthew Briggs'
language = 'en_AU'
title = 'The Age'
description = 'Business News, World News and Breaking News in Melbourne, Australia'
publication_type = 'newspaper'
__author__ = 'Matthew Briggs'
language = 'en_AU'
max_articles_per_feed = 1000
recursions = 0
remove_tags = [dict(name=['table', 'script', 'noscript', 'style']), dict(name='a', attrs={'href':'/'}), dict(name='a', attrs={'href':'/text/'})]
def get_browser(self):
br = BasicNewsRecipe.get_browser()
@@ -28,22 +32,22 @@ class TheAge(BasicNewsRecipe):
soup = BeautifulSoup(self.browser.open('http://www.theage.com.au/text/').read())
feeds, articles = [], []
feed = None
section = None
sections = {}
for tag in soup.findAll(['h3', 'a']):
if tag.name == 'h3':
if articles:
feeds.append((feed, articles))
articles = []
feed = self.tag_to_string(tag)
elif feed is not None and tag.has_key('href') and tag['href'].strip():
section = self.tag_to_string(tag)
sections[section] = []
# Make sure to skip: <a href="/">TheAge</a>
elif section and tag.has_key('href') and len(tag['href'].strip())>1:
url = tag['href'].strip()
if url.startswith('/'):
url = 'http://www.theage.com.au' + url
url = 'http://www.theage.com.au' + url
title = self.tag_to_string(tag)
articles.append({
sections[section].append({
'title': title,
'url' : url,
'date' : strftime('%a, %d %b'),
@@ -51,7 +55,58 @@ class TheAge(BasicNewsRecipe):
'content' : '',
})
feeds = []
# Insert feeds in specified order, if available
feedSort = [ 'National', 'World', 'Opinion', 'Columns', 'Business', 'Sport', 'Entertainment' ]
for i in feedSort:
if i in sections:
feeds.append((i,sections[i]))
# Done with the sorted feeds
for i in feedSort:
sections.pop(i, None) # guard: a sorted section may be absent from today's page
# Append what is left over...
for i in sections:
feeds.append((i,sections[i]))
return feeds
def get_cover_url(self):
soup = BeautifulSoup(self.browser.open('http://www.theage.com.au/todays-paper').read())
for i in soup.findAll('a'):
href = i['href']
if href and re.match('http://www.theage.com.au/frontpage/[0-9]+/[0-9]+/[0-9]+/frontpage.pdf',href):
return href
return None
def preprocess_html(self,soup):
for p in soup.findAll('p'):
# Collapse the paragraph by joining the non-tag contents
contents = [i for i in p.contents if isinstance(i,unicode)]
if len(contents):
contents = ''.join(contents)
# Filter out what's left of the text-mode navigation stuff
if re.match('((\s)|(\&nbsp\;))*\[[\|\s*]*\]((\s)|(\&nbsp\;))*$',contents):
p.extract()
continue
# Shrink the fine print font
if contents=='This material is subject to copyright and any unauthorised use, copying or mirroring is prohibited.':
p['style'] = 'font-size:small'
continue
return soup
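The navigation-residue regex above is easier to follow with a worked example; a minimal sketch:

import re
pat = '((\s)|(\&nbsp\;))*\[[\|\s*]*\]((\s)|(\&nbsp\;))*$'
print(bool(re.match(pat, '[ | | ]')))              # True: leftover text-mode navigation, extracted
print(bool(re.match(pat, 'An actual paragraph')))  # False: kept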

View File

@@ -16,7 +16,7 @@ class DailyTelegraph(BasicNewsRecipe):
language = 'en_AU'
oldest_article = 2
max_articles_per_feed = 20
max_articles_per_feed = 30
remove_javascript = True
no_stylesheets = True
encoding = 'utf8'
@@ -48,22 +48,24 @@ class DailyTelegraph(BasicNewsRecipe):
.caption{font-family:Trebuchet MS,Trebuchet,Helvetica,sans-serif; font-size: xx-small;}
'''
feeds = [(u'News', u'http://feeds.news.com.au/public/rss/2.0/aus_news_807.xml'),
feeds = [ (u'News', u'http://feeds.news.com.au/public/rss/2.0/aus_news_807.xml'),
(u'Opinion', u'http://feeds.news.com.au/public/rss/2.0/aus_opinion_58.xml'),
(u'Business', u'http://feeds.news.com.au/public/rss/2.0/aus_business_811.xml'),
(u'Media', u'http://feeds.news.com.au/public/rss/2.0/aus_media_57.xml'),
(u'Higher Education', u'http://feeds.news.com.au/public/rss/2.0/aus_higher_education_56.xml'),
(u'The Arts', u'http://feeds.news.com.au/public/rss/2.0/aus_arts_51.xml'),
(u'Commercial Property', u'http://feeds.news.com.au/public/rss/2.0/aus_business_commercial_property_708.xml'),
(u'The Nation', u'http://feeds.news.com.au/public/rss/2.0/aus_the_nation_62.xml'),
(u'Sport', u'http://feeds.news.com.au/public/rss/2.0/aus_sport_61.xml'),
(u'Travel', u'http://feeds.news.com.au/public/rss/2.0/aus_travel_and_indulgence_63.xml'),
(u'Defence', u'http://feeds.news.com.au/public/rss/2.0/aus_defence_54.xml'),
(u'Aviation', u'http://feeds.news.com.au/public/rss/2.0/aus_business_aviation_706.xml'),
(u'Mining', u'http://feeds.news.com.au/public/rss/2.0/aus_business_mining_704.xml'),
(u'World News', u'http://feeds.news.com.au/public/rss/2.0/aus_world_808.xml'),
(u'US Election', u'http://feeds.news.com.au/public/rss/2.0/aus_uselection_687.xml'),
(u'Climate', u'http://feeds.news.com.au/public/rss/2.0/aus_climate_809.xml'),
(u'Media', u'http://feeds.news.com.au/public/rss/2.0/aus_media_57.xml'),
(u'IT', u'http://feeds.news.com.au/public/rss/2.0/ausit_itnews_topstories_367.xml'),
(u'Exec Tech', u'http://feeds.news.com.au/public/rss/2.0/ausit_exec_topstories_385.xml'),
(u'Higher Education', u'http://feeds.news.com.au/public/rss/2.0/aus_higher_education_56.xml'),
(u'Arts', u'http://feeds.news.com.au/public/rss/2.0/aus_arts_51.xml'),
(u'Travel', u'http://feeds.news.com.au/public/rss/2.0/aus_travel_and_indulgence_63.xml'),
(u'Property', u'http://feeds.news.com.au/public/rss/2.0/aus_property_59.xml'),
(u'US Election', u'http://feeds.news.com.au/public/rss/2.0/aus_uselection_687.xml')]
(u'Sport', u'http://feeds.news.com.au/public/rss/2.0/aus_sport_61.xml'),
(u'Business', u'http://feeds.news.com.au/public/rss/2.0/aus_business_811.xml'),
(u'Aviation', u'http://feeds.news.com.au/public/rss/2.0/aus_business_aviation_706.xml'),
(u'Commercial Property', u'http://feeds.news.com.au/public/rss/2.0/aus_business_commercial_property_708.xml'),
(u'Mining', u'http://feeds.news.com.au/public/rss/2.0/aus_business_mining_704.xml')]
def get_article_url(self, article):
return article.id

View File

@@ -1,103 +1,106 @@
#!/usr/bin/env python
__license__ = 'GPL v3'
__copyright__ = '2008-2009, Darko Miletic <darko.miletic at gmail.com>'
__copyright__ = '2009-2010, Darko Miletic <darko.miletic at gmail.com>'
'''
timesonline.co.uk
www.thetimes.co.uk
'''
import re
import urllib
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag
class Timesonline(BasicNewsRecipe):
title = 'The Times Online'
__author__ = 'Darko Miletic and Sujata Raman'
description = 'UK news'
publisher = 'timesonline.co.uk'
category = 'news, politics, UK'
oldest_article = 2
max_articles_per_feed = 100
no_stylesheets = True
use_embedded_content = False
simultaneous_downloads = 1
encoding = 'ISO-8859-1'
remove_javascript = True
language = 'en_GB'
recursions = 9
match_regexps = [r'http://www.timesonline.co.uk/.*page=[2-9]']
class TimesOnline(BasicNewsRecipe):
title = 'The Times UK'
__author__ = 'Darko Miletic'
description = 'news from United Kingdom and World'
language = 'en_GB'
publisher = 'Times Newspapers Ltd'
category = 'news, politics, UK'
oldest_article = 3
max_articles_per_feed = 100
no_stylesheets = True
use_embedded_content = False
encoding = 'utf-8'
delay = 1
needs_subscription = True
publication_type = 'newspaper'
masthead_url = 'http://www.thetimes.co.uk/tto/public/img/the_times_460.gif'
INDEX = 'http://www.thetimes.co.uk'
PREFIX = u'http://www.thetimes.co.uk/tto/'
extra_css = """
.f-ha{font-size: xx-large; font-weight: bold}
.f-author{font-family: Arial,Helvetica,sans-serif}
.caption{font-size: small}
body{font-family: Georgia,"Times New Roman",Times,serif}
"""
conversion_options = {
'comment' : description
, 'tags' : category
, 'publisher' : publisher
, 'language' : language
}
preprocess_regexps = [(re.compile(r'<!--.*?-->', re.DOTALL), lambda m: '')]
keep_only_tags = [
dict(name='div', attrs= {'id':['region-column1and2-layout2']}),
{'class' : ['subheading']},
dict(name='div', attrs= {'id':['dynamic-image-holder']}),
dict(name='div', attrs= {'class':['article-author']}),
dict(name='div', attrs= {'id':['related-article-links']}),
def get_browser(self):
br = BasicNewsRecipe.get_browser()
br.open('http://www.timesplus.co.uk/tto/news/?login=false&url=http://www.thetimes.co.uk/tto/news/?lightbox=false')
if self.username is not None and self.password is not None:
data = urllib.urlencode({ 'userName':self.username
,'password':self.password
,'keepMeLoggedIn':'false'
})
br.open('https://www.timesplus.co.uk/iam/app/authenticate',data)
return br
remove_tags = [
dict(name=['object','link','iframe','base','meta'])
,dict(attrs={'class':'tto-counter' })
]
remove_attributes=['lang']
keep_only_tags = [
dict(attrs={'class':'heading' })
,dict(attrs={'class':'f-author'})
,dict(attrs={'id':'bodycopy'})
]
remove_tags = [
dict(name=['embed','object','form','iframe']),
dict(name='span', attrs = {'class':'float-left padding-left-8 padding-top-2'}),
dict(name='div', attrs= {'id':['region-footer','region-column2-layout2','grid-column4','login-status','comment-sort-order']}),
dict(name='div', attrs= {'class': ['debate-quote-container','clear','your-comment','float-left related-attachements-container','float-left padding-bottom-5 padding-top-8','puff-top']}),
dict(name='span', attrs = {'id': ['comment-count']}),
dict(name='ul',attrs = {'id': 'read-all-comments'}),
dict(name='a', attrs = {'class':'reg-bold'}),
]
extra_css = '''
.small{font-family :Arial,Helvetica,sans-serif; font-size:x-small;}
.byline{font-family :Arial,Helvetica,sans-serif; font-size:x-small; background:#F8F1D8;}
.color-666{font-family :Arial,Helvetica,sans-serif; font-size:x-small; color:#666666; }
h1{font-family:Georgia,Times New Roman,Times,serif;font-size:large; }
.color-999 {color:#999999;}
.x-small {font-size:x-small;}
#related-article-links{font-family :Arial,Helvetica,sans-serif; font-size:small;}
h2{color:#333333;font-family :Georgia,Times New Roman,Times,serif; font-size:small;}
p{font-family :Arial,Helvetica,sans-serif; font-size:small;}
'''
feeds = [
(u'Top stories from Times Online', u'http://www.timesonline.co.uk/tol/feeds/rss/topstories.xml' ),
('Latest Business News', 'http://www.timesonline.co.uk/tol/feeds/rss/business.xml'),
('Economics', 'http://www.timesonline.co.uk/tol/feeds/rss/economics.xml'),
('World News', 'http://www.timesonline.co.uk/tol/feeds/rss/worldnews.xml'),
('UK News', 'http://www.timesonline.co.uk/tol/feeds/rss/uknews.xml'),
('Travel News', 'http://www.timesonline.co.uk/tol/feeds/rss/travel.xml'),
('Sports News', 'http://www.timesonline.co.uk/tol/feeds/rss/sport.xml'),
('Film News', 'http://www.timesonline.co.uk/tol/feeds/rss/film.xml'),
('Tech news', 'http://www.timesonline.co.uk/tol/feeds/rss/tech.xml'),
('Literary Supplement', 'http://www.timesonline.co.uk/tol/feeds/rss/thetls.xml'),
]
def get_cover_url(self):
cover_url = None
index = 'http://www.timesonline.co.uk/tol/newspapers/'
soup = self.index_to_soup(index)
link_item = soup.find(name = 'div',attrs ={'class': "float-left margin-right-15"})
if link_item:
cover_url = link_item.img['src']
return cover_url
def get_article_url(self, article):
return article.get('guid', None)
feeds = [
(u'UK News' , PREFIX + u'news/uk/?view=list' )
,(u'World' , PREFIX + u'news/world/?view=list' )
,(u'Politics' , PREFIX + u'news/politics/?view=list')
,(u'Health' , PREFIX + u'health/news/?view=list' )
,(u'Education' , PREFIX + u'education/?view=list' )
,(u'Technology' , PREFIX + u'technology/?view=list' )
,(u'Science' , PREFIX + u'science/?view=list' )
,(u'Environment' , PREFIX + u'environment/?view=list' )
,(u'Faith' , PREFIX + u'faith/?view=list' )
,(u'Opinion' , PREFIX + u'opinion/?view=list' )
,(u'Sport' , PREFIX + u'sport/?view=list' )
,(u'Business' , PREFIX + u'business/?view=list' )
,(u'Money' , PREFIX + u'money/?view=list' )
,(u'Life' , PREFIX + u'life/?view=list' )
,(u'Arts' , PREFIX + u'arts/?view=list' )
]
def preprocess_html(self, soup):
soup.html['xml:lang'] = self.language
soup.html['lang'] = self.language
mlang = Tag(soup,'meta',[("http-equiv","Content-Language"),("content",self.language)])
mcharset = Tag(soup,'meta',[("http-equiv","Content-Type"),("content","text/html; charset=utf-8")])
soup.head.insert(0,mlang)
soup.head.insert(1,mcharset)
for item in soup.findAll(style=True):
del item['style']
return self.adeify_images(soup)
def postprocess_html(self,soup,first):
for tag in soup.findAll(text = ['Previous Page','Next Page']):
tag.extract()
return soup
def parse_index(self):
totalfeeds = []
lfeeds = self.get_feeds()
for feedobj in lfeeds:
feedtitle, feedurl = feedobj
self.report_progress(0, _('Fetching feed')+' %s...'%(feedtitle if feedtitle else feedurl))
articles = []
soup = self.index_to_soup(feedurl)
for item in soup.findAll('td', attrs={'class':'title'}):
atag = item.find('a')
url = self.INDEX + atag['href']
title = self.tag_to_string(atag)
articles.append({
'title' :title
,'date' :''
,'url' :url
,'description':''
})
totalfeeds.append((feedtitle, articles))
return totalfeeds

View File

@@ -21,16 +21,20 @@ class WashingtonPost(BasicNewsRecipe):
body{font-family:arial,helvetica,sans-serif}
'''
feeds = [ ('Today\'s Highlights', 'http://www.washingtonpost.com/wp-dyn/rss/linkset/2005/03/24/LI2005032400102.xml'),
('Politics', 'http://www.washingtonpost.com/wp-dyn/rss/politics/index.xml'),
('Nation', 'http://www.washingtonpost.com/wp-dyn/rss/nation/index.xml'),
('World', 'http://www.washingtonpost.com/wp-dyn/rss/world/index.xml'),
('Business', 'http://www.washingtonpost.com/wp-dyn/rss/business/index.xml'),
('Technology', 'http://www.washingtonpost.com/wp-dyn/rss/technology/index.xml'),
('Health', 'http://www.washingtonpost.com/wp-dyn/rss/health/index.xml'),
('Education', 'http://www.washingtonpost.com/wp-dyn/rss/education/index.xml'),
('Editorials', 'http://www.washingtonpost.com/wp-dyn/rss/linkset/2005/05/30/LI2005053000331.xml'),
]
feeds = [ ('Today\'s Highlights', 'http://www.washingtonpost.com/wp-dyn/rss/linkset/2005/03/24/LI2005032400102.xml'),
('Politics', 'http://www.washingtonpost.com/wp-dyn/rss/politics/index.xml'),
('Nation', 'http://www.washingtonpost.com/wp-dyn/rss/nation/index.xml'),
('World', 'http://www.washingtonpost.com/wp-dyn/rss/world/index.xml'),
('Business', 'http://www.washingtonpost.com/wp-dyn/rss/business/index.xml'),
('Technology', 'http://www.washingtonpost.com/wp-dyn/rss/technology/index.xml'),
('Health', 'http://www.washingtonpost.com/wp-dyn/rss/health/index.xml'),
('Education', 'http://www.washingtonpost.com/wp-dyn/rss/education/index.xml'),
('Style',
'http://www.washingtonpost.com/wp-dyn/rss/print/style/index.xml'),
('Sports',
'http://feeds.washingtonpost.com/wp-dyn/rss/linkset/2010/08/19/LI2010081904067_xml'),
('Editorials', 'http://www.washingtonpost.com/wp-dyn/rss/linkset/2005/05/30/LI2005053000331.xml'),
]
remove_tags = [{'id':['pfmnav', 'ArticleCommentsWrapper']}]

View File

@@ -55,6 +55,9 @@ class WikiNews(BasicNewsRecipe):
rest, sep, article_id = url.rpartition('/')
return 'http://en.wikinews.org/w/index.php?title=' + article_id + '&printable=yes'
def get_cover_url(self):
return 'http://upload.wikimedia.org/wikipedia/commons/b/bd/Wikinews-logo-en.png'
def preprocess_html(self, soup):
mtag = '<meta http-equiv="Content-Language" content="en"/><meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
soup.head.insert(0,mtag)

View File

@@ -1,5 +1,5 @@
" Project wide builtins
let g:pyflakes_builtins += ["dynamic_property", "__", "P", "I"]
let g:pyflakes_builtins += ["dynamic_property", "__", "P", "I", "lopen"]
python << EOFPY
import os

View File

@@ -63,7 +63,7 @@ class Check(Command):
description = 'Check for errors in the calibre source code'
BUILTINS = ['_', '__', 'dynamic_property', 'I', 'P']
BUILTINS = ['_', '__', 'dynamic_property', 'I', 'P', 'lopen']
CACHE = '.check-cache.pickle'
def get_files(self, cache):

View File

@@ -123,7 +123,7 @@ class VMInstaller(Command):
subprocess.check_call(['scp',
self.VM_NAME+':build/calibre/'+installer, 'dist'])
if not os.path.exists(installer):
self.warn('Failed to download installer')
self.warn('Failed to download installer: '+installer)
raise SystemExit(1)
def clean(self):

View File

@@ -6,9 +6,9 @@ __license__ = 'GPL v3'
__copyright__ = '2009, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import os, re, cStringIO, base64, httplib, subprocess
import os, re, cStringIO, base64, httplib, subprocess, hashlib, shutil
from subprocess import check_call
from tempfile import NamedTemporaryFile
from tempfile import NamedTemporaryFile, mkdtemp
from setup import Command, __version__, installer_name, __appname__
@@ -331,5 +331,19 @@ class UploadToServer(Command):
%(__version__, DOWNLOADS), shell=True)
check_call('ssh divok /etc/init.d/apache2 graceful',
shell=True)
tdir = mkdtemp()
for installer in installers():
if not os.path.exists(installer):
continue
with open(installer, 'rb') as f:
raw = f.read()
fingerprint = hashlib.sha512(raw).hexdigest()
fname = os.path.basename(installer+'.sha512')
with open(os.path.join(tdir, fname), 'wb') as f:
f.write(fingerprint)
check_call('scp %s/*.sha512 divok:%s/signatures/' % (tdir, DOWNLOADS),
shell=True)
shutil.rmtree(tdir)
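Given the .sha512 files generated above, a downloaded installer can be checked in a few lines; a minimal sketch (the file name is illustrative):

import hashlib
installer = 'calibre-0.7.23.tar.gz'  # illustrative name
with open(installer, 'rb') as f:
    raw = f.read()
with open(installer + '.sha512', 'rb') as f:
    expected = f.read().strip()
print('OK' if hashlib.sha512(raw).hexdigest() == expected else 'MISMATCH')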

View File

@@ -455,6 +455,24 @@ def prepare_string_for_xml(raw, attribute=False):
def isbytestring(obj):
return isinstance(obj, (str, bytes))
def force_unicode(obj, enc=preferred_encoding):
if isbytestring(obj):
try:
obj = obj.decode(enc)
except:
try:
obj = obj.decode(filesystem_encoding if enc ==
preferred_encoding else preferred_encoding)
except:
try:
obj = obj.decode('utf-8')
except:
obj = repr(obj)
if isbytestring(obj):
obj = obj.decode('utf-8')
return obj
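A usage sketch for force_unicode, assuming it is importable from the top-level calibre package like the surrounding helpers:

from calibre import force_unicode  # assumption: this helper lives in calibre/__init__.py
print(force_unicode(b'caf\xe9'))   # bytes go through the preferred/filesystem/utf-8
                                   # fallback chain; the result depends on local encodings
print(force_unicode(u'caf\xe9'))   # unicode input passes through unchanged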
def human_readable(size):
""" Convert a size in bytes into a human readable form """
divisor, suffix = 1, "B"

View File

@@ -2,7 +2,7 @@ __license__ = 'GPL v3'
__copyright__ = '2008, Kovid Goyal kovid@kovidgoyal.net'
__docformat__ = 'restructuredtext en'
__appname__ = 'calibre'
__version__ = '0.7.20'
__version__ = '0.7.23'
__author__ = "Kovid Goyal <kovid@kovidgoyal.net>"
import re

View File

@@ -218,7 +218,7 @@ class MetadataReaderPlugin(Plugin): # {{{
with the input data.
:param type: The type of file. Guaranteed to be one of the entries
in :attr:`file_types`.
:return: A :class:`calibre.ebooks.metadata.MetaInformation` object
:return: A :class:`calibre.ebooks.metadata.book.Metadata` object
'''
return None
# }}}
@@ -248,7 +248,7 @@ class MetadataWriterPlugin(Plugin): # {{{
with the input data.
:param type: The type of file. Guaranteed to be one of the entries
in :attr:`file_types`.
:param mi: A :class:`calibre.ebooks.metadata.MetaInformation` object
:param mi: A :class:`calibre.ebooks.metadata.book.Metadata` object
'''
pass

View File

@@ -226,8 +226,7 @@ class OPFMetadataReader(MetadataReaderPlugin):
def get_metadata(self, stream, ftype):
from calibre.ebooks.metadata.opf2 import OPF
from calibre.ebooks.metadata import MetaInformation
return MetaInformation(OPF(stream, os.getcwd()))
return OPF(stream, os.getcwd()).to_book_metadata()
class PDBMetadataReader(MetadataReaderPlugin):
@@ -448,7 +447,7 @@ from calibre.devices.eb600.driver import EB600, COOL_ER, SHINEBOOK, \
BOOQ, ELONEX, POCKETBOOK301, MENTOR
from calibre.devices.iliad.driver import ILIAD
from calibre.devices.irexdr.driver import IREXDR1000, IREXDR800
from calibre.devices.jetbook.driver import JETBOOK, MIBUK
from calibre.devices.jetbook.driver import JETBOOK, MIBUK, JETBOOK_MINI
from calibre.devices.kindle.driver import KINDLE, KINDLE2, KINDLE_DX
from calibre.devices.nook.driver import NOOK
from calibre.devices.prs505.driver import PRS505
@@ -462,7 +461,8 @@ from calibre.devices.hanvon.driver import N516, EB511, ALEX, AZBOOKA, THEBOOK
from calibre.devices.edge.driver import EDGE
from calibre.devices.teclast.driver import TECLAST_K3, NEWSMY, IPAPYRUS, SOVOS
from calibre.devices.sne.driver import SNE
from calibre.devices.misc import PALMPRE, AVANT, SWEEX, PDNOVEL, KOGAN, GEMEI
from calibre.devices.misc import PALMPRE, AVANT, SWEEX, PDNOVEL, KOGAN, \
GEMEI, VELOCITYMICRO, PDNOVEL_KOBO
from calibre.devices.folder_device.driver import FOLDER_DEVICE_FOR_CONFIG
from calibre.devices.kobo.driver import KOBO
@@ -523,6 +523,7 @@ plugins += [
IREXDR1000,
IREXDR800,
JETBOOK,
JETBOOK_MINI,
MIBUK,
SHINEBOOK,
POCKETBOOK360,
@@ -574,6 +575,8 @@ plugins += [
PDNOVEL,
SPECTRA,
GEMEI,
VELOCITYMICRO,
PDNOVEL_KOBO,
ITUNES,
]
plugins += [x for x in list(locals().values()) if isinstance(x, type) and \
@@ -799,6 +802,17 @@ class Sending(PreferencesPlugin):
description = _('Control how calibre transfers files to your '
'ebook reader')
class Plugboard(PreferencesPlugin):
name = 'Plugboard'
icon = I('plugboard.png')
gui_name = _('Metadata plugboards')
category = 'Import/Export'
gui_category = _('Import/Export')
category_order = 3
name_order = 4
config_widget = 'calibre.gui2.preferences.plugboard'
description = _('Change metadata fields before saving/sending')
class Email(PreferencesPlugin):
name = 'Email'
icon = I('mail.png')
@@ -859,8 +873,8 @@ class Misc(PreferencesPlugin):
description = _('Miscellaneous advanced configuration')
plugins += [LookAndFeel, Behavior, Columns, Toolbar, InputOptions,
CommonOptions, OutputOptions, Adding, Saving, Sending, Email, Server,
Plugins, Tweaks, Misc]
CommonOptions, OutputOptions, Adding, Saving, Sending, Plugboard,
Email, Server, Plugins, Tweaks, Misc]
#}}}

View File

@@ -255,6 +255,9 @@ class OutputProfile(Plugin):
#: Unsupported unicode characters to be replaced during preprocessing
unsupported_unicode_chars = []
#: Number of ems that the left margin of a blockquote is rendered as
mobi_ems_per_blockquote = 1.0
@classmethod
def tags_to_string(cls, tags):
return escape(', '.join(tags))
@@ -564,6 +567,7 @@ class KindleOutput(OutputProfile):
supports_mobi_indexing = True
periodical_date_in_title = False
ratings_char = u'\u2605'
mobi_ems_per_blockquote = 2.0
@classmethod
def tags_to_string(cls, tags):
@@ -582,6 +586,7 @@ class KindleDXOutput(OutputProfile):
comic_screen_size = (741, 1022)
supports_mobi_indexing = True
periodical_date_in_title = False
mobi_ems_per_blockquote = 2.0
@classmethod
def tags_to_string(cls, tags):

View File

@@ -36,11 +36,17 @@ Run an embedded python interpreter.
'plugin code.')
parser.add_option('--reinitialize-db', default=None,
help='Re-initialize the sqlite calibre database at the '
'specified path. Useful to recover from db corruption.')
'specified path. Useful to recover from db corruption.'
' You can also specify the path to an SQL dump which '
'will be used instead of trying to dump the database.'
' This can be useful when dumping fails, but dumping '
'with sqlite3 works.')
parser.add_option('-p', '--py-console', help='Run python console',
default=False, action='store_true')
return parser
def reinit_db(dbpath, callback=None):
def reinit_db(dbpath, callback=None, sql_dump=None):
if not os.path.exists(dbpath):
raise ValueError(dbpath + ' does not exist')
from calibre.library.sqlite import connect
@@ -50,26 +56,32 @@ def reinit_db(dbpath, callback=None):
uv = conn.get('PRAGMA user_version;', all=False)
conn.execute('PRAGMA writable_schema=ON')
conn.commit()
sql_lines = conn.dump()
if sql_dump is None:
sql_lines = conn.dump()
else:
sql_lines = open(sql_dump, 'rb').read()
conn.close()
dest = dbpath + '.tmp'
try:
with closing(connect(dest, False)) as nconn:
nconn.execute('create temporary table temp_sequence(id INTEGER PRIMARY KEY AUTOINCREMENT)')
nconn.commit()
if callable(callback):
callback(len(sql_lines), True)
for i, line in enumerate(sql_lines):
try:
nconn.execute(line)
except:
import traceback
prints('SQL line %r failed with error:'%line)
prints(traceback.format_exc())
continue
finally:
if callable(callback):
callback(i, False)
if sql_dump is None:
if callable(callback):
callback(len(sql_lines), True)
for i, line in enumerate(sql_lines):
try:
nconn.execute(line)
except:
import traceback
prints('SQL line %r failed with error:'%line)
prints(traceback.format_exc())
continue
finally:
if callable(callback):
callback(i, False)
else:
nconn.executescript(sql_lines)
nconn.execute('pragma user_version=%d'%int(uv))
nconn.commit()
os.remove(dbpath)
@@ -148,6 +160,9 @@ def main(args=sys.argv):
if len(args) > 1:
vargs.append(args[-1])
main(vargs)
elif opts.py_console:
from calibre.utils.pyconsole.main import main
main()
elif opts.command:
sys.argv = args[:1]
exec opts.command
@@ -165,7 +180,10 @@ def main(args=sys.argv):
prints('CALIBRE_EXTENSIONS_PATH='+sys.extensions_location)
prints('CALIBRE_PYTHON_PATH='+os.pathsep.join(sys.path))
elif opts.reinitialize_db is not None:
reinit_db(opts.reinitialize_db)
sql_dump = None
if len(args) > 1 and os.access(args[-1], os.R_OK):
sql_dump = args[-1]
reinit_db(opts.reinitialize_db, sql_dump=sql_dump)
else:
from calibre import ipython
ipython()
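A minimal sketch of driving the recovery path above directly, mirroring calibre-debug --reinitialize-db with an SQL dump (the module path is assumed from context):

from calibre.debug import reinit_db  # assumption: this file is calibre/debug.py
# Rebuild a corrupted database from a dump made with the sqlite3 command line tool.
reinit_db('/path/to/metadata.db', sql_dump='/path/to/dump.sql')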

View File

@@ -56,6 +56,7 @@ def get_connected_device():
return dev
def debug(ioreg_to_tmp=False, buf=None):
import textwrap
from calibre.customize.ui import device_plugins
from calibre.devices.scanner import DeviceScanner, win_pnp_drives
from calibre.constants import iswindows, isosx, __version__
@@ -95,13 +96,19 @@ def debug(ioreg_to_tmp=False, buf=None):
ioreg += 'Output from osx_get_usb_drives:\n'+drives+'\n\n'
ioreg += Device.run_ioreg()
connected_devices = []
for dev in sorted(device_plugins(), cmp=lambda
x,y:cmp(x.__class__.__name__, y.__class__.__name__)):
out('Looking for', dev.__class__.__name__)
devplugins = list(sorted(device_plugins(), cmp=lambda
x,y:cmp(x.__class__.__name__, y.__class__.__name__)))
out('Available plugins:', textwrap.fill(' '.join([x.__class__.__name__ for x in
devplugins])))
out(' ')
out('Looking for devices...')
for dev in devplugins:
connected, det = s.is_device_connected(dev, debug=True)
if connected:
out('\t\tDetected possible device', dev.__class__.__name__)
connected_devices.append((dev, det))
out(' ')
errors = {}
success = False
out('Devices possibly connected:', end=' ')

View File

@@ -13,12 +13,12 @@ from calibre.devices.errors import UserFeedback
from calibre.devices.usbms.deviceconfig import DeviceConfig
from calibre.devices.interface import DevicePlugin
from calibre.ebooks.BeautifulSoup import BeautifulSoup
from calibre.ebooks.metadata import MetaInformation, authors_to_string
from calibre.ebooks.metadata import authors_to_string, MetaInformation
from calibre.ebooks.metadata.book.base import Metadata
from calibre.ebooks.metadata.epub import set_metadata
from calibre.library.server.utils import strftime
from calibre.utils.config import config_dir
from calibre.utils.config import config_dir, prefs
from calibre.utils.date import isoformat, now, parse_date
from calibre.utils.localization import get_lang
from calibre.utils.logging import Log
from calibre.utils.zipfile import ZipFile
@@ -67,6 +67,8 @@ class ITUNES(DriverBase):
Delete:
delete_books()
remove_books_from_metadata()
use_plugboard_ext()
set_plugboard()
sync_booklists()
card_prefix()
free_space()
@@ -75,6 +77,8 @@ class ITUNES(DriverBase):
set_progress_reporter()
upload_books()
add_books_to_metadata()
use_plugboard_ext()
set_plugboard()
set_progress_reporter()
sync_booklists()
card_prefix()
@@ -105,6 +109,9 @@ class ITUNES(DriverBase):
PRODUCT_ID = [0x1292,0x1293,0x1294,0x1297,0x1299,0x129a]
BCD = [0x01]
# Plugboard ID
DEVICE_PLUGBOARD_NAME = 'APPLE'
# iTunes enumerations
Audiobooks = [
'Audible file',
@@ -163,6 +170,7 @@ class ITUNES(DriverBase):
# Properties
cached_books = {}
cache_dir = os.path.join(config_dir, 'caches', 'itunes')
calibre_library_path = prefs['library_path']
archive_path = os.path.join(cache_dir, "thumbs.zip")
description_prefix = "added by calibre"
ejected = False
@@ -172,6 +180,8 @@ class ITUNES(DriverBase):
log = Log()
manual_sync_mode = False
path_template = 'iTunes/%s - %s.%s'
plugboards = None
plugboard_func = None
problem_titles = []
problem_msg = None
report_progress = None
@@ -249,6 +259,8 @@ class ITUNES(DriverBase):
self.report_progress(1.0, _('Updating device metadata listing...'))
# Add new books to booklists[0]
# Charles thinks this should be
# for new_book in metadata[0]:
for new_book in locations[0]:
if DEBUG:
self.log.info(" adding '%s' by '%s' to booklists[0]" %
@@ -813,6 +825,15 @@ class ITUNES(DriverBase):
'''
self.report_progress = report_progress
def set_plugboards(self, plugboards, pb_func):
# This method is called with the plugboard that matches the format
# declared in use_plugboard_ext and a device name of ITUNES
if DEBUG:
self.log.info("ITUNES.set_plugboard()")
#self.log.info(' using plugboard %s' % plugboards)
self.plugboards = plugboards
self.plugboard_func = pb_func
def sync_booklists(self, booklists, end_session=True):
'''
Update metadata on device.
@@ -871,7 +892,7 @@ class ITUNES(DriverBase):
once uploaded to the device. len(names) == len(files)
:return: A list of 3-element tuples. The list is meant to be passed
to L{add_books_to_metadata}.
:metadata: If not None, it is a list of :class:`MetaInformation` objects.
:metadata: If not None, it is a list of :class:`Metadata` objects.
The idea is to use the metadata to determine where on the device to
put the book. len(metadata) == len(files). Apart from the regular
cover (path to cover), there may also be a thumbnail attribute, which should
@@ -976,7 +997,6 @@ class ITUNES(DriverBase):
self._dump_cached_books(header="after upload_books()",indent=2)
return (new_booklist, [], [])
# Private methods
def _add_device_book(self,fpath, metadata):
'''
@@ -1190,6 +1210,10 @@ class ITUNES(DriverBase):
except:
self.problem_titles.append("'%s' by %s" % (metadata.title, metadata.author[0]))
self.log.error(" error scaling '%s' for '%s'" % (metadata.cover,metadata.title))
import traceback
traceback.print_exc()
return thumb
if isosx:
@@ -1255,7 +1279,10 @@ class ITUNES(DriverBase):
self.problem_titles.append("'%s' by %s" % (metadata.title, metadata.author[0]))
self.log.error(" error converting '%s' to thumb for '%s'" % (metadata.cover,metadata.title))
finally:
zfw.close()
try:
zfw.close()
except:
pass
else:
if DEBUG:
self.log.info(" no cover defined in metadata for '%s'" % metadata.title)
@@ -1272,10 +1299,10 @@ class ITUNES(DriverBase):
this_book.db_id = None
this_book.device_collections = []
this_book.format = format
this_book.library_id = lb_added
this_book.library_id = lb_added # ??? GR
this_book.path = path
this_book.thumbnail = thumb
this_book.iTunes_id = lb_added
this_book.iTunes_id = lb_added # ??? GR
this_book.uuid = metadata.uuid
if isosx:
@@ -1321,8 +1348,8 @@ class ITUNES(DriverBase):
plist = None
if plist:
if DEBUG:
self.log.info(" _delete_iTunesMetadata_plist():")
self.log.info(" deleting '%s'\n from '%s'" % (pl_name,fpath))
self.log.info(" _delete_iTunesMetadata_plist():")
self.log.info(" deleting '%s'\n from '%s'" % (pl_name,fpath))
zf.delete(pl_name)
zf.close()
@@ -2212,6 +2239,7 @@ class ITUNES(DriverBase):
(self.iTunes.name(), self.iTunes.version(), self.initial_status,
self.version[0],self.version[1],self.version[2]))
self.log.info(" iTunes_media: %s" % self.iTunes_media)
self.log.info(" calibre_library_path: %s" % self.calibre_library_path)
if iswindows:
'''
@@ -2265,6 +2293,7 @@ class ITUNES(DriverBase):
(self.iTunes.Windows[0].name, self.iTunes.Version, self.initial_status,
self.version[0],self.version[1],self.version[2]))
self.log.info(" iTunes_media: %s" % self.iTunes_media)
self.log.info(" calibre_library_path: %s" % self.calibre_library_path)
def _purge_orphans(self,library_books, cached_books):
'''
@@ -2367,7 +2396,8 @@ class ITUNES(DriverBase):
'''
iTunes does not delete books from storage when removing from database
We only want to delete stored copies if the file is stored in iTunes
We don't want to delete files stored outside of iTunes
We don't want to delete files stored outside of iTunes.
Also confirm that storage_path does not point into calibre's storage.
'''
if DEBUG:
self.log.info(" ITUNES._remove_from_iTunes():")
@@ -2375,7 +2405,8 @@ class ITUNES(DriverBase):
if isosx:
try:
storage_path = os.path.split(cached_book['lib_book'].location().path)
if cached_book['lib_book'].location().path.startswith(self.iTunes_media):
if cached_book['lib_book'].location().path.startswith(self.iTunes_media) and \
not storage_path[0].startswith(prefs['library_path']):
title_storage_path = storage_path[0]
if DEBUG:
self.log.info(" removing title_storage_path: %s" % title_storage_path)
@@ -2426,7 +2457,8 @@ class ITUNES(DriverBase):
path = book.Location
if book:
if self.iTunes_media and path.startswith(self.iTunes_media):
if self.iTunes_media and path.startswith(self.iTunes_media) and \
not path.startswith(prefs['library_path']):
storage_path = os.path.split(path)
if DEBUG:
self.log.info(" removing '%s' at %s" %
@@ -2453,11 +2485,17 @@ class ITUNES(DriverBase):
if DEBUG:
self.log.info(" unable to remove '%s' from iTunes" % cached_book['title'])
def title_sorter(self, title):
return re.sub('^\s*A\s+|^\s*The\s+|^\s*An\s+', '', title).rstrip()
def _update_epub_metadata(self, fpath, metadata):
'''
'''
self.log.info(" ITUNES._update_epub_metadata()")
# Fetch plugboard updates
metadata_x = self._xform_metadata_via_plugboard(metadata, 'epub')
# Refresh epub metadata
with open(fpath,'r+b') as zfo:
# Touch the OPF timestamp
@@ -2489,9 +2527,14 @@ class ITUNES(DriverBase):
self.log.info(" add timestamp: %s" % metadata.timestamp)
# Force the language declaration for iBooks 1.1
metadata.language = get_lang().replace('_', '-')
#metadata.language = get_lang().replace('_', '-')
# Updates from metadata plugboard (ignoring publisher)
metadata.language = metadata_x.language
if DEBUG:
self.log.info(" rewriting language: <dc:language>%s</dc:language>" % metadata.language)
if metadata.language != metadata_x.language:
self.log.info(" rewriting language: <dc:language>%s</dc:language>" % metadata.language)
zf_opf.close()
@@ -2569,75 +2612,97 @@ class ITUNES(DriverBase):
if DEBUG:
self.log.info(" ITUNES._update_iTunes_metadata()")
strip_tags = re.compile(r'<[^<]*?/?>')
STRIP_TAGS = re.compile(r'<[^<]*?/?>')
# Update metadata from plugboard
# If self.plugboard is None (no transforms), original metadata is returned intact
metadata_x = self._xform_metadata_via_plugboard(metadata, this_book.format)
if isosx:
if lb_added:
lb_added.album.set(metadata.title)
lb_added.artist.set(authors_to_string(metadata.authors))
lb_added.composer.set(metadata.uuid)
lb_added.name.set(metadata_x.title)
lb_added.album.set(metadata_x.title)
lb_added.artist.set(authors_to_string(metadata_x.authors))
lb_added.composer.set(metadata_x.uuid)
lb_added.description.set("%s %s" % (self.description_prefix,strftime('%Y-%m-%d %H:%M:%S')))
lb_added.enabled.set(True)
lb_added.sort_artist.set(metadata.author_sort.title())
lb_added.sort_name.set(this_book.title_sorter)
if this_book.format == 'pdf':
lb_added.name.set(metadata.title)
lb_added.sort_artist.set(metadata_x.author_sort.title())
lb_added.sort_name.set(metadata.title_sort)
if db_added:
db_added.album.set(metadata.title)
db_added.artist.set(authors_to_string(metadata.authors))
db_added.composer.set(metadata.uuid)
db_added.name.set(metadata_x.title)
db_added.album.set(metadata_x.title)
db_added.artist.set(authors_to_string(metadata_x.authors))
db_added.composer.set(metadata_x.uuid)
db_added.description.set("%s %s" % (self.description_prefix,strftime('%Y-%m-%d %H:%M:%S')))
db_added.enabled.set(True)
db_added.sort_artist.set(metadata.author_sort.title())
db_added.sort_name.set(this_book.title_sorter)
if this_book.format == 'pdf':
db_added.name.set(metadata.title)
db_added.sort_artist.set(metadata_x.author_sort.title())
db_added.sort_name.set(metadata.title_sort)
if metadata.comments:
if metadata_x.comments:
if lb_added:
lb_added.comment.set(strip_tags.sub('',metadata.comments))
lb_added.comment.set(STRIP_TAGS.sub('',metadata_x.comments))
if db_added:
db_added.comment.set(strip_tags.sub('',metadata.comments))
db_added.comment.set(STRIP_TAGS.sub('',metadata_x.comments))
if metadata.rating:
if metadata_x.rating:
if lb_added:
lb_added.rating.set(metadata.rating*10)
lb_added.rating.set(metadata_x.rating*10)
# iBooks currently doesn't allow setting rating ... ?
try:
if db_added:
db_added.rating.set(metadata.rating*10)
db_added.rating.set(metadata_x.rating*10)
except:
pass
# Set genre from series if available, else first alpha tag
# Otherwise iTunes grabs the first dc:subject from the opf metadata
if metadata.series and self.settings().read_metadata:
# self.settings().read_metadata is used as a surrogate for "Use Series name as Genre"
if metadata_x.series and self.settings().read_metadata:
if DEBUG:
self.log.info(" ITUNES._update_iTunes_metadata()")
self.log.info(" using Series name as Genre")
# Format the index as a sort key
index = metadata.series_index
index = metadata_x.series_index
integer = int(index)
fraction = index-integer
series_index = '%04d%s' % (integer, str('%0.4f' % fraction).lstrip('0'))
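# Worked example (sketch): series_index 3.25 -> integer 3, fraction .25;
# '%04d' yields '0003' and str('%0.4f' % .25).lstrip('0') yields '.2500',
# so the sort key is '0003.2500', ordering books numerically within a series.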
if lb_added:
lb_added.sort_name.set("%s %s" % (metadata.series, series_index))
lb_added.genre.set(metadata.series)
lb_added.episode_ID.set(metadata.series)
lb_added.episode_number.set(metadata.series_index)
lb_added.sort_name.set("%s %s" % (self.title_sorter(metadata_x.series), series_index))
lb_added.episode_ID.set(metadata_x.series)
lb_added.episode_number.set(metadata_x.series_index)
# If no plugboard transform applied to tags, change the Genre/Category to Series
if metadata.tags == metadata_x.tags:
lb_added.genre.set(self.title_sorter(metadata_x.series))
else:
for tag in metadata_x.tags:
if self._is_alpha(tag[0]):
lb_added.genre.set(tag)
break
if db_added:
db_added.sort_name.set("%s %s" % (metadata.series, series_index))
db_added.genre.set(metadata.series)
db_added.episode_ID.set(metadata.series)
db_added.episode_number.set(metadata.series_index)
db_added.sort_name.set("%s %s" % (self.title_sorter(metadata_x.series), series_index))
db_added.episode_ID.set(metadata_x.series)
db_added.episode_number.set(metadata_x.series_index)
elif metadata.tags:
# If no plugboard transform applied to tags, change the Genre/Category to Series
if metadata.tags == metadata_x.tags:
db_added.genre.set(self.title_sorter(metadata_x.series))
else:
for tag in metadata_x.tags:
if self._is_alpha(tag[0]):
db_added.genre.set(tag)
break
elif metadata_x.tags is not None:
if DEBUG:
self.log.info(" %susing Tag as Genre" %
"no Series name available, " if self.settings().read_metadata else '')
for tag in metadata.tags:
for tag in metadata_x.tags:
if self._is_alpha(tag[0]):
if lb_added:
lb_added.genre.set(tag)
@@ -2647,40 +2712,38 @@ class ITUNES(DriverBase):
elif iswindows:
if lb_added:
lb_added.Album = metadata.title
lb_added.Artist = authors_to_string(metadata.authors)
lb_added.Composer = metadata.uuid
lb_added.Name = metadata_x.title
lb_added.Album = metadata_x.title
lb_added.Artist = authors_to_string(metadata_x.authors)
lb_added.Composer = metadata_x.uuid
lb_added.Description = ("%s %s" % (self.description_prefix,strftime('%Y-%m-%d %H:%M:%S')))
lb_added.Enabled = True
lb_added.SortArtist = (metadata.author_sort.title())
lb_added.SortName = (this_book.title_sorter)
if this_book.format == 'pdf':
lb_added.Name = metadata.title
lb_added.SortArtist = metadata_x.author_sort.title()
lb_added.SortName = metadata.title_sort
if db_added:
db_added.Album = metadata.title
db_added.Artist = authors_to_string(metadata.authors)
db_added.Composer = metadata.uuid
db_added.Name = metadata_x.title
db_added.Album = metadata_x.title
db_added.Artist = authors_to_string(metadata_x.authors)
db_added.Composer = metadata_x.uuid
db_added.Description = ("%s %s" % (self.description_prefix,strftime('%Y-%m-%d %H:%M:%S')))
db_added.Enabled = True
db_added.SortArtist = (metadata.author_sort.title())
db_added.SortName = (this_book.title_sorter)
if this_book.format == 'pdf':
db_added.Name = metadata.title
db_added.SortArtist = metadata_x.author_sort.title()
db_added.SortName = metadata.title_sort
if metadata.comments:
if metadata_x.comments:
if lb_added:
lb_added.Comment = (strip_tags.sub('',metadata.comments))
lb_added.Comment = (STRIP_TAGS.sub('',metadata_x.comments))
if db_added:
db_added.Comment = (strip_tags.sub('',metadata.comments))
db_added.Comment = (STRIP_TAGS.sub('',metadata_x.comments))
if metadata.rating:
if metadata_x.rating:
if lb_added:
lb_added.AlbumRating = (metadata.rating*10)
lb_added.AlbumRating = (metadata_x.rating*10)
# iBooks currently doesn't allow setting rating ... ?
try:
if db_added:
db_added.AlbumRating = (metadata.rating*10)
db_added.AlbumRating = (metadata_x.rating*10)
except:
if DEBUG:
self.log.warning(" iTunes automation interface reported an error"
@@ -2690,36 +2753,54 @@ class ITUNES(DriverBase):
# Otherwise iBooks uses first <dc:subject> from opf
# iTunes balks on setting EpisodeNumber, but it sticks (9.1.1.12)
if metadata.series and self.settings().read_metadata:
if metadata_x.series and self.settings().read_metadata:
if DEBUG:
self.log.info(" using Series name as Genre")
# Format the index as a sort key
index = metadata.series_index
index = metadata_x.series_index
integer = int(index)
fraction = index-integer
series_index = '%04d%s' % (integer, str('%0.4f' % fraction).lstrip('0'))
if lb_added:
lb_added.SortName = "%s %s" % (metadata.series, series_index)
lb_added.Genre = metadata.series
lb_added.EpisodeID = metadata.series
lb_added.SortName = "%s %s" % (self.title_sorter(metadata_x.series), series_index)
lb_added.EpisodeID = metadata_x.series
try:
lb_added.EpisodeNumber = metadata.series_index
lb_added.EpisodeNumber = metadata_x.series_index
except:
pass
# If no plugboard transform applied to tags, change the Genre/Category to Series
if metadata.tags == metadata_x.tags:
lb_added.Genre = self.title_sorter(metadata_x.series)
else:
for tag in metadata_x.tags:
if self._is_alpha(tag[0]):
lb_added.Genre = tag
break
if db_added:
db_added.SortName = "%s %s" % (metadata.series, series_index)
db_added.Genre = metadata.series
db_added.EpisodeID = metadata.series
db_added.SortName = "%s %s" % (self.title_sorter(metadata_x.series), series_index)
db_added.EpisodeID = metadata_x.series
try:
db_added.EpisodeNumber = metadata.series_index
db_added.EpisodeNumber = metadata_x.series_index
except:
if DEBUG:
self.log.warning(" iTunes automation interface reported an error"
" setting EpisodeNumber on iDevice")
elif metadata.tags:
# If no plugboard transform applied to tags, change the Genre/Category to Series
if metadata.tags == metadata_x.tags:
db_added.Genre = self.title_sorter(metadata_x.series)
else:
for tag in metadata_x.tags:
if self._is_alpha(tag[0]):
db_added.Genre = tag
break
elif metadata_x.tags is not None:
if DEBUG:
self.log.info(" using Tag as Genre")
for tag in metadata.tags:
for tag in metadata_x.tags:
if self._is_alpha(tag[0]):
if lb_added:
lb_added.Genre = tag
@ -2727,6 +2808,36 @@ class ITUNES(DriverBase):
db_added.Genre = tag
break
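# Worked example of the tag fallback above (assuming _is_alpha() tests for an
# alphabetic first character): tags [u'2010', u'Fiction'] yield the Genre
# u'Fiction', the first tag that does not start with a digit or symbol.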
def _xform_metadata_via_plugboard(self, book, format):
''' Transform book metadata from plugboard templates '''
if DEBUG:
self.log.info(" ITUNES._update_metadata_from_plugboard()")
if self.plugboard_func:
pb = self.plugboard_func(self.DEVICE_PLUGBOARD_NAME, format, self.plugboards)
newmi = book.deepcopy_metadata()
newmi.template_to_attribute(book, pb)
if DEBUG:
self.log.info(" transforming %s using %s:" % (format, pb))
self.log.info(" title: %s %s" % (book.title, ">>> %s" %
newmi.title if book.title != newmi.title else ''))
self.log.info(" title_sort: %s %s" % (book.title_sort, ">>> %s" %
newmi.title_sort if book.title_sort != newmi.title_sort else ''))
self.log.info(" authors: %s %s" % (book.authors, ">>> %s" %
newmi.authors if book.authors != newmi.authors else ''))
self.log.info(" author_sort: %s %s" % (book.author_sort, ">>> %s" %
newmi.author_sort if book.author_sort != newmi.author_sort else ''))
self.log.info(" language: %s %s" % (book.language, ">>> %s" %
newmi.language if book.language != newmi.language else ''))
self.log.info(" publisher: %s %s" % (book.publisher, ">>> %s" %
newmi.publisher if book.publisher != newmi.publisher else ''))
self.log.info(" tags: %s %s" % (book.tags, ">>> %s" %
newmi.tags if book.tags != newmi.tags else ''))
else:
newmi = book
return newmi
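# Usage sketch (the call sites are outside this hunk, so the exact form is
# assumed, not taken from this change):
#   metadata_x = self._xform_metadata_via_plugboard(metadata, this_book.format)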
class ITUNES_ASYNC(ITUNES):
'''
This subclass allows the user to interact directly with iTunes via a menu option
@ -2737,6 +2848,9 @@ class ITUNES_ASYNC(ITUNES):
icon = I('devices/itunes.png')
description = _('Communicate with iTunes.')
# Plugboard ID
DEVICE_PLUGBOARD_NAME = 'APPLE'
connected = False
def __init__(self,path):
@ -3008,18 +3122,12 @@ class BookList(list):
'''
return {}
class Book(MetaInformation):
class Book(Metadata):
'''
A simple class describing a book in the iTunes Books Library.
- See ebooks.metadata.__init__ for all fields
See ebooks.metadata.book.base
'''
def __init__(self,title,author):
MetaInformation.__init__(self, title, authors=[author])
Metadata.__init__(self, title, authors=[author])
@dynamic_property
def title_sorter(self):
doc = '''String to sort the title. If absent, title is returned'''
def fget(self):
return re.sub('^\s*A\s+|^\s*The\s+|^\s*An\s+', '', self.title).rstrip()
return property(doc=doc, fget=fget)

@ -16,10 +16,12 @@ class FOLDER_DEVICE_FOR_CONFIG(USBMS):
description = _('Use an arbitrary folder as a device.')
author = 'John Schember/Charles Haley'
supported_platforms = ['windows', 'osx', 'linux']
FORMATS = ['epub', 'fb2', 'mobi', 'azw', 'lrf', 'tcr', 'pmlz', 'lit', 'rtf', 'rb', 'pdf', 'oeb', 'txt', 'pdb']
FORMATS = ['epub', 'fb2', 'mobi', 'azw', 'lrf', 'tcr', 'pmlz', 'lit',
'rtf', 'rb', 'pdf', 'oeb', 'txt', 'pdb', 'prc']
VENDOR_ID = 0xffff
PRODUCT_ID = 0xffff
BCD = 0xffff
DEVICE_PLUGBOARD_NAME = 'FOLDER_DEVICE'
class FOLDER_DEVICE(USBMS):
@ -30,15 +32,16 @@ class FOLDER_DEVICE(USBMS):
description = _('Use an arbitrary folder as a device.')
author = 'John Schember/Charles Haley'
supported_platforms = ['windows', 'osx', 'linux']
FORMATS = ['epub', 'fb2', 'mobi', 'azw', 'lrf', 'tcr', 'pmlz', 'lit', 'rtf', 'rb', 'pdf', 'oeb', 'txt', 'pdb']
FORMATS = FOLDER_DEVICE_FOR_CONFIG.FORMATS
VENDOR_ID = 0xffff
PRODUCT_ID = 0xffff
BCD = 0xffff
DEVICE_PLUGBOARD_NAME = 'FOLDER_DEVICE'
THUMBNAIL_HEIGHT = 68 # Height for thumbnails on device
CAN_SET_METADATA = True
CAN_SET_METADATA = ['title', 'authors']
SUPPORTS_SUB_DIRS = True
#: Icon for this device

@ -7,7 +7,7 @@ __docformat__ = 'restructuredtext en'
'''
Device driver for Hanvon devices
'''
import re
import re, os
from calibre.devices.usbms.driver import USBMS
@ -59,18 +59,59 @@ class ALEX(N516):
description = _('Communicate with the SpringDesign Alex eBook reader.')
author = 'Kovid Goyal'
FORMATS = ['epub', 'pdf']
FORMATS = ['epub', 'fb2', 'pdf']
VENDOR_NAME = 'ALEX'
WINDOWS_MAIN_MEM = 'READER'
MAIN_MEMORY_VOLUME_LABEL = 'Alex Internal Memory'
EBOOK_DIR_MAIN = 'eBooks'
SUPPORTS_SUB_DIRS = True
SUPPORTS_SUB_DIRS = False
THUMBNAIL_HEIGHT = 120
def can_handle(self, device_info, debug=False):
return is_alex(device_info)
def alex_cpath(self, file_abspath):
base = os.path.dirname(file_abspath)
name = os.path.splitext(os.path.basename(file_abspath))[0] + '.png'
return os.path.join(base, 'covers', name)
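# e.g. file_abspath '/media/alex/eBooks/book.epub' maps to the sidecar cover
# '/media/alex/eBooks/covers/book.png' (mount point hypothetical)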
def upload_cover(self, path, filename, metadata):
from calibre.ebooks import calibre_cover
from calibre.utils.magick.draw import thumbnail
coverdata = getattr(metadata, 'thumbnail', None)
if coverdata and coverdata[2]:
cover = coverdata[2]
else:
cover = calibre_cover(metadata.get('title', _('Unknown')),
metadata.get('authors', _('Unknown')))
cover = thumbnail(cover, width=self.THUMBNAIL_HEIGHT,
height=self.THUMBNAIL_HEIGHT, fmt='png')[-1]
cpath = self.alex_cpath(os.path.join(path, filename))
cdir = os.path.dirname(cpath)
if not os.path.exists(cdir):
os.makedirs(cdir)
with open(cpath, 'wb') as coverfile:
coverfile.write(cover)
def delete_books(self, paths, end_session=True):
for i, path in enumerate(paths):
self.report_progress((i+1) / float(len(paths)), _('Removing books from device...'))
path = self.normalize_path(path)
if os.path.exists(path):
# Delete the ebook
os.unlink(path)
try:
cpath = self.alex_cpath(path)
if os.path.exists(cpath):
os.remove(cpath)
except:
pass
self.report_progress(1.0, _('Removing books from device...'))
class AZBOOKA(ALEX):
name = 'Azbooka driver'
@ -83,10 +124,13 @@ class AZBOOKA(ALEX):
MAIN_MEMORY_VOLUME_LABEL = 'Azbooka Internal Memory'
EBOOK_DIR_MAIN = ''
SUPPORTS_SUB_DIRS = True
def can_handle(self, device_info, debug=False):
return not is_alex(device_info)
def upload_cover(self, path, filename, metadata):
pass
class EB511(USBMS):
name = 'Elonex EB 511 driver'

@ -37,7 +37,7 @@ class DevicePlugin(Plugin):
THUMBNAIL_HEIGHT = 68
#: Whether the metadata on books can be set via the GUI.
CAN_SET_METADATA = True
CAN_SET_METADATA = ['title', 'authors', 'collections']
#: Path separator for paths to books on device
path_sep = os.sep
@ -316,7 +316,7 @@ class DevicePlugin(Plugin):
being uploaded to the device.
:param names: A list of file names that the books should have
once uploaded to the device. len(names) == len(files)
:param metadata: If not None, it is a list of :class:`MetaInformation` objects.
:param metadata: If not None, it is a list of :class:`Metadata` objects.
The idea is to use the metadata to determine where on the device to
put the book. len(metadata) == len(files). Apart from the regular
cover (path to cover), there may also be a thumbnail attribute, which should
@ -335,7 +335,7 @@ class DevicePlugin(Plugin):
the device.
:param locations: Result of a call to L{upload_books}
:param metadata: List of :class:`MetaInformation` objects, same as for
:param metadata: List of :class:`Metadata` objects, same as for
:meth:`upload_books`.
:param booklists: A tuple containing the result of calls to
(:meth:`books(oncard=None)`,
@ -411,6 +411,24 @@ class DevicePlugin(Plugin):
'''
raise NotImplementedError()
def set_plugboards(self, plugboards, pb_func):
'''
Provide the driver with the current set of plugboards and a function to
select a specific plugboard. This method is called immediately before
add_books and sync_booklists.
pb_func is a callable with the following signature::
def pb_func(device_name, format, plugboards)
You give it the current device name (either the class name or
DEVICE_PLUGBOARD_NAME), the format you are interested in (a 'real'
format or 'device_db'), and the plugboards (the same object passed to
this set_plugboards call).
:return: None or a single plugboard instance.
'''
pass
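# A minimal sketch of a pb_func implementation, assuming plugboards is a
# mapping of format -> device name -> plugboard (the exact structure is not
# shown in this hunk):
#   def pb_func(device_name, format, plugboards):
#       return plugboards.get(format, {}).get(device_name, None)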
class BookList(list):
'''

@ -20,7 +20,7 @@ class IREXDR1000(USBMS):
# Ordered list of supported formats
# Be sure these have an entry in calibre.devices.mime
FORMATS = ['epub', 'mobi', 'prc', 'html', 'pdf', 'txt']
FORMATS = ['epub', 'mobi', 'prc', 'html', 'pdf', 'djvu', 'txt']
VENDOR_ID = [0x1e6b]
PRODUCT_ID = [0x001]

@ -99,4 +99,30 @@ class MIBUK(USBMS):
VENDOR_NAME = 'LINUX'
WINDOWS_MAIN_MEM = 'WOLDERMIBUK'
class JETBOOK_MINI(USBMS):
'''
['0x4b8',
'0x507',
'0x100',
'ECTACO',
'ECTACO ATA/ATAPI Bridge (Bulk-Only)',
'Rev.0.20']
'''
FORMATS = ['fb2', 'txt']
gui_name = 'JetBook Mini'
name = 'JetBook Mini Device Interface'
description = _('Communicate with the JetBook Mini reader.')
author = 'Kovid Goyal'
VENDOR_ID = [0x4b8]
PRODUCT_ID = [0x507]
BCD = [0x100]
VENDOR_NAME = 'ECTACO'
WINDOWS_MAIN_MEM = '' # Matches PROD_
MAIN_MEMORY_VOLUME_LABEL = 'Jetbook Mini'
SUPPORTS_SUB_DIRS = True

@ -4,37 +4,15 @@ __copyright__ = '2010, Timothy Legge <timlegge at gmail.com>'
'''
import os
import re
import time
from calibre.ebooks.metadata import MetaInformation
from calibre.constants import filesystem_encoding, preferred_encoding
from calibre import isbytestring
from calibre.devices.usbms.books import Book as Book_
class Book(MetaInformation):
class Book(Book_):
BOOK_ATTRS = ['lpath', 'size', 'mime', 'device_collections', '_new_book']
JSON_ATTRS = [
'lpath', 'title', 'authors', 'mime', 'size', 'tags', 'author_sort',
'title_sort', 'comments', 'category', 'publisher', 'series',
'series_index', 'rating', 'isbn', 'language', 'application_id',
'book_producer', 'lccn', 'lcc', 'ddc', 'rights', 'publication_type',
'uuid', 'device_collections',
]
def __init__(self, prefix, lpath, title, authors, mime, date, ContentType, thumbnail_name, size=None, other=None):
MetaInformation.__init__(self, '')
self.device_collections = []
self._new_book = False
self.path = os.path.join(prefix, lpath)
if os.sep == '\\':
self.path = self.path.replace('/', '\\')
self.lpath = lpath.replace('\\', '/')
else:
self.lpath = lpath
def __init__(self, prefix, lpath, title, authors, mime, date, ContentType,
thumbnail_name, size=None, other=None):
Book_.__init__(self, prefix, lpath)
self.title = title
if not authors:
@ -63,57 +41,7 @@ class Book(MetaInformation):
if other:
self.smart_update(other)
def __eq__(self, other):
return self.path == getattr(other, 'path', None)
@dynamic_property
def db_id(self):
doc = '''The database id in the application database that this file corresponds to'''
def fget(self):
match = re.search(r'_(\d+)$', self.lpath.rpartition('.')[0])
if match:
return int(match.group(1))
return None
return property(fget=fget, doc=doc)
@dynamic_property
def title_sorter(self):
doc = '''String to sort the title. If absent, title is returned'''
def fget(self):
return re.sub('^\s*A\s+|^\s*The\s+|^\s*An\s+', '', self.title).rstrip()
return property(doc=doc, fget=fget)
@dynamic_property
def thumbnail(self):
return None
def smart_update(self, other, replace_metadata=False):
'''
Merge the information in C{other} into self. In case of conflicts, the information
in C{other} takes precedence, unless the information in C{other} is NULL.
'''
MetaInformation.smart_update(self, other)
for attr in self.BOOK_ATTRS:
if hasattr(other, attr):
val = getattr(other, attr, None)
setattr(self, attr, val)
def to_json(self):
json = {}
for attr in self.JSON_ATTRS:
val = getattr(self, attr)
if isbytestring(val):
enc = filesystem_encoding if attr == 'lpath' else preferred_encoding
val = val.decode(enc, 'replace')
elif isinstance(val, (list, tuple)):
val = [x.decode(preferred_encoding, 'replace') if
isbytestring(x) else x for x in val]
json[attr] = val
return json
class ImageWrapper(object):
def __init__(self, image_path):
self.image_path = image_path

@ -30,7 +30,7 @@ class KOBO(USBMS):
# Ordered list of supported formats
FORMATS = ['epub', 'pdf']
CAN_SET_METADATA = True
CAN_SET_METADATA = ['collections']
VENDOR_ID = [0x2237]
PRODUCT_ID = [0x4161]
@ -150,7 +150,7 @@ class KOBO(USBMS):
changed = False
for i, row in enumerate(cursor):
# self.report_progress((i+1) / float(numrows), _('Getting list of books on device...'))
path = self.path_from_contentid(row[3], row[5], oncard)
mime = mime_type_ext(path_to_ext(row[3]))
@ -325,8 +325,9 @@ class KOBO(USBMS):
book = Book(prefix, lpath, '', '', '', '', '', '', other=info)
if book.size is None:
book.size = os.stat(self.normalize_path(path)).st_size
book._new_book = True # Must be before add_book
booklists[blist].add_book(book, replace_metadata=True)
b = booklists[blist].add_book(book, replace_metadata=True)
if b:
b._new_book = True
self.report_progress(1.0, _('Adding books to device metadata listing...'))
def contentid_from_path(self, path, ContentType):

@ -108,6 +108,34 @@ class PDNOVEL(USBMS):
with open('%s.jpg' % os.path.join(path, filename), 'wb') as coverfile:
coverfile.write(coverdata[2])
class PDNOVEL_KOBO(PDNOVEL):
name = 'Pandigital Kobo device interface'
gui_name = 'PD Novel (Kobo)'
description = _('Communicate with the Pandigital Novel')
BCD = [0x222]
EBOOK_DIR_MAIN = 'eBooks/Kobo'
class VELOCITYMICRO(USBMS):
name = 'VelocityMicro device interface'
gui_name = 'VelocityMicro'
description = _('Communicate with the VelocityMicro')
author = 'Kovid Goyal'
supported_platforms = ['windows', 'linux', 'osx']
FORMATS = ['epub', 'pdb', 'txt', 'html', 'pdf']
VENDOR_ID = [0x18d1]
PRODUCT_ID = [0xb015]
BCD = [0x224]
VENDOR_NAME = 'ANDROID'
WINDOWS_MAIN_MEM = WINDOWS_CARD_A_MEM = '__UMS_COMPOSITE'
EBOOK_DIR_MAIN = 'eBooks'
SUPPORTS_SUB_DIRS = False
class GEMEI(USBMS):
name = 'Gemei Device Interface'
gui_name = 'GM2000'

@ -27,7 +27,7 @@ class PRS505(USBMS):
FORMATS = ['epub', 'lrf', 'lrx', 'rtf', 'pdf', 'txt']
CAN_SET_METADATA = True
CAN_SET_METADATA = ['title', 'authors', 'collections']
VENDOR_ID = [0x054c] #: SONY Vendor Id
PRODUCT_ID = [0x031e]
@ -63,6 +63,9 @@ class PRS505(USBMS):
'series, tags, authors'
EXTRA_CUSTOMIZATION_DEFAULT = ', '.join(['series', 'tags'])
plugboard = None
plugboard_func = None
def windows_filter_pnp_id(self, pnp_id):
return '_LAUNCHER' in pnp_id
@ -150,7 +153,12 @@ class PRS505(USBMS):
else:
collections = []
debug_print('PRS505: collection fields:', collections)
c.update(blists, collections)
pb = None
if self.plugboard_func:
pb = self.plugboard_func(self.__class__.__name__,
'device_db', self.plugboards)
debug_print('PRS505: use plugboards', pb)
c.update(blists, collections, pb)
c.write()
USBMS.sync_booklists(self, booklists, end_session=end_session)
@ -163,3 +171,6 @@ class PRS505(USBMS):
c.write()
debug_print('PRS505: finished rebuild_collections')
def set_plugboards(self, plugboards, pb_func):
self.plugboards = plugboards
self.plugboard_func = pb_func

@ -325,12 +325,6 @@ class XMLCache(object):
for book in bl:
record = lpath_map.get(book.lpath, None)
if record is not None:
title = record.get('title', None)
if title is not None and title != book.title:
debug_print('Renaming title', book.title, 'to', title)
book.title = title
# Don't set the author, because the reader strips all but
# the first author.
for thumbnail in record.xpath(
'descendant::*[local-name()="thumbnail"]'):
for img in thumbnail.xpath(
@ -350,7 +344,7 @@ class XMLCache(object):
# }}}
# Update XML from JSON {{{
def update(self, booklists, collections_attributes):
def update(self, booklists, collections_attributes, plugboard):
debug_print('Starting update', collections_attributes)
use_tz_var = False
for i, booklist in booklists.items():
@ -365,8 +359,14 @@ class XMLCache(object):
record = lpath_map.get(book.lpath, None)
if record is None:
record = self.create_text_record(root, i, book.lpath)
if plugboard is not None:
newmi = book.deepcopy_metadata()
newmi.template_to_attribute(book, plugboard)
newmi.set('_new_book', getattr(book, '_new_book', False))
else:
newmi = book
(gtz_count, ltz_count, use_tz_var) = \
self.update_text_record(record, book, path, i,
self.update_text_record(record, newmi, path, i,
gtz_count, ltz_count, use_tz_var)
# Ensure the collections in the XML database are recorded for
# this book

@ -6,29 +6,18 @@ __docformat__ = 'restructuredtext en'
import os, re, time, sys
from calibre.ebooks.metadata import MetaInformation
from calibre.ebooks.metadata.book.base import Metadata
from calibre.devices.mime import mime_type_ext
from calibre.devices.interface import BookList as _BookList
from calibre.constants import filesystem_encoding, preferred_encoding
from calibre.constants import preferred_encoding
from calibre import isbytestring
from calibre.utils.config import prefs
class Book(MetaInformation):
BOOK_ATTRS = ['lpath', 'size', 'mime', 'device_collections', '_new_book']
JSON_ATTRS = [
'lpath', 'title', 'authors', 'mime', 'size', 'tags', 'author_sort',
'title_sort', 'comments', 'category', 'publisher', 'series',
'series_index', 'rating', 'isbn', 'language', 'application_id',
'book_producer', 'lccn', 'lcc', 'ddc', 'rights', 'publication_type',
'uuid',
]
from calibre.utils.config import prefs, tweaks
class Book(Metadata):
def __init__(self, prefix, lpath, size=None, other=None):
from calibre.ebooks.metadata.meta import path_to_ext
MetaInformation.__init__(self, '')
Metadata.__init__(self, '')
self._new_book = False
self.device_collections = []
@ -72,32 +61,6 @@ class Book(MetaInformation):
def thumbnail(self):
return None
def smart_update(self, other, replace_metadata=False):
'''
Merge the information in C{other} into self. In case of conflicts, the information
in C{other} takes precedence, unless the information in C{other} is NULL.
'''
MetaInformation.smart_update(self, other, replace_metadata)
for attr in self.BOOK_ATTRS:
if hasattr(other, attr):
val = getattr(other, attr, None)
setattr(self, attr, val)
def to_json(self):
json = {}
for attr in self.JSON_ATTRS:
val = getattr(self, attr)
if isbytestring(val):
enc = filesystem_encoding if attr == 'lpath' else preferred_encoding
val = val.decode(enc, 'replace')
elif isinstance(val, (list, tuple)):
val = [x.decode(preferred_encoding, 'replace') if
isbytestring(x) else x for x in val]
json[attr] = val
return json
class BookList(_BookList):
def __init__(self, oncard, prefix, settings):
@ -108,17 +71,21 @@ class BookList(_BookList):
return False
def add_book(self, book, replace_metadata):
'''
Add the book to the booklist, if needed. Return None if the book is
already there and not updated, otherwise return the book.
'''
try:
b = self.index(book)
except (ValueError, IndexError):
b = None
if b is None:
self.append(book)
return True
return book
if replace_metadata:
self[b].smart_update(book, replace_metadata=True)
return True
return False
return self[b]
return None
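# Callers rely on the new return value to flag genuinely new entries, as in
# the USBMS and KOBO code elsewhere in this change:
#   b = booklists[blist].add_book(book, replace_metadata=True)
#   if b:
#       b._new_book = True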
def remove_book(self, book):
self.remove(book)
@ -131,11 +98,30 @@ class CollectionsBookList(BookList):
def supports_collections(self):
return True
def compute_category_name(self, attr, category, field_meta):
renames = tweaks['sony_collection_renaming_rules']
attr_name = renames.get(attr, None)
if attr_name is None:
if field_meta['is_custom']:
attr_name = '(%s)'%field_meta['name']
else:
attr_name = ''
elif attr_name != '':
attr_name = '(%s)'%attr_name
cat_name = '%s %s'%(category, attr_name)
return cat_name.strip()
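# Worked examples, assuming tweaks['sony_collection_renaming_rules'] is
# {'series': 'Series'}: a series value 'Foo' becomes the collection
# 'Foo (Series)'; a tag 'SF' (standard field, no rule) stays 'SF'; the same
# value from a custom column named 'Genre' becomes 'SF (Genre)'.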
def get_collections(self, collection_attributes):
from calibre.devices.usbms.driver import debug_print
debug_print('Starting get_collections:', prefs['manage_device_metadata'])
debug_print('Renaming rules:', tweaks['sony_collection_renaming_rules'])
# Complexity: we can use renaming rules only when using automatic
# management. Otherwise we don't always have the metadata to make the
# right decisions
use_renaming_rules = prefs['manage_device_metadata'] == 'on_connect'
collections = {}
series_categories = set([])
# This map of sets is used to avoid linear searches when testing for
# book equality
collections_lpaths = {}
@ -163,39 +149,72 @@ class CollectionsBookList(BookList):
attrs = collection_attributes
for attr in attrs:
attr = attr.strip()
val = getattr(book, attr, None)
# If attr is device_collections, then we cannot use
# format_field, because we don't know the fields where the
# values came from.
if attr == 'device_collections':
doing_dc = True
val = book.device_collections # is a list
else:
doing_dc = False
ign, val, orig_val, fm = book.format_field_extended(attr)
if not val: continue
if isbytestring(val):
val = val.decode(preferred_encoding, 'replace')
if isinstance(val, (list, tuple)):
val = list(val)
elif isinstance(val, unicode):
elif fm['datatype'] == 'series':
val = [orig_val]
elif fm['datatype'] == 'text' and fm['is_multiple']:
val = orig_val
else:
val = [val]
for category in val:
if attr == 'tags' and len(category) > 1 and \
category[0] == '[' and category[-1] == ']':
is_series = False
if doing_dc:
# Attempt to determine if this value is a series by
# comparing it to the series name.
if category == book.series:
is_series = True
elif fm['is_custom']: # is a custom field
if fm['datatype'] == 'text' and len(category) > 1 and \
category[0] == '[' and category[-1] == ']':
continue
if fm['datatype'] == 'series':
is_series = True
else: # is a standard field
if attr == 'tags' and len(category) > 1 and \
category[0] == '[' and category[-1] == ']':
continue
if attr == 'series' or \
('series' in collection_attributes and
book.get('series', None) == category):
is_series = True
if use_renaming_rules:
cat_name = self.compute_category_name(attr, category, fm)
else:
cat_name = category
if cat_name not in collections:
collections[cat_name] = []
collections_lpaths[cat_name] = set()
if lpath in collections_lpaths[cat_name]:
continue
if category not in collections:
collections[category] = []
collections_lpaths[category] = set()
if lpath not in collections_lpaths[category]:
collections_lpaths[category].add(lpath)
collections[category].append(book)
if attr == 'series' or \
('series' in collection_attributes and
getattr(book, 'series', None) == category):
series_categories.add(category)
collections_lpaths[cat_name].add(lpath)
if is_series:
collections[cat_name].append(
(book, book.get(attr+'_index', sys.maxint)))
else:
collections[cat_name].append(
(book, book.get('title_sort', 'zzzz')))
# Sort collections
result = {}
for category, books in collections.items():
def tgetter(x):
return getattr(x, 'title_sort', 'zzzz')
books.sort(cmp=lambda x,y:cmp(tgetter(x), tgetter(y)))
if category in series_categories:
# Ensures books are sub sorted by title
def getter(x):
return getattr(x, 'series_index', sys.maxint)
books.sort(cmp=lambda x,y:cmp(getter(x), getter(y)))
return collections
books.sort(cmp=lambda x,y:cmp(x[1], y[1]))
result[category] = [x[0] for x in books]
return result
def rebuild_collections(self, booklist, oncard):
'''

@ -829,12 +829,14 @@ class Device(DeviceConfig, DevicePlugin):
ext = os.path.splitext(fname)[1]
from calibre.library.save_to_disk import get_components
from calibre.library.save_to_disk import config
opts = config().parse()
if not isinstance(template, unicode):
template = template.decode('utf-8')
app_id = str(getattr(mdata, 'application_id', ''))
# The db id will be in the created filename
extra_components = get_components(template, mdata, fname,
length=250-len(app_id)-1)
timefmt=opts.send_timefmt, length=250-len(app_id)-1)
if not extra_components:
extra_components.append(sanitize(self.filename_callback(fname,
mdata)))

@ -13,7 +13,6 @@ for a particular device.
import os
import re
import time
import json
from itertools import cycle
from calibre import prints, isbytestring
@ -21,6 +20,7 @@ from calibre.constants import filesystem_encoding, DEBUG
from calibre.devices.usbms.cli import CLI
from calibre.devices.usbms.device import Device
from calibre.devices.usbms.books import BookList, Book
from calibre.ebooks.metadata.book.json_codec import JsonCodec
BASE_TIME = None
def debug_print(*args):
@ -50,7 +50,7 @@ class USBMS(CLI, Device):
book_class = Book
FORMATS = []
CAN_SET_METADATA = False
CAN_SET_METADATA = []
METADATA_CACHE = 'metadata.calibre'
def get_device_information(self, end_session=True):
@ -242,8 +242,9 @@ class USBMS(CLI, Device):
book = self.book_class(prefix, lpath, other=info)
if book.size is None:
book.size = os.stat(self.normalize_path(path)).st_size
book._new_book = True # Must be before add_book
booklists[blist].add_book(book, replace_metadata=True)
b = booklists[blist].add_book(book, replace_metadata=True)
if b:
b._new_book = True
self.report_progress(1.0, _('Adding books to device metadata listing...'))
debug_print('USBMS: finished adding metadata')
@ -288,6 +289,7 @@ class USBMS(CLI, Device):
# at the end just before the return
def sync_booklists(self, booklists, end_session=True):
debug_print('USBMS: starting sync_booklists')
json_codec = JsonCodec()
if not os.path.exists(self.normalize_path(self._main_prefix)):
os.makedirs(self.normalize_path(self._main_prefix))
@ -296,10 +298,8 @@ class USBMS(CLI, Device):
if prefix is not None and isinstance(booklists[listid], self.booklist_class):
if not os.path.exists(prefix):
os.makedirs(self.normalize_path(prefix))
js = [item.to_json() for item in booklists[listid] if
hasattr(item, 'to_json')]
with open(self.normalize_path(os.path.join(prefix, self.METADATA_CACHE)), 'wb') as f:
f.write(json.dumps(js, indent=2, encoding='utf-8'))
json_codec.encode_to_file(f, booklists[listid])
write_prefix(self._main_prefix, 0)
write_prefix(self._card_a_prefix, 1)
write_prefix(self._card_b_prefix, 2)
@ -345,19 +345,13 @@ class USBMS(CLI, Device):
@classmethod
def parse_metadata_cache(cls, bl, prefix, name):
# bl = cls.booklist_class()
js = []
json_codec = JsonCodec()
need_sync = False
cache_file = cls.normalize_path(os.path.join(prefix, name))
if os.access(cache_file, os.R_OK):
try:
with open(cache_file, 'rb') as f:
js = json.load(f, encoding='utf-8')
for item in js:
book = cls.book_class(prefix, item.get('lpath', None))
for key in item.keys():
setattr(book, key, item[key])
bl.append(book)
json_codec.decode_from_file(f, bl, cls.book_class, prefix)
except:
import traceback
traceback.print_exc()
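# The JsonCodec round trip that replaces the hand-rolled JSON above (calls as
# used in this change; sketch only):
#   json_codec = JsonCodec()
#   with open(cache_file, 'wb') as f:
#       json_codec.encode_to_file(f, booklist)       # sync_booklists
#   with open(cache_file, 'rb') as f:
#       json_codec.decode_from_file(f, bl, cls.book_class, prefix)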
@ -392,7 +386,7 @@ class USBMS(CLI, Device):
@classmethod
def book_from_path(cls, prefix, lpath):
from calibre.ebooks.metadata import MetaInformation
from calibre.ebooks.metadata.book.base import Metadata
if cls.settings().read_metadata or cls.MUST_READ_METADATA:
mi = cls.metadata_from_path(cls.normalize_path(os.path.join(prefix, lpath)))
@ -401,7 +395,7 @@ class USBMS(CLI, Device):
mi = metadata_from_filename(cls.normalize_path(os.path.basename(lpath)),
cls.build_template_regexp())
if mi is None:
mi = MetaInformation(os.path.splitext(os.path.basename(lpath))[0],
mi = Metadata(os.path.splitext(os.path.basename(lpath))[0],
[_('Unknown')])
size = os.stat(cls.normalize_path(os.path.join(prefix, lpath))).st_size
book = cls.book_class(prefix, lpath, other=mi, size=size)

@ -15,7 +15,6 @@ from calibre.utils.chm.chmlib import (
chm_enumerate,
)
from calibre.utils.config import OptionParser
from calibre.ebooks.metadata.toc import TOC
from calibre.ebooks.chardet import xml_to_unicode
@ -37,41 +36,6 @@ def check_empty(s, rex = re.compile(r'\S')):
return rex.search(s) is None
def option_parser():
parser = OptionParser(usage=_('%prog [options] mybook.chm'))
parser.add_option('--output-dir', '-d', default='.', help=_('Output directory. Defaults to current directory'), dest='output')
parser.add_option('--verbose', default=False, action='store_true', dest='verbose')
parser.add_option("-t", "--title", action="store", type="string", \
dest="title", help=_("Set the book title"))
parser.add_option('--title-sort', action='store', type='string', default=None,
dest='title_sort', help=_('Set sort key for the title'))
parser.add_option("-a", "--author", action="store", type="string", \
dest="author", help=_("Set the author"))
parser.add_option('--author-sort', action='store', type='string', default=None,
dest='author_sort', help=_('Set sort key for the author'))
parser.add_option("-c", "--category", action="store", type="string", \
dest="category", help=_("The category this book belongs"
" to. E.g.: History"))
parser.add_option("--thumbnail", action="store", type="string", \
dest="thumbnail", help=_("Path to a graphic that will be"
" set as this files' thumbnail"))
parser.add_option("--comment", action="store", type="string", \
dest="freetext", help=_("Path to a txt file containing a comment."))
parser.add_option("--get-thumbnail", action="store_true", \
dest="get_thumbnail", default=False, \
help=_("Extract thumbnail from LRF file"))
parser.add_option('--publisher', default=None, help=_('Set the publisher'))
parser.add_option('--classification', default=None, help=_('Set the book classification'))
parser.add_option('--creator', default=None, help=_('Set the book creator'))
parser.add_option('--producer', default=None, help=_('Set the book producer'))
parser.add_option('--get-cover', action='store_true', default=False,
help=_('Extract cover from LRF file. Note that the LRF format has no defined cover, so we use some heuristics to guess the cover.'))
parser.add_option('--bookid', action='store', type='string', default=None,
dest='book_id', help=_('Set book ID'))
parser.add_option('--font-delta', action='store', type='int', default=0,
dest='font_delta', help=_('Set font delta'))
return parser
class CHMError(Exception):
pass
@ -151,7 +115,8 @@ class CHMReader(CHMFile):
continue
raise
self._extracted = True
files = os.listdir(output_dir)
files = [x for x in os.listdir(output_dir) if
os.path.isfile(os.path.join(output_dir, x))]
if self.hhc_path not in files:
for f in files:
if f.lower() == self.hhc_path.lower():

View File

@ -701,13 +701,13 @@ OptionRecommendation(name='timestamp',
self.opts.read_metadata_from_opf)
opf = OPF(open(self.opts.read_metadata_from_opf, 'rb'),
os.path.dirname(self.opts.read_metadata_from_opf))
mi = MetaInformation(opf)
mi = opf.to_book_metadata()
self.opts_to_mi(mi)
if mi.cover:
if mi.cover.startswith('http:') or mi.cover.startswith('https:'):
mi.cover = self.download_cover(mi.cover)
ext = mi.cover.rpartition('.')[-1].lower().strip()
if ext not in ('png', 'jpg', 'jpeg'):
if ext not in ('png', 'jpg', 'jpeg', 'gif'):
ext = 'jpg'
mi.cover_data = (ext, open(mi.cover, 'rb').read())
mi.cover = None

@ -184,14 +184,14 @@ class Dehyphenator(object):
wraptags = match.group('wraptags')
except:
wraptags = ''
hyphenated = str(firsthalf) + "-" + str(secondhalf)
dehyphenated = str(firsthalf) + str(secondhalf)
hyphenated = unicode(firsthalf) + "-" + unicode(secondhalf)
dehyphenated = unicode(firsthalf) + unicode(secondhalf)
lookupword = self.removesuffixes.sub('', dehyphenated)
if self.prefixes.match(firsthalf) is None:
lookupword = self.removeprefix.sub('', lookupword)
#print "lookup word is: "+str(lookupword)+", orig is: " + str(hyphenated)
try:
searchresult = self.html.find(str.lower(lookupword))
searchresult = self.html.find(lookupword.lower())
except:
return hyphenated
if self.format == 'html_cleanup':

@ -22,18 +22,21 @@ class PreProcessor(object):
title = match.group('title')
if not title:
self.html_preprocess_sections = self.html_preprocess_sections + 1
self.log("found " + str(self.html_preprocess_sections) + " chapters. - " + str(chap))
self.log("found " + unicode(self.html_preprocess_sections) +
" chapters. - " + unicode(chap))
return '<h2>'+chap+'</h2>\n'
else:
self.html_preprocess_sections = self.html_preprocess_sections + 1
self.log("found " + str(self.html_preprocess_sections) + " chapters & titles. - " + str(chap) + ", " + str(title))
self.log("found " + unicode(self.html_preprocess_sections) +
" chapters & titles. - " + unicode(chap) + ", " + unicode(title))
return '<h2>'+chap+'</h2>\n<h3>'+title+'</h3>\n'
def chapter_break(self, match):
chap = match.group('section')
styles = match.group('styles')
self.html_preprocess_sections = self.html_preprocess_sections + 1
self.log("marked " + str(self.html_preprocess_sections) + " section markers based on punctuation. - " + str(chap))
self.log("marked " + unicode(self.html_preprocess_sections) +
" section markers based on punctuation. - " + unicode(chap))
return '<'+styles+' style="page-break-before:always">'+chap
def insert_indent(self, match):
@ -63,7 +66,8 @@ class PreProcessor(object):
line_end = line_end_ere.findall(raw)
tot_htm_ends = len(htm_end)
tot_ln_fds = len(line_end)
self.log("There are " + str(tot_ln_fds) + " total Line feeds, and " + str(tot_htm_ends) + " marked up endings")
self.log("There are " + unicode(tot_ln_fds) + " total Line feeds, and " +
unicode(tot_htm_ends) + " marked up endings")
if percent > 1:
percent = 1
@ -71,7 +75,7 @@ class PreProcessor(object):
percent = 0
min_lns = tot_ln_fds * percent
self.log("There must be fewer than " + str(min_lns) + " unmarked lines to add markup")
self.log("There must be fewer than " + unicode(min_lns) + " unmarked lines to add markup")
if min_lns > tot_htm_ends:
return True
@ -112,7 +116,7 @@ class PreProcessor(object):
txtindent = re.compile(ur'<p(?P<formatting>[^>]*)>\s*(?P<span>(<span[^>]*>\s*)+)?\s*(\u00a0){2,}', re.IGNORECASE)
html = txtindent.sub(self.insert_indent, html)
if self.found_indents > 1:
self.log("replaced "+str(self.found_indents)+ " nbsp indents with inline styles")
self.log("replaced "+unicode(self.found_indents)+ " nbsp indents with inline styles")
# remove remaining non-breaking spaces
html = re.sub(ur'\u00a0', ' ', html)
# Get rid of empty <o:p> tags to simplify other processing
@ -131,7 +135,8 @@ class PreProcessor(object):
lines = linereg.findall(html)
blanks_between_paragraphs = False
if len(lines) > 1:
self.log("There are " + str(len(blanklines)) + " blank lines. " + str(float(len(blanklines)) / float(len(lines))) + " percent blank")
self.log("There are " + unicode(len(blanklines)) + " blank lines. " +
unicode(float(len(blanklines)) / float(len(lines))) + " percent blank")
if float(len(blanklines)) / float(len(lines)) > 0.40 and getattr(self.extra_opts,
'remove_paragraph_spacing', False):
self.log("deleting blank lines")
@ -170,20 +175,20 @@ class PreProcessor(object):
#print chapter_marker
heading = re.compile('<h[1-3][^>]*>', re.IGNORECASE)
self.html_preprocess_sections = len(heading.findall(html))
self.log("found " + str(self.html_preprocess_sections) + " pre-existing headings")
self.log("found " + unicode(self.html_preprocess_sections) + " pre-existing headings")
#
# Start with most typical chapter headings, get more aggressive until one works
if self.html_preprocess_sections < 10:
chapdetect = re.compile(r'%s' % chapter_marker, re.IGNORECASE)
html = chapdetect.sub(self.chapter_head, html)
if self.html_preprocess_sections < 10:
self.log("not enough chapters, only " + str(self.html_preprocess_sections) + ", trying numeric chapters")
self.log("not enough chapters, only " + unicode(self.html_preprocess_sections) + ", trying numeric chapters")
chapter_marker = lookahead+chapter_line_open+chapter_header_open+numeric_chapters+chapter_header_close+chapter_line_close+blank_lines+opt_title_open+title_line_open+title_header_open+default_title+title_header_close+title_line_close+opt_title_close
chapdetect2 = re.compile(r'%s' % chapter_marker, re.IGNORECASE)
html = chapdetect2.sub(self.chapter_head, html)
if self.html_preprocess_sections < 10:
self.log("not enough chapters, only " + str(self.html_preprocess_sections) + ", trying with uppercase words")
self.log("not enough chapters, only " + unicode(self.html_preprocess_sections) + ", trying with uppercase words")
chapter_marker = lookahead+chapter_line_open+chapter_header_open+uppercase_chapters+chapter_header_close+chapter_line_close+blank_lines+opt_title_open+title_line_open+title_header_open+default_title+title_header_close+title_line_close+opt_title_close
chapdetect2 = re.compile(r'%s' % chapter_marker, re.UNICODE)
html = chapdetect2.sub(self.chapter_head, html)
@ -207,11 +212,11 @@ class PreProcessor(object):
# more of the lines break in the same region of the document then unwrapping is required
docanalysis = DocAnalysis(format, html)
hardbreaks = docanalysis.line_histogram(.50)
self.log("Hard line breaks check returned "+str(hardbreaks))
self.log("Hard line breaks check returned "+unicode(hardbreaks))
# Calculate Length
unwrap_factor = getattr(self.extra_opts, 'html_unwrap_factor', 0.4)
length = docanalysis.line_length(unwrap_factor)
self.log("*** Median line length is " + str(length) + ", calculated with " + format + " format ***")
self.log("*** Median line length is " + unicode(length) + ", calculated with " + format + " format ***")
# only go through unwrapping code if the histogram shows unwrapping is required or if the user decreased the default unwrap_factor
if hardbreaks or unwrap_factor < 0.4:
self.log("Unwrapping required, unwrapping Lines")
@ -240,7 +245,8 @@ class PreProcessor(object):
# If still no sections after unwrapping mark split points on lines with no punctuation
if self.html_preprocess_sections < 10:
self.log("Looking for more split points based on punctuation, currently have " + str(self.html_preprocess_sections))
self.log("Looking for more split points based on punctuation,"
" currently have " + unicode(self.html_preprocess_sections))
chapdetect3 = re.compile(r'<(?P<styles>(p|div)[^>]*)>\s*(?P<section>(<span[^>]*>)?\s*(<[ibu][^>]*>){0,2}\s*(<span[^>]*>)?\s*(<[ibu][^>]*>){0,2}\s*(<span[^>]*>)?\s*.?(?=[a-z#\-*\s]+<)([a-z#-*]+\s*){1,5}\s*\s*(</span>)?(</[ibu]>){0,2}\s*(</span>)?\s*(</[ibu]>){0,2}\s*(</span>)?\s*</(p|div)>)', re.IGNORECASE)
html = chapdetect3.sub(self.chapter_break, html)
# search for places where a first or second level heading is immediately followed by another

@ -43,7 +43,11 @@ class Epubcheck(ePubFixer):
default=default)
except:
raise InvalidEpub('Invalid date set in OPF', raw)
sval = ts.strftime('%Y-%m-%d')
try:
sval = ts.strftime('%Y-%m-%d')
except:
from calibre import strftime
sval = strftime('%Y-%m-%d', ts.timetuple())
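# (datetime.strftime on Python 2 raises for years before 1900 on many
# platforms; the calibre.strftime fallback above is assumed to cover those.)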
if sval != raw:
self.log.error(
'OPF contains date', raw, 'that epubcheck does not like')

@ -117,7 +117,8 @@ class EPUBInput(InputFormatPlugin):
encfile = os.path.abspath(os.path.join('META-INF', 'encryption.xml'))
opf = None
for f in walk(u'.'):
if f.lower().endswith('.opf') and '__MACOSX' not in f:
if f.lower().endswith('.opf') and '__MACOSX' not in f and \
not os.path.basename(f).startswith('.'):
opf = os.path.abspath(f)
break
path = getattr(stream, 'name', 'stream')

@ -10,10 +10,9 @@ import os, mimetypes, sys, re
from urllib import unquote, quote
from urlparse import urlparse
from calibre import relpath, prints
from calibre import relpath
from calibre.utils.config import tweaks
from calibre.utils.date import isoformat
_author_pat = re.compile(',?\s+(and|with)\s+', re.IGNORECASE)
def string_to_authors(raw):
@ -45,7 +44,15 @@ def author_to_author_sort(author):
def authors_to_sort_string(authors):
return ' & '.join(map(author_to_author_sort, authors))
_title_pat = re.compile('^(A|The|An)\s+', re.IGNORECASE)
try:
_title_pat = re.compile(tweaks.get('title_sort_articles',
r'^(A|The|An)\s+'), re.IGNORECASE)
except:
print 'Error in title sort pattern'
import traceback
traceback.print_exc()
_title_pat = re.compile('^(A|The|An)\s+', re.IGNORECASE)
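# With an (assumed) tweak value such as the following, leading Spanish
# articles would also be ignored when computing title_sort:
#   title_sort_articles = r'^(A|The|An|El|La)\s+'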
_ignore_starts = u'\'"'+u''.join(unichr(x) for x in range(0x2018, 0x201e)+[0x2032, 0x2033])
def title_sort(title):
@ -221,214 +228,18 @@ class ResourceCollection(object):
class MetaInformation(object):
'''Convenient encapsulation of book metadata'''
@staticmethod
def copy(mi):
ans = MetaInformation(mi.title, mi.authors)
for attr in ('author_sort', 'title_sort', 'comments', 'category',
'publisher', 'series', 'series_index', 'rating',
'isbn', 'tags', 'cover_data', 'application_id', 'guide',
'manifest', 'spine', 'toc', 'cover', 'language',
'book_producer', 'timestamp', 'lccn', 'lcc', 'ddc',
'author_sort_map',
'pubdate', 'rights', 'publication_type', 'uuid'):
if hasattr(mi, attr):
setattr(ans, attr, getattr(mi, attr))
def __init__(self, title, authors=(_('Unknown'),)):
'''
def MetaInformation(title, authors=(_('Unknown'),)):
''' Convenient encapsulation of book metadata, needed for compatibility
@param title: title or ``_('Unknown')`` or a MetaInformation object
@param authors: List of strings or []
'''
mi = None
if hasattr(title, 'title') and hasattr(title, 'authors'):
mi = title
title = mi.title
authors = mi.authors
self.title = title
self.author = list(authors) if authors else []# Needed for backward compatibility
#: List of strings or []
self.authors = list(authors) if authors else []
self.tags = getattr(mi, 'tags', [])
#: mi.cover_data = (ext, data)
self.cover_data = getattr(mi, 'cover_data', (None, None))
self.author_sort_map = getattr(mi, 'author_sort_map', {})
for x in ('author_sort', 'title_sort', 'comments', 'category', 'publisher',
'series', 'series_index', 'rating', 'isbn', 'language',
'application_id', 'manifest', 'toc', 'spine', 'guide', 'cover',
'book_producer', 'timestamp', 'lccn', 'lcc', 'ddc', 'pubdate',
'rights', 'publication_type', 'uuid',
):
setattr(self, x, getattr(mi, x, None))
def print_all_attributes(self):
for x in ('title','author', 'author_sort', 'title_sort', 'comments', 'category', 'publisher',
'series', 'series_index', 'tags', 'rating', 'isbn', 'language',
'application_id', 'manifest', 'toc', 'spine', 'guide', 'cover',
'book_producer', 'timestamp', 'lccn', 'lcc', 'ddc', 'pubdate',
'rights', 'publication_type', 'uuid', 'author_sort_map'
):
prints(x, getattr(self, x, 'None'))
def smart_update(self, mi, replace_metadata=False):
'''
Merge the information in C{mi} into self. In case of conflicts, the
information in C{mi} takes precedence, unless the information in mi is
NULL. If replace_metadata is True, then the information in mi always
takes precedence.
'''
if mi.title and mi.title != _('Unknown'):
self.title = mi.title
if mi.authors and mi.authors[0] != _('Unknown'):
self.authors = mi.authors
for attr in ('author_sort', 'title_sort', 'category',
'publisher', 'series', 'series_index', 'rating',
'isbn', 'application_id', 'manifest', 'spine', 'toc',
'cover', 'guide', 'book_producer',
'timestamp', 'lccn', 'lcc', 'ddc', 'pubdate', 'rights',
'publication_type', 'uuid'):
if replace_metadata:
setattr(self, attr, getattr(mi, attr, 1.0 if \
attr == 'series_index' else None))
elif hasattr(mi, attr):
val = getattr(mi, attr)
if val is not None:
setattr(self, attr, val)
if replace_metadata:
self.tags = mi.tags
elif mi.tags:
self.tags += mi.tags
self.tags = list(set(self.tags))
if mi.author_sort_map:
self.author_sort_map.update(mi.author_sort_map)
if getattr(mi, 'cover_data', False):
other_cover = mi.cover_data[-1]
self_cover = self.cover_data[-1] if self.cover_data else ''
if not self_cover: self_cover = ''
if not other_cover: other_cover = ''
if len(other_cover) > len(self_cover):
self.cover_data = mi.cover_data
if replace_metadata:
self.comments = getattr(mi, 'comments', '')
else:
my_comments = getattr(self, 'comments', '')
other_comments = getattr(mi, 'comments', '')
if not my_comments:
my_comments = ''
if not other_comments:
other_comments = ''
if len(other_comments.strip()) > len(my_comments.strip()):
self.comments = other_comments
other_lang = getattr(mi, 'language', None)
if other_lang and other_lang.lower() != 'und':
self.language = other_lang
def format_series_index(self):
try:
x = float(self.series_index)
except ValueError:
x = 1
return fmt_sidx(x)
def authors_from_string(self, raw):
self.authors = string_to_authors(raw)
def format_authors(self):
return authors_to_string(self.authors)
def format_tags(self):
return u', '.join([unicode(t) for t in self.tags])
def format_rating(self):
return unicode(self.rating)
def __unicode__(self):
ans = []
def fmt(x, y):
ans.append(u'%-20s: %s'%(unicode(x), unicode(y)))
fmt('Title', self.title)
if self.title_sort:
fmt('Title sort', self.title_sort)
if self.authors:
fmt('Author(s)', authors_to_string(self.authors) + \
((' [' + self.author_sort + ']') if self.author_sort else ''))
if self.publisher:
fmt('Publisher', self.publisher)
if getattr(self, 'book_producer', False):
fmt('Book Producer', self.book_producer)
if self.category:
fmt('Category', self.category)
if self.comments:
fmt('Comments', self.comments)
if self.isbn:
fmt('ISBN', self.isbn)
if self.tags:
fmt('Tags', u', '.join([unicode(t) for t in self.tags]))
if self.series:
fmt('Series', self.series + ' #%s'%self.format_series_index())
if self.language:
fmt('Language', self.language)
if self.rating is not None:
fmt('Rating', self.rating)
if self.timestamp is not None:
fmt('Timestamp', isoformat(self.timestamp))
if self.pubdate is not None:
fmt('Published', isoformat(self.pubdate))
if self.rights is not None:
fmt('Rights', unicode(self.rights))
if self.lccn:
fmt('LCCN', unicode(self.lccn))
if self.lcc:
fmt('LCC', unicode(self.lcc))
if self.ddc:
fmt('DDC', unicode(self.ddc))
return u'\n'.join(ans)
def to_html(self):
ans = [(_('Title'), unicode(self.title))]
ans += [(_('Author(s)'), (authors_to_string(self.authors) if self.authors else _('Unknown')))]
ans += [(_('Publisher'), unicode(self.publisher))]
ans += [(_('Producer'), unicode(self.book_producer))]
ans += [(_('Comments'), unicode(self.comments))]
ans += [('ISBN', unicode(self.isbn))]
if self.lccn:
ans += [('LCCN', unicode(self.lccn))]
if self.lcc:
ans += [('LCC', unicode(self.lcc))]
if self.ddc:
ans += [('DDC', unicode(self.ddc))]
ans += [(_('Tags'), u', '.join([unicode(t) for t in self.tags]))]
if self.series:
ans += [(_('Series'), unicode(self.series)+ ' #%s'%self.format_series_index())]
ans += [(_('Language'), unicode(self.language))]
if self.timestamp is not None:
ans += [(_('Timestamp'), unicode(self.timestamp.isoformat(' ')))]
if self.pubdate is not None:
ans += [(_('Published'), unicode(self.pubdate.isoformat(' ')))]
if self.rights is not None:
ans += [(_('Rights'), unicode(self.rights))]
for i, x in enumerate(ans):
ans[i] = u'<tr><td><b>%s</b></td><td>%s</td></tr>'%x
return u'<table>%s</table>'%u'\n'.join(ans)
def __str__(self):
return self.__unicode__().encode('utf-8')
def __nonzero__(self):
return bool(self.title or self.author or self.comments or self.tags)
'''
from calibre.ebooks.metadata.book.base import Metadata
mi = None
if hasattr(title, 'title') and hasattr(title, 'authors'):
mi = title
title = mi.title
authors = mi.authors
return Metadata(title, authors, other=mi)
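# Backwards-compatible usage is unchanged (sketch):
#   mi = MetaInformation(u'A Title', [u'An Author'])  # now returns a Metadata
#   mi2 = MetaInformation(mi)                         # copy-construct from mi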
def check_isbn10(isbn):
try:

@ -11,48 +11,45 @@ an empty list/dictionary for complex types and (None, None) for cover_data
'''
SOCIAL_METADATA_FIELDS = frozenset([
'tags', # Ordered list
# A floating point number between 0 and 10
'rating',
# A simple HTML enabled string
'comments',
# A simple string
'series',
# A floating point number
'series_index',
'tags', # Ordered list
'rating', # A floating point number between 0 and 10
'comments', # A simple HTML enabled string
'series', # A simple string
'series_index', # A floating point number
# Of the form { scheme1:value1, scheme2:value2}
# For example: {'isbn':'123456789', 'doi':'xxxx', ... }
'classifiers',
'isbn', # Pseudo field for convenience, should get/set isbn classifier
])
'''
The list of names that convert to classifiers when in get and set.
'''
TOP_LEVEL_CLASSIFIERS = frozenset([
'isbn',
])
PUBLICATION_METADATA_FIELDS = frozenset([
# title must never be None. Should be _('Unknown')
'title',
'title', # title must never be None. Should be _('Unknown')
# Pseudo field that can be set, but if not set is auto generated
# from title and languages
'title_sort',
# Ordered list of authors. Must never be None, can be [_('Unknown')]
'authors',
# Map of sort strings for each author
'author_sort_map',
'authors', # Ordered list. Must never be None, can be [_('Unknown')]
'author_sort_map', # Map of sort strings for each author
# Pseudo field that can be set, but if not set is auto generated
# from authors and languages
'author_sort',
'book_producer',
# Dates and times must be timezone aware
'timestamp',
'timestamp', # Dates and times must be timezone aware
'pubdate',
'rights',
# So far only known publication type is periodical:calibre
# If None, means book
'publication_type',
# A UUID usually of type 4
'uuid',
'languages', # ordered list
# Simple string, no special semantics
'publisher',
'uuid', # A UUID usually of type 4
'language', # the primary language of this book
'languages', # ordered list
'publisher', # Simple string, no special semantics
# Absolute path to image file encoded in filesystem_encoding
'cover',
# Of the form (format, data) where format is, for e.g. 'jpeg', 'png', 'gif'...
@ -69,33 +66,63 @@ BOOK_STRUCTURE_FIELDS = frozenset([
])
USER_METADATA_FIELDS = frozenset([
# A dict of a form to be specified
# A dict of dicts similar to field_metadata. Each field description dict
# also contains a value field with the key #value#.
'user_metadata',
])
DEVICE_METADATA_FIELDS = frozenset([
# Ordered list of strings
'device_collections',
'lpath', # Unicode, / separated
# In bytes
'size',
# Mimetype of the book file being represented
'mime',
'device_collections', # Ordered list of strings
'lpath', # Unicode, / separated
'size', # In bytes
'mime', # Mimetype of the book file being represented
])
CALIBRE_METADATA_FIELDS = frozenset([
# An application id
# Semantics to be defined. Is it a db key? a db name + key? A uuid?
'application_id',
'application_id', # An application id, currently set to the db_id.
'db_id', # the calibre primary key of the item.
'formats', # list of formats (extensions) for this book
]
)
ALL_METADATA_FIELDS = SOCIAL_METADATA_FIELDS.union(
PUBLICATION_METADATA_FIELDS).union(
BOOK_STRUCTURE_FIELDS).union(
USER_METADATA_FIELDS).union(
DEVICE_METADATA_FIELDS).union(
CALIBRE_METADATA_FIELDS)
SERIALIZABLE_FIELDS = SOCIAL_METADATA_FIELDS.union(
USER_METADATA_FIELDS).union(
PUBLICATION_METADATA_FIELDS).union(
CALIBRE_METADATA_FIELDS).union(
frozenset(['lpath'])) # I don't think we need device_collections
# All fields except custom fields
STANDARD_METADATA_FIELDS = SOCIAL_METADATA_FIELDS.union(
PUBLICATION_METADATA_FIELDS).union(
BOOK_STRUCTURE_FIELDS).union(
DEVICE_METADATA_FIELDS).union(
CALIBRE_METADATA_FIELDS)
# Serialization of covers/thumbnails will have to be handled carefully, maybe
# as an option to the serializer class
# Metadata fields that smart update must do special processing to copy.
SC_FIELDS_NOT_COPIED = frozenset(['title', 'title_sort', 'authors',
'author_sort', 'author_sort_map',
'cover_data', 'tags', 'language',
'classifiers'])
# Metadata fields that smart update should copy only if the source is not None
SC_FIELDS_COPY_NOT_NULL = frozenset(['lpath', 'size', 'comments', 'thumbnail'])
# Metadata fields that smart update should copy without special handling
SC_COPYABLE_FIELDS = SOCIAL_METADATA_FIELDS.union(
PUBLICATION_METADATA_FIELDS).union(
BOOK_STRUCTURE_FIELDS).union(
DEVICE_METADATA_FIELDS).union(
CALIBRE_METADATA_FIELDS) - \
SC_FIELDS_NOT_COPIED.union(
SC_FIELDS_COPY_NOT_NULL)
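# Consequence of the sets above (sketch): smart_update copies e.g. 'series'
# and 'rating' verbatim (SC_COPYABLE_FIELDS), copies 'comments' only when the
# source value is not None (SC_FIELDS_COPY_NOT_NULL), and gives 'title',
# 'authors' and 'tags' special-case handling (SC_FIELDS_NOT_COPIED).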
SERIALIZABLE_FIELDS = SOCIAL_METADATA_FIELDS.union(
USER_METADATA_FIELDS).union(
PUBLICATION_METADATA_FIELDS).union(
CALIBRE_METADATA_FIELDS).union(
DEVICE_METADATA_FIELDS) - \
frozenset(['device_collections', 'formats',
'cover_data'])
# these are rebuilt when needed

@ -5,9 +5,19 @@ __license__ = 'GPL v3'
__copyright__ = '2010, Kovid Goyal <kovid@kovidgoyal.net>'
__docformat__ = 'restructuredtext en'
import copy
import copy, traceback
from calibre import prints
from calibre.constants import DEBUG
from calibre.ebooks.metadata.book import SC_COPYABLE_FIELDS
from calibre.ebooks.metadata.book import SC_FIELDS_COPY_NOT_NULL
from calibre.ebooks.metadata.book import STANDARD_METADATA_FIELDS
from calibre.ebooks.metadata.book import TOP_LEVEL_CLASSIFIERS
from calibre.ebooks.metadata.book import ALL_METADATA_FIELDS
from calibre.library.field_metadata import FieldMetadata
from calibre.utils.date import isoformat, format_date
from calibre.utils.formatter import TemplateFormatter
from calibre.ebooks.metadata.book import RESERVED_METADATA_FIELDS
NULL_VALUES = {
'user_metadata': {},
@ -19,103 +29,609 @@ NULL_VALUES = {
'author_sort_map': {},
'authors' : [_('Unknown')],
'title' : _('Unknown'),
'language' : 'und'
}
field_metadata = FieldMetadata()
class SafeFormat(TemplateFormatter):
def get_value(self, key, args, kwargs):
try:
if key != 'title_sort':
key = field_metadata.search_term_to_field_key(key.lower())
b = self.book.get_user_metadata(key, False)
if b and b['datatype'] == 'int' and self.book.get(key, 0) == 0:
v = ''
elif b and b['datatype'] == 'float' and self.book.get(key, 0.0) == 0.0:
v = ''
else:
ign, v = self.book.format_field(key.lower(), series_with_index=False)
if v is None:
return ''
if v == '':
return ''
return v
except:
if DEBUG:
traceback.print_exc()
return key
composite_formatter = SafeFormat()
class Metadata(object):
'''
This class must expose a superset of the API of MetaInformation in terms
of attribute access and methods. Only the __init__ method is different.
MetaInformation will simply become a function that creates and fills in
the attributes of this class.
A class representing all the metadata for a book.
Please keep the method based API of this class to a minimum. Every method
becomes a reserved field name.
'''
def __init__(self):
object.__setattr__(self, '_data', copy.deepcopy(NULL_VALUES))
def __init__(self, title, authors=(_('Unknown'),), other=None):
'''
@param title: title or ``_('Unknown')``
@param authors: List of strings or []
@param other: None or a metadata object
'''
_data = copy.deepcopy(NULL_VALUES)
object.__setattr__(self, '_data', _data)
if other is not None:
self.smart_update(other)
else:
if title:
self.title = title
if authors:
#: List of strings or []
self.author = list(authors) if authors else []# Needed for backward compatibility
self.authors = list(authors) if authors else []
def is_null(self, field):
null_val = NULL_VALUES.get(field, None)
val = getattr(self, field, None)
return not val or val == null_val
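# For example (sketch), a freshly constructed book is null almost everywhere:
#   mi = Metadata(_('Unknown'))
#   mi.is_null('title')   ->  True  (value equals NULL_VALUES['title'])
#   mi.is_null('series')  ->  True  (no value set)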
def __getattribute__(self, field):
_data = object.__getattribute__(self, '_data')
if field in RESERVED_METADATA_FIELDS:
if field in TOP_LEVEL_CLASSIFIERS:
return _data.get('classifiers').get(field, None)
if field in STANDARD_METADATA_FIELDS:
return _data.get(field, None)
try:
return object.__getattribute__(self, field)
except AttributeError:
pass
if field in _data['user_metadata'].iterkeys():
# TODO: getting user metadata values
pass
d = _data['user_metadata'][field]
val = d['#value#']
if d['datatype'] != 'composite':
return val
if val is None:
d['#value#'] = 'RECURSIVE_COMPOSITE FIELD (Metadata) ' + field
val = d['#value#'] = composite_formatter.safe_format(
d['display']['composite_template'],
self,
_('TEMPLATE ERROR'),
self).strip()
return val
raise AttributeError(
'Metadata object has no attribute named: '+ repr(field))
def __setattr__(self, field, val):
def __setattr__(self, field, val, extra=None):
_data = object.__getattribute__(self, '_data')
if field in RESERVED_METADATA_FIELDS:
if field != 'user_metadata':
if not val:
val = NULL_VALUES[field]
_data[field] = val
else:
raise AttributeError('You cannot set user_metadata directly.')
if field in TOP_LEVEL_CLASSIFIERS:
_data['classifiers'].update({field: val})
elif field in STANDARD_METADATA_FIELDS:
if val is None:
val = NULL_VALUES.get(field, None)
_data[field] = val
elif field in _data['user_metadata'].iterkeys():
# TODO: Setting custom column values
pass
if _data['user_metadata'][field]['datatype'] == 'composite':
_data['user_metadata'][field]['#value#'] = None
else:
_data['user_metadata'][field]['#value#'] = val
_data['user_metadata'][field]['#extra#'] = extra
else:
# You are allowed to stick arbitrary attributes onto this object as
# long as they dont conflict with global or user metadata names
# long as they don't conflict with global or user metadata names
# Don't abuse this privilege
self.__dict__[field] = val
@property
def user_metadata_names(self):
'The set of user metadata names this object knows about'
def __iter__(self):
return object.__getattribute__(self, '_data').iterkeys()
def has_key(self, key):
return key in object.__getattribute__(self, '_data')
def deepcopy(self):
m = Metadata(None)
m.__dict__ = copy.deepcopy(self.__dict__)
object.__setattr__(m, '_data', copy.deepcopy(object.__getattribute__(self, '_data')))
return m
def deepcopy_metadata(self):
m = Metadata(None)
object.__setattr__(m, '_data', copy.deepcopy(object.__getattribute__(self, '_data')))
return m
def get(self, field, default=None):
try:
return self.__getattribute__(field)
except AttributeError:
return default
def get_extra(self, field):
_data = object.__getattribute__(self, '_data')
if field in _data['user_metadata'].iterkeys():
return _data['user_metadata'][field]['#extra#']
raise AttributeError(
'Metadata object has no attribute named: '+ repr(field))
# Old MetaInformation API {{{
def copy(self):
pass
def set(self, field, val, extra=None):
self.__setattr__(field, val, extra)
def get_classifiers(self):
'''
Return a copy of the classifiers dictionary.
The dict is small, and the penalty for using a reference where a copy is
needed is large. Also, we don't want any manipulations of the returned
dict to show up in the book.
'''
return copy.deepcopy(object.__getattribute__(self, '_data')['classifiers'])
def set_classifiers(self, classifiers):
object.__getattribute__(self, '_data')['classifiers'] = classifiers
# field-oriented interface. Intended to be the same as in LibraryDatabase
def standard_field_keys(self):
'''
return a list of all possible keys, even if this book doesn't have them
'''
return STANDARD_METADATA_FIELDS
def custom_field_keys(self):
'''
return a list of the custom fields in this book
'''
return object.__getattribute__(self, '_data')['user_metadata'].iterkeys()
def all_field_keys(self):
'''
All field keys known by this instance, even if their value is None
'''
_data = object.__getattribute__(self, '_data')
return frozenset(ALL_METADATA_FIELDS.union(_data['user_metadata'].iterkeys()))
def metadata_for_field(self, key):
'''
return metadata describing a standard or custom field.
'''
if key not in self.custom_field_keys():
return self.get_standard_metadata(key, make_copy=False)
return self.get_user_metadata(key, make_copy=False)
def all_non_none_fields(self):
'''
Return a dictionary containing all non-None metadata fields, including
the custom ones.
'''
result = {}
_data = object.__getattribute__(self, '_data')
for attr in STANDARD_METADATA_FIELDS:
v = _data.get(attr, None)
if v is not None:
result[attr] = v
# these are handled separately because they use self.get(), not _data.get()
for attr in TOP_LEVEL_CLASSIFIERS:
v = self.get(attr, None)
if v is not None:
result[attr] = v
for attr in _data['user_metadata'].iterkeys():
v = self.get(attr, None)
if v is not None:
result[attr] = v
if _data['user_metadata'][attr]['datatype'] == 'series':
result[attr+'_index'] = _data['user_metadata'][attr]['#extra#']
return result
# End of field-oriented interface
# Extended interfaces. These permit one to get copies of metadata dictionaries, and to
# get and set custom field metadata
def get_standard_metadata(self, field, make_copy):
'''
return field metadata from the field if it is there. Otherwise return
None. field is the key name, not the label. Return a copy if requested,
just in case the user wants to change values in the dict.
'''
if field in field_metadata and field_metadata[field]['kind'] == 'field':
if make_copy:
return copy.deepcopy(field_metadata[field])
return field_metadata[field]
return None
def get_all_standard_metadata(self, make_copy):
'''
return a dict containing all the standard field metadata associated with
the book.
'''
if not make_copy:
return field_metadata
res = {}
for k in field_metadata:
if field_metadata[k]['kind'] == 'field':
res[k] = copy.deepcopy(field_metadata[k])
return res
def get_all_user_metadata(self, make_copy):
'''
return a dict containing all the custom field metadata associated with
the book.
'''
_data = object.__getattribute__(self, '_data')
user_metadata = _data['user_metadata']
if not make_copy:
return user_metadata
res = {}
for k in user_metadata:
res[k] = copy.deepcopy(user_metadata[k])
return res
def get_user_metadata(self, field, make_copy):
'''
return field metadata from the object if it is there. Otherwise return
None. field is the key name, not the label. Return a copy if requested,
just in case the user wants to change values in the dict.
'''
_data = object.__getattribute__(self, '_data')
_data = _data['user_metadata']
if field in _data:
if make_copy:
return copy.deepcopy(_data[field])
return _data[field]
return None
def set_all_user_metadata(self, metadata):
'''
store custom field metadata into the object. Field is the key name,
not the label
'''
if metadata is None:
traceback.print_stack()
else:
for key in metadata:
self.set_user_metadata(key, metadata[key])
def set_user_metadata(self, field, metadata):
'''
store custom field metadata for one column into the object. Field is
the key name, not the label
'''
if field is not None:
if not field.startswith('#'):
raise AttributeError(
'Custom field name %s must begin with \'#\''%repr(field))
if metadata is None:
traceback.print_stack()
return
metadata = copy.deepcopy(metadata)
if '#value#' not in metadata:
if metadata['datatype'] == 'text' and metadata['is_multiple']:
metadata['#value#'] = []
else:
metadata['#value#'] = None
_data = object.__getattribute__(self, '_data')
_data['user_metadata'][field] = metadata
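# A sketch of the dict this method expects, for a hypothetical '#genre'
# column; '#value#' is filled in automatically when omitted:
#
#     genre_meta = {
#         'name': u'Genre',     # display name
#         'datatype': 'text',
#         'is_multiple': u'|',  # separator for multi-valued columns, else None
#     }
#     mi.set_user_metadata('#genre', genre_meta)
#     # multi-valued text columns default '#value#' to []
#     print mi.get_user_metadata('#genre', make_copy=False)['#value#']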
def template_to_attribute(self, other, ops):
'''
Takes a list [(src, dest), (src, dest)], evaluates each src template in
the context of other, then copies the result to self[dest]. This is done
on a best-effort basis; some assignments may make no sense.
'''
if not ops:
return
for op in ops:
try:
src = op[0]
dest = op[1]
val = composite_formatter.safe_format\
(src, other, 'PLUGBOARD TEMPLATE ERROR', other)
if dest == 'tags':
self.set(dest, [f.strip() for f in val.split(',') if f.strip()])
elif dest == 'authors':
self.set(dest, [f.strip() for f in val.split('&') if f.strip()])
else:
self.set(dest, val)
except:
if DEBUG:
traceback.print_exc()
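# A sketch of the ops list this method consumes; the templates below are
# hypothetical plugboard entries:
#
#     ops = [
#         ('{title} ({authors})', 'title'),  # template evaluated against other
#         ('{tags}', 'tags'),                # dest 'tags' is re-split on commas
#     ]
#     mi.template_to_attribute(other_mi, ops)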
# Old Metadata API {{{
def print_all_attributes(self):
for x in STANDARD_METADATA_FIELDS:
prints('%s:'%x, getattr(self, x, 'None'))
for x in self.custom_field_keys():
meta = self.get_user_metadata(x, make_copy=False)
if meta is not None:
prints(x, meta)
prints('--------------')
def smart_update(self, other, replace_metadata=False):
'''
Merge the information in `other` into self. In case of conflicts, the information
in `other` takes precedence, unless the information in `other` is NULL.
'''
def copy_not_none(dest, src, attr):
v = getattr(src, attr, None)
if v not in (None, NULL_VALUES.get(attr, None)):
setattr(dest, attr, copy.deepcopy(v))
if other.title and other.title != _('Unknown'):
self.title = other.title
if hasattr(other, 'title_sort'):
self.title_sort = other.title_sort
if other.authors and other.authors[0] != _('Unknown'):
self.authors = list(other.authors)
if hasattr(other, 'author_sort_map'):
self.author_sort_map = dict(other.author_sort_map)
if hasattr(other, 'author_sort'):
self.author_sort = other.author_sort
if replace_metadata:
# SPECIAL_FIELDS = frozenset(['lpath', 'size', 'comments', 'thumbnail'])
for attr in SC_COPYABLE_FIELDS:
setattr(self, attr, getattr(other, attr, 1.0 if \
attr == 'series_index' else None))
self.tags = other.tags
self.cover_data = getattr(other, 'cover_data',
NULL_VALUES['cover_data'])
self.set_all_user_metadata(other.get_all_user_metadata(make_copy=True))
for x in SC_FIELDS_COPY_NOT_NULL:
copy_not_none(self, other, x)
if callable(getattr(other, 'get_classifiers', None)):
self.set_classifiers(other.get_classifiers())
# language is handled below
else:
for attr in SC_COPYABLE_FIELDS:
copy_not_none(self, other, attr)
for x in SC_FIELDS_COPY_NOT_NULL:
copy_not_none(self, other, x)
if other.tags:
# Case-insensitive but case-preserving merging
lotags = [t.lower() for t in other.tags]
lstags = [t.lower() for t in self.tags]
ot, st = map(frozenset, (lotags, lstags))
for t in st.intersection(ot):
sidx = lstags.index(t)
oidx = lotags.index(t)
self.tags[sidx] = other.tags[oidx]
self.tags += [t for t in other.tags if t.lower() in ot-st]
if getattr(other, 'cover_data', False):
other_cover = other.cover_data[-1]
self_cover = self.cover_data[-1] if self.cover_data else ''
if not self_cover: self_cover = ''
if not other_cover: other_cover = ''
if len(other_cover) > len(self_cover):
self.cover_data = other.cover_data
if callable(getattr(other, 'custom_field_keys', None)):
for x in other.custom_field_keys():
meta = other.get_user_metadata(x, make_copy=True)
if meta is not None:
self_tags = self.get(x, [])
self.set_user_metadata(x, meta) # get_user_metadata above did the deepcopy
other_tags = other.get(x, [])
if meta['is_multiple']:
# Case-insensitive but case-preserving merging
lotags = [t.lower() for t in other_tags]
lstags = [t.lower() for t in self_tags]
ot, st = map(frozenset, (lotags, lstags))
for t in st.intersection(ot):
sidx = lstags.index(t)
oidx = lotags.index(t)
self_tags[sidx] = other_tags[oidx]
self_tags += [t for t in other_tags if t.lower() in ot-st]
setattr(self, x, self_tags)
my_comments = getattr(self, 'comments', '')
other_comments = getattr(other, 'comments', '')
if not my_comments:
my_comments = ''
if not other_comments:
other_comments = ''
if len(other_comments.strip()) > len(my_comments.strip()):
self.comments = other_comments
# Copy all the non-none classifiers
if callable(getattr(other, 'get_classifiers', None)):
d = self.get_classifiers()
s = other.get_classifiers()
d.update([v for v in s.iteritems() if v[1] is not None])
self.set_classifiers(d)
else:
# other structure not Metadata. Copy the top-level classifiers
for attr in TOP_LEVEL_CLASSIFIERS:
copy_not_none(self, other, attr)
other_lang = getattr(other, 'language', None)
if other_lang and other_lang.lower() != 'und':
self.language = other_lang
def format_series_index(self, val=None):
from calibre.ebooks.metadata import fmt_sidx
v = self.series_index if val is None else val
try:
x = float(v)
except (ValueError, TypeError):
x = 1
return fmt_sidx(x)
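# Assuming the usual fmt_sidx behavior, whole-number indices render without
# a decimal part:
#
#     mi.series_index = 2.0
#     print mi.format_series_index()      # -> '2'
#     print mi.format_series_index(2.45)  # -> '2.45'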
def authors_from_string(self, raw):
from calibre.ebooks.metadata import string_to_authors
self.authors = string_to_authors(raw)
def format_authors(self):
from calibre.ebooks.metadata import authors_to_string
return authors_to_string(self.authors)
def format_tags(self):
return u', '.join([unicode(t) for t in self.tags])
def format_rating(self):
return unicode(self.rating)
def format_field(self, key, series_with_index=True):
name, val, ign, ign = self.format_field_extended(key, series_with_index)
return (name, val)
def format_field_extended(self, key, series_with_index=True):
'''
returns the tuple (field_name, formatted_value, original_value, field_metadata)
'''
from calibre.ebooks.metadata import authors_to_string
# Handle custom series index
if key.startswith('#') and key.endswith('_index'):
tkey = key[:-6] # strip the _index
cmeta = self.get_user_metadata(tkey, make_copy=False)
if cmeta['datatype'] == 'series':
if self.get(tkey):
res = self.get_extra(tkey)
return (unicode(cmeta['name']+'_index'),
self.format_series_index(res), res, cmeta)
else:
return (unicode(cmeta['name']+'_index'), '', '', cmeta)
if key in self.custom_field_keys():
res = self.get(key, None)
cmeta = self.get_user_metadata(key, make_copy=False)
name = unicode(cmeta['name'])
if cmeta['datatype'] != 'composite' and (res is None or res == ''):
return (name, res, None, None)
orig_res = res
datatype = cmeta['datatype']
if datatype == 'text' and cmeta['is_multiple']:
res = u', '.join(res)
elif datatype == 'series' and series_with_index:
if self.get_extra(key) is not None:
res = res + \
' [%s]'%self.format_series_index(val=self.get_extra(key))
elif datatype == 'datetime':
res = format_date(res, cmeta['display'].get('date_format','dd MMM yyyy'))
elif datatype == 'bool':
res = _('Yes') if res else _('No')
return (name, unicode(res), orig_res, cmeta)
# Translate aliases into the standard field name
fmkey = field_metadata.search_term_to_field_key(key)
if fmkey in field_metadata and field_metadata[fmkey]['kind'] == 'field':
res = self.get(key, None)
fmeta = field_metadata[fmkey]
name = unicode(fmeta['name'])
if res is None or res == '':
return (name, res, None, None)
orig_res = res
datatype = fmeta['datatype']
if key == 'authors':
res = authors_to_string(res)
elif key == 'series_index':
res = self.format_series_index(res)
elif datatype == 'text' and fmeta['is_multiple']:
res = u', '.join(res)
elif datatype == 'series' and series_with_index:
res = res + ' [%s]'%self.format_series_index()
elif datatype == 'datetime':
res = format_date(res, fmeta['display'].get('date_format','dd MMM yyyy'))
return (name, unicode(res), orig_res, fmeta)
return (None, None, None, None)
def __unicode__(self):
from calibre.ebooks.metadata import authors_to_string
ans = []
def fmt(x, y):
ans.append(u'%-20s: %s'%(unicode(x), unicode(y)))
fmt('Title', self.title)
if self.title_sort:
fmt('Title sort', self.title_sort)
if self.authors:
fmt('Author(s)', authors_to_string(self.authors) + \
((' [' + self.author_sort + ']') if self.author_sort else ''))
if self.publisher:
fmt('Publisher', self.publisher)
if getattr(self, 'book_producer', False):
fmt('Book Producer', self.book_producer)
if self.comments:
fmt('Comments', self.comments)
if self.isbn:
fmt('ISBN', self.isbn)
if self.tags:
fmt('Tags', u', '.join([unicode(t) for t in self.tags]))
if self.series:
fmt('Series', self.series + ' #%s'%self.format_series_index())
if self.language:
fmt('Language', self.language)
if self.rating is not None:
fmt('Rating', self.rating)
if self.timestamp is not None:
fmt('Timestamp', isoformat(self.timestamp))
if self.pubdate is not None:
fmt('Published', isoformat(self.pubdate))
if self.rights is not None:
fmt('Rights', unicode(self.rights))
for key in self.custom_field_keys():
val = self.get(key, None)
if val:
(name, val) = self.format_field(key)
fmt(name, unicode(val))
return u'\n'.join(ans)
def to_html(self):
from calibre.ebooks.metadata import authors_to_string
ans = [(_('Title'), unicode(self.title))]
ans += [(_('Author(s)'), (authors_to_string(self.authors) if self.authors else _('Unknown')))]
ans += [(_('Publisher'), unicode(self.publisher))]
ans += [(_('Producer'), unicode(self.book_producer))]
ans += [(_('Comments'), unicode(self.comments))]
ans += [('ISBN', unicode(self.isbn))]
ans += [(_('Tags'), u', '.join([unicode(t) for t in self.tags]))]
if self.series:
ans += [(_('Series'), unicode(self.series)+ ' #%s'%self.format_series_index())]
ans += [(_('Language'), unicode(self.language))]
if self.timestamp is not None:
ans += [(_('Timestamp'), unicode(self.timestamp.isoformat(' ')))]
if self.pubdate is not None:
ans += [(_('Published'), unicode(self.pubdate.isoformat(' ')))]
if self.rights is not None:
ans += [(_('Rights'), unicode(self.rights))]
for key in self.custom_field_keys():
val = self.get(key, None)
if val:
(name, val) = self.format_field(key)
ans += [(name, val)]
for i, x in enumerate(ans):
ans[i] = u'<tr><td><b>%s</b></td><td>%s</td></tr>'%x
return u'<table>%s</table>'%u'\n'.join(ans)
def __str__(self):
return self.__unicode__().encode('utf-8')
def __nonzero__(self):
return True
# }}}
# We don't need reserved field names for this object any more. Let's just
# use a protocol: when using this object, a user field label always ends
# with an underscore. So mi.tags returns the builtin tags and mi.tags_
# returns the user tags.
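# A minimal sketch of the merge semantics of smart_update; both books are
# hypothetical:
#
#     a = Metadata('Dune', ['Frank Herbert'])
#     b = Metadata('Dune', [_('Unknown')])
#     b.comments = 'A much longer description of the book...'
#     a.smart_update(b)
#     # a keeps its authors (b's author is the NULL value), while the longer
#     # comments win, so a.comments is now b's description
#     print a.authors, a.comments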

View File

@ -0,0 +1,143 @@
'''
Created on 4 Jun 2010
@author: charles
'''
from base64 import b64encode, b64decode
import json
import traceback
from calibre.ebooks.metadata.book import SERIALIZABLE_FIELDS
from calibre.constants import filesystem_encoding, preferred_encoding
from calibre.library.field_metadata import FieldMetadata
from calibre.utils.date import parse_date, isoformat, UNDEFINED_DATE
from calibre.utils.magick import Image
from calibre import isbytestring
# Translate datetimes to and from strings. The string form is the datetime in
# UTC. The returned date is also UTC
def string_to_datetime(src):
if src == "None":
return None
return parse_date(src)
def datetime_to_string(dateval):
if dateval is None or dateval == UNDEFINED_DATE:
return "None"
return isoformat(dateval)
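# Round-tripping a date through these helpers is straightforward; the value
# below is arbitrary:
#
#     d = parse_date('2010-06-04T12:00:00+00:00')
#     s = datetime_to_string(d)         # ISO string in UTC, or "None"
#     print string_to_datetime(s) == d  # True: same instant, both UTC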
def encode_thumbnail(thumbnail):
'''
Encode the image part of a thumbnail, then return the 3-part tuple
'''
if thumbnail is None:
return None
if not isinstance(thumbnail, (tuple, list)):
try:
img = Image()
img.load(thumbnail)
width, height = img.size
thumbnail = (width, height, thumbnail)
except:
return None
return (thumbnail[0], thumbnail[1], b64encode(str(thumbnail[2])))
def decode_thumbnail(tup):
'''
Decode an encoded thumbnail into its 3 component parts
'''
if tup is None:
return None
return (tup[0], tup[1], b64decode(tup[2]))
def object_to_unicode(obj, enc=preferred_encoding):
def dec(x):
return x.decode(enc, 'replace')
if isbytestring(obj):
return dec(obj)
if isinstance(obj, (list, tuple)):
return [dec(x) if isbytestring(x) else x for x in obj]
if isinstance(obj, dict):
ans = {}
for k, v in obj.items():
k = object_to_unicode(k)
v = object_to_unicode(v)
ans[k] = v
return ans
return obj
class JsonCodec(object):
def __init__(self):
self.field_metadata = FieldMetadata()
def encode_to_file(self, file, booklist):
file.write(json.dumps(self.encode_booklist_metadata(booklist),
indent=2, encoding='utf-8'))
def encode_booklist_metadata(self, booklist):
result = []
for book in booklist:
result.append(self.encode_book_metadata(book))
return result
def encode_book_metadata(self, book):
result = {}
for key in SERIALIZABLE_FIELDS:
result[key] = self.encode_metadata_attr(book, key)
return result
def encode_metadata_attr(self, book, key):
if key == 'user_metadata':
meta = book.get_all_user_metadata(make_copy=True)
for k in meta:
if meta[k]['datatype'] == 'datetime':
meta[k]['#value#'] = datetime_to_string(meta[k]['#value#'])
return meta
if key in self.field_metadata:
datatype = self.field_metadata[key]['datatype']
else:
datatype = None
value = book.get(key)
if key == 'thumbnail':
return encode_thumbnail(value)
elif isbytestring(value): # str includes bytes
enc = filesystem_encoding if key == 'lpath' else preferred_encoding
return object_to_unicode(value, enc=enc)
elif datatype == 'datetime':
return datetime_to_string(value)
else:
return object_to_unicode(value)
def decode_from_file(self, file, booklist, book_class, prefix):
js = []
try:
js = json.load(file, encoding='utf-8')
for item in js:
book = book_class(prefix, item.get('lpath', None))
for key in item.keys():
meta = self.decode_metadata(key, item[key])
if key == 'user_metadata':
book.set_all_user_metadata(meta)
else:
setattr(book, key, meta)
booklist.append(book)
except:
print 'exception during JSON decoding'
traceback.print_exc()
def decode_metadata(self, key, value):
if key == 'user_metadata':
for k in value:
if value[k]['datatype'] == 'datetime':
value[k]['#value#'] = string_to_datetime(value[k]['#value#'])
return value
elif key in self.field_metadata:
if self.field_metadata[key]['datatype'] == 'datetime':
return string_to_datetime(value)
if key == 'thumbnail':
return decode_thumbnail(value)
return value
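# A sketch of a full round trip through the codec; the Book class and the
# paths are hypothetical stand-ins for a device driver's book list:
#
#     from calibre.ebooks.metadata.book.base import Metadata
#
#     class Book(Metadata):
#         def __init__(self, prefix, lpath):
#             Metadata.__init__(self, '')
#             self.lpath = lpath
#
#     codec = JsonCodec()
#     book = Book('', 'books/test.epub')
#     book.title = 'Test'
#     with open('metadata.calibre', 'w+b') as f:
#         codec.encode_to_file(f, [book])
#         f.seek(0)
#         decoded = []
#         codec.decode_from_file(f, decoded, Book, '')
#     print decoded[0].title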

View File

@ -109,7 +109,7 @@ def do_set_metadata(opts, mi, stream, stream_type):
from_opf = getattr(opts, 'from_opf', None)
if from_opf is not None:
from calibre.ebooks.metadata.opf2 import OPF
opf_mi = MetaInformation(OPF(open(from_opf, 'rb')))
opf_mi = OPF(open(from_opf, 'rb')).to_book_metadata()
mi.smart_update(opf_mi)
for pref in config().option_set.preferences:

View File

@ -164,10 +164,10 @@ def get_cover(opf, opf_path, stream, reader=None):
return render_html_svg_workaround(cpage, default_log)
def get_metadata(stream, extract_cover=True):
""" Return metadata as a :class:`MetaInformation` object """
""" Return metadata as a :class:`Metadata` object """
stream.seek(0)
reader = OCFZipReader(stream)
mi = MetaInformation(reader.opf)
mi = reader.opf.to_book_metadata()
if extract_cover:
try:
cdata = get_cover(reader.opf, reader.opf_path, stream, reader=reader)

View File

@ -33,7 +33,10 @@ def get_metadata(stream):
le = XPath('descendant::fb2:last-name')(au)
if le:
lname = tostring(le[0])
author += ' '+lname
if author:
author += ' '+lname
else:
author = lname
if author:
authors.append(author)
if len(authors) == 1 and author is not None:

View File

@ -29,7 +29,7 @@ class MetadataSource(Plugin): # {{{
future use.
The fetch method must store the results in `self.results` as a list of
:class:`MetaInformation` objects. If there is an error, it should be stored
:class:`Metadata` objects. If there is an error, it should be stored
in `self.exception` and `self.tb` (for the traceback).
'''

View File

@ -8,7 +8,7 @@ import sys, re
from urllib import quote
from calibre.utils.config import OptionParser
from calibre.ebooks.metadata import MetaInformation
from calibre.ebooks.metadata.book.base import Metadata
from calibre.ebooks.BeautifulSoup import BeautifulStoneSoup
from calibre import browser
@ -42,34 +42,48 @@ def fetch_metadata(url, max=100, timeout=5.):
return books
class ISBNDBMetadata(MetaInformation):
class ISBNDBMetadata(Metadata):
def __init__(self, book):
MetaInformation.__init__(self, None, [])
Metadata.__init__(self, None, [])
self.isbn = book.get('isbn13', book.get('isbn'))
self.title = book.find('titlelong').string
def tostring(e):
if not hasattr(e, 'string'):
return None
ans = e.string
if ans is not None:
ans = unicode(ans).strip()
if not ans:
ans = None
return ans
self.isbn = unicode(book.get('isbn13', book.get('isbn')))
self.title = tostring(book.find('titlelong'))
if not self.title:
self.title = book.find('title').string
self.title = tostring(book.find('title'))
if not self.title:
self.title = _('Unknown')
self.title = unicode(self.title).strip()
au = unicode(book.find('authorstext').string).strip()
temp = au.split(',')
self.authors = []
for au in temp:
if not au: continue
self.authors.extend([a.strip() for a in au.split('&amp;')])
au = tostring(book.find('authorstext'))
if au:
au = au.strip()
temp = au.split(',')
for au in temp:
if not au: continue
self.authors.extend([a.strip() for a in au.split('&amp;')])
try:
self.author_sort = book.find('authors').find('person').string
self.author_sort = tostring(book.find('authors').find('person'))
if self.authors and self.author_sort == self.authors[0]:
self.author_sort = None
except:
pass
self.publisher = book.find('publishertext').string
self.publisher = tostring(book.find('publishertext'))
summ = book.find('summary')
if summ and hasattr(summ, 'string') and summ.string:
self.comments = 'SUMMARY:\n'+summ.string
summ = tostring(book.find('summary'))
if summ:
self.comments = 'SUMMARY:\n'+summ
def build_isbn(base_url, opts):

View File

@ -12,6 +12,7 @@ import mechanize
from calibre import browser, prints
from calibre.utils.config import OptionParser
from calibre.ebooks.BeautifulSoup import BeautifulSoup
from calibre.ebooks.chardet import strip_encoding_declarations
OPENLIBRARY = 'http://covers.openlibrary.org/b/isbn/%s-L.jpg?default=false'
@ -110,6 +111,8 @@ def get_social_metadata(title, authors, publisher, isbn, username=None,
+isbn).read()
if not raw:
return mi
raw = raw.decode('utf-8', 'replace')
raw = strip_encoding_declarations(raw)
root = html.fromstring(raw)
h1 = root.xpath('//div[@class="headsummary"]/h1')
if h1 and not mi.title:

View File

@ -6,7 +6,6 @@ Support for reading the metadata from a LIT file.
import cStringIO, os
from calibre.ebooks.metadata import MetaInformation
from calibre.ebooks.metadata.opf2 import OPF
def get_metadata(stream):
@ -16,7 +15,7 @@ def get_metadata(stream):
src = litfile.get_metadata().encode('utf-8')
litfile = litfile._litfile
opf = OPF(cStringIO.StringIO(src), os.getcwd())
mi = MetaInformation(opf)
mi = opf.to_book_metadata()
covers = []
for item in opf.iterguide():
if 'cover' not in item.get('type', '').lower():

View File

@ -108,7 +108,8 @@ def _get_metadata(stream, stream_type, use_libprs_metadata,
base = metadata_from_filename(name, pat=pattern)
if force_read_metadata or is_recipe(name) or prefs['read_file_metadata']:
mi = get_file_type_metadata(stream, stream_type)
if base.title == os.path.splitext(name)[0] and base.authors is None:
if base.title == os.path.splitext(name)[0] and \
base.is_null('authors') and base.is_null('isbn'):
# Assume that there was no metadata in the file and that the user-set
# pattern to extract metadata from the filename did not match.
# The regex is meant to match the standard format filenames are written
@ -181,7 +182,7 @@ def metadata_from_filename(name, pat=None):
mi.isbn = si
except (IndexError, ValueError):
pass
if not mi.title:
if mi.is_null('title'):
mi.title = name
return mi
@ -194,7 +195,7 @@ def opf_metadata(opfpath):
try:
opf = OPF(f, os.path.dirname(opfpath))
if opf.application_id is not None:
mi = MetaInformation(opf)
mi = opf.to_book_metadata()
if hasattr(opf, 'cover') and opf.cover:
cpath = os.path.join(os.path.dirname(opfpath), opf.cover)
if os.access(cpath, os.R_OK):

View File

@ -404,14 +404,16 @@ class MetadataUpdater(object):
if self.cover_record is not None:
size = len(self.cover_record)
cover = rescale_image(data, size)
cover += '\0' * (size - len(cover))
self.cover_record[:] = cover
if len(cover) <= size:
cover += '\0' * (size - len(cover))
self.cover_record[:] = cover
if self.thumbnail_record is not None:
size = len(self.thumbnail_record)
thumbnail = rescale_image(data, size, dimen=MAX_THUMB_DIMEN)
thumbnail += '\0' * (size - len(thumbnail))
self.thumbnail_record[:] = thumbnail
return
if len(thumbnail) <= size:
thumbnail += '\0' * (size - len(thumbnail))
self.thumbnail_record[:] = thumbnail
return
def set_metadata(stream, mi):
mu = MetadataUpdater(stream)

View File

@ -7,7 +7,7 @@ __docformat__ = 'restructuredtext en'
lxml based OPF parser.
'''
import re, sys, unittest, functools, os, mimetypes, uuid, glob, cStringIO
import re, sys, unittest, functools, os, mimetypes, uuid, glob, cStringIO, json
from urllib import unquote
from urlparse import urlparse
@ -16,11 +16,13 @@ from lxml import etree
from calibre.ebooks.chardet import xml_to_unicode
from calibre.constants import __appname__, __version__, filesystem_encoding
from calibre.ebooks.metadata.toc import TOC
from calibre.ebooks.metadata import MetaInformation, string_to_authors
from calibre.ebooks.metadata import string_to_authors, MetaInformation
from calibre.ebooks.metadata.book.base import Metadata
from calibre.utils.date import parse_date, isoformat
from calibre.utils.localization import get_lang
from calibre import prints
class Resource(object):
class Resource(object): # {{{
'''
Represents a resource (usually a file on the filesystem or a URL pointing
to the web). Such resources are commonly referred to in OPF files.
@ -101,8 +103,9 @@ class Resource(object):
def __repr__(self):
return 'Resource(%s, %s)'%(repr(self.path), repr(self.href()))
# }}}
class ResourceCollection(object):
class ResourceCollection(object): # {{{
def __init__(self):
self._resources = []
@ -153,10 +156,9 @@ class ResourceCollection(object):
for res in self:
res.set_basedir(path)
# }}}
class ManifestItem(Resource):
class ManifestItem(Resource): # {{{
@staticmethod
def from_opf_manifest_item(item, basedir):
@ -194,8 +196,9 @@ class ManifestItem(Resource):
return self.media_type
raise IndexError('%d out of bounds.'%index)
# }}}
class Manifest(ResourceCollection):
class Manifest(ResourceCollection): # {{{
@staticmethod
def from_opf_manifest_element(items, dir):
@ -262,7 +265,9 @@ class Manifest(ResourceCollection):
if i.id == id:
return i.mime_type
class Spine(ResourceCollection):
# }}}
class Spine(ResourceCollection): # {{{
class Item(Resource):
@ -334,7 +339,9 @@ class Spine(ResourceCollection):
for i in self:
yield i.path
class Guide(ResourceCollection):
# }}}
class Guide(ResourceCollection): # {{{
class Reference(Resource):
@ -371,6 +378,7 @@ class Guide(ResourceCollection):
self[-1].type = type
self[-1].title = ''
# }}}
class MetadataField(object):
@ -412,7 +420,29 @@ class MetadataField(object):
elem = obj.create_metadata_element(self.name, is_dc=self.is_dc)
obj.set_text(elem, unicode(val))
class OPF(object):
def serialize_user_metadata(metadata_elem, all_user_metadata, tail='\n'+(' '*8)):
from calibre.utils.config import to_json
from calibre.ebooks.metadata.book.json_codec import object_to_unicode
for name, fm in all_user_metadata.items():
try:
fm = object_to_unicode(fm)
fm = json.dumps(fm, default=to_json, ensure_ascii=False)
except:
prints('Failed to write user metadata:', name)
import traceback
traceback.print_exc()
continue
meta = metadata_elem.makeelement('meta')
meta.set('name', 'calibre:user_metadata:'+name)
meta.set('content', fm)
meta.tail = tail
metadata_elem.append(meta)
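# For a hypothetical '#mytags' column, the serialized element's content
# attribute is the JSON-encoded field metadata dict:
#
#     from lxml import etree
#     metadata_elem = etree.fromstring('<metadata/>')
#     fm = {'name': u'My Tags', 'datatype': 'text', 'is_multiple': u'|',
#           '#value#': ['t1', 't2']}
#     serialize_user_metadata(metadata_elem, {'#mytags': fm})
#     # -> <meta name="calibre:user_metadata:#mytags"
#     #         content='{"name": "My Tags", "datatype": "text", ...}'/>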
class OPF(object): # {{{
MIMETYPE = 'application/oebps-package+xml'
PARSER = etree.XMLParser(recover=True)
NAMESPACES = {
@ -497,6 +527,43 @@ class OPF(object):
self.guide = Guide.from_opf_guide(guide, basedir) if guide else None
self.cover_data = (None, None)
self.find_toc()
self.read_user_metadata()
def read_user_metadata(self):
self._user_metadata_ = {}
temp = Metadata('x', ['x'])
from calibre.utils.config import from_json
elems = self.root.xpath('//*[name() = "meta" and starts-with(@name,'
'"calibre:user_metadata:") and @content]')
for elem in elems:
name = elem.get('name')
name = ':'.join(name.split(':')[2:])
if not name or not name.startswith('#'):
continue
fm = elem.get('content')
try:
fm = json.loads(fm, object_hook=from_json)
temp.set_user_metadata(name, fm)
except:
prints('Failed to read user metadata:', name)
import traceback
traceback.print_exc()
continue
self._user_metadata_ = temp.get_all_user_metadata(True)
def to_book_metadata(self):
ans = MetaInformation(self)
for n, v in self._user_metadata_.items():
ans.set_user_metadata(n, v)
return ans
def write_user_metadata(self):
elems = self.root.xpath('//*[name() = "meta" and starts-with(@name,'
'"calibre:user_metadata:") and @content]')
for elem in elems:
elem.getparent().remove(elem)
serialize_user_metadata(self.metadata,
self._user_metadata_)
def find_toc(self):
self.toc = None
@ -911,6 +978,7 @@ class OPF(object):
return elem
def render(self, encoding='utf-8'):
self.write_user_metadata()
raw = etree.tostring(self.root, encoding=encoding, pretty_print=True)
if not raw.lstrip().startswith('<?xml '):
raw = '<?xml version="1.0" encoding="%s"?>\n'%encoding.upper()+raw
@ -924,18 +992,22 @@ class OPF(object):
val = getattr(mi, attr, None)
if val is not None and val != [] and val != (None, None):
setattr(self, attr, val)
temp = self.to_book_metadata()
temp.smart_update(mi, replace_metadata=replace_metadata)
self._user_metadata_ = temp.get_all_user_metadata(True)
# }}}
class OPFCreator(MetaInformation):
class OPFCreator(Metadata):
def __init__(self, base_path, *args, **kwargs):
def __init__(self, base_path, other):
'''
Initialize.
@param base_path: An absolute path to the directory in which this OPF file
will eventually be. This is used by the L{create_manifest} method
to convert paths to files into relative paths.
'''
MetaInformation.__init__(self, *args, **kwargs)
Metadata.__init__(self, title='', other=other)
self.base_path = os.path.abspath(base_path)
if self.application_id is None:
self.application_id = str(uuid.uuid4())
@ -1115,6 +1187,8 @@ class OPFCreator(MetaInformation):
item.set('title', ref.title)
guide.append(item)
serialize_user_metadata(metadata, self.get_all_user_metadata(False))
root = E.package(
metadata,
manifest,
@ -1156,7 +1230,7 @@ def metadata_to_opf(mi, as_string=True):
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:identifier opf:scheme="%(a)s" id="%(a)s_id">%(id)s</dc:identifier>
<dc:identifier opf:scheme="uuid" id="uuid_id">%(uuid)s</dc:identifier>
</metadata>
</metadata>
<guide/>
</package>
'''%dict(a=__appname__, id=mi.application_id, uuid=mi.uuid)))
@ -1188,7 +1262,7 @@ def metadata_to_opf(mi, as_string=True):
factory(DC('contributor'), mi.book_producer, __appname__, 'bkp')
if hasattr(mi.pubdate, 'isoformat'):
factory(DC('date'), isoformat(mi.pubdate))
if mi.category:
if hasattr(mi, 'category') and mi.category:
factory(DC('type'), mi.category)
if mi.comments:
factory(DC('description'), mi.comments)
@ -1217,6 +1291,8 @@ def metadata_to_opf(mi, as_string=True):
if mi.title_sort:
meta('title_sort', mi.title_sort)
serialize_user_metadata(metadata, mi.get_all_user_metadata(False))
metadata[-1].tail = '\n' +(' '*4)
if mi.cover:
@ -1334,5 +1410,30 @@ def suite():
def test():
unittest.TextTestRunner(verbosity=2).run(suite())
def test_user_metadata():
from cStringIO import StringIO
mi = Metadata('Test title', ['test author1', 'test author2'])
um = {
'#myseries': { '#value#': u'test series\xe4', 'datatype':'text',
'is_multiple': None, 'name': u'My Series'},
'#myseries_index': { '#value#': 2.45, 'datatype': 'float',
'is_multiple': None},
'#mytags': {'#value#':['t1','t2','t3'], 'datatype':'text',
'is_multiple': '|', 'name': u'My Tags'}
}
mi.set_all_user_metadata(um)
raw = metadata_to_opf(mi)
opfc = OPFCreator(os.getcwd(), other=mi)
out = StringIO()
opfc.render(out)
raw2 = out.getvalue()
f = StringIO(raw)
opf = OPF(f)
f2 = StringIO(raw2)
opf2 = OPF(f2)
assert um == opf._user_metadata_
assert um == opf2._user_metadata_
print opf.render()
if __name__ == '__main__':
test()
test_user_metadata()

View File

@ -125,7 +125,7 @@ def create_metadata(stream, options):
au = u', '.join(au)
author = au.encode('ascii', 'ignore')
md += r'{\author %s}'%(author,)
if options.category:
if options.get('category', None):
category = options.category.encode('ascii', 'ignore')
md += r'{\category %s}'%(category,)
comp = options.comment if hasattr(options, 'comment') else options.comments
@ -180,7 +180,7 @@ def set_metadata(stream, options):
src = pat.sub(r'{\\author ' + author + r'}', src)
else:
src = add_metadata_item(src, 'author', author)
category = options.category
category = options.get('category', None)
if category != None:
category = category.encode('ascii', 'replace')
pat = re.compile(base_pat.replace('name', 'category'), re.DOTALL)

View File

@ -184,13 +184,14 @@ class MobiMLizer(object):
elif tag in NESTABLE_TAGS and istate.rendered:
para = wrapper = bstate.nested[-1]
elif left > 0 and indent >= 0:
ems = self.profile.mobi_ems_per_blockquote
para = wrapper = etree.SubElement(parent, XHTML('blockquote'))
para = wrapper
emleft = int(round(left / self.profile.fbase)) - 1
emleft = int(round(left / self.profile.fbase)) - ems
emleft = min((emleft, 10))
while emleft > 0:
while emleft > ems/2.0:
para = etree.SubElement(para, XHTML('blockquote'))
emleft -= 1
emleft -= ems
else:
para = wrapper = etree.SubElement(parent, XHTML('p'))
bstate.inline = bstate.para = para

View File

@ -234,7 +234,7 @@ class MobiReader(object):
self.debug = debug
self.embedded_mi = None
self.base_css_rules = textwrap.dedent('''
blockquote { margin: 0em 0em 0em 1.25em; text-align: justify }
blockquote { margin: 0em 0em 0em 2em; text-align: justify }
p { margin: 0em; text-align: justify }
@ -441,7 +441,7 @@ class MobiReader(object):
html.tostring(elem, encoding='utf-8') + '</package>'
stream = cStringIO.StringIO(raw)
opf = OPF(stream)
self.embedded_mi = MetaInformation(opf)
self.embedded_mi = opf.to_book_metadata()
if guide is not None:
for ref in guide.xpath('descendant::reference'):
if 'cover' in ref.get('type', '').lower():

View File

@ -15,7 +15,6 @@ from struct import pack
import time
from urlparse import urldefrag
from PIL import Image
from cStringIO import StringIO
from calibre.ebooks.mobi.langcodes import iana2mobi
from calibre.ebooks.mobi.mobiml import MBP_NS
@ -28,6 +27,7 @@ from calibre.ebooks.oeb.base import namespace
from calibre.ebooks.oeb.base import prefixname
from calibre.ebooks.oeb.base import urlnormalize
from calibre.ebooks.compression.palmdoc import compress_doc
from calibre.utils.magick.draw import Image, save_cover_data_to, thumbnail
INDEXING = True
FCIS_FLIS = True
@ -111,46 +111,35 @@ def align_block(raw, multiple=4, pad='\0'):
return raw + pad*(multiple - extra)
def rescale_image(data, maxsizeb, dimen=None):
image = Image.open(StringIO(data))
format = image.format
changed = False
if image.format not in ('JPEG', 'GIF'):
width, height = image.size
area = width * height
if area <= 40000:
format = 'GIF'
else:
image = image.convert('RGBA')
format = 'JPEG'
changed = True
if dimen is not None:
image.thumbnail(dimen, Image.ANTIALIAS)
changed = True
if changed:
data = StringIO()
image.save(data, format)
data = data.getvalue()
data = thumbnail(data, width=dimen, height=dimen)[-1]
else:
# Replace transparent pixels with white pixels and convert to JPEG
data = save_cover_data_to(data, 'img.jpg', return_data=True)
if len(data) <= maxsizeb:
return data
image = image.convert('RGBA')
for quality in xrange(95, -1, -1):
data = StringIO()
image.save(data, 'JPEG', quality=quality)
data = data.getvalue()
if len(data) <= maxsizeb:
return data
width, height = image.size
for scale in xrange(99, 0, -1):
scale = scale / 100.
data = StringIO()
scaled = image.copy()
size = (int(width * scale), (height * scale))
scaled.thumbnail(size, Image.ANTIALIAS)
scaled.save(data, 'JPEG', quality=0)
data = data.getvalue()
if len(data) <= maxsizeb:
return data
# Well, we tried?
orig_data = data
img = Image()
quality = 95
img.load(data)
while len(data) >= maxsizeb and quality >= 10:
quality -= 5
img.set_compression_quality(quality)
data = img.export('jpg')
if len(data) <= maxsizeb:
return data
orig_data = data
scale = 0.9
while len(data) >= maxsizeb and scale >= 0.05:
img = Image()
img.load(orig_data)
w, h = img.size
img.size = (int(scale*w), int(scale*h))
img.set_compression_quality(quality)
data = img.export('jpg')
scale -= 0.05
return data
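# A sketch of how the new strategy behaves: JPEG quality is lowered first,
# then the dimensions are scaled down. The path and size cap below are
# hypothetical:
#
#     raw = open('cover.jpg', 'rb').read()
#     fitted = rescale_image(raw, maxsizeb=64 * 1024)
#     # best effort only: if even 5% scale at low quality is still too large,
#     # the last attempt is returned, so callers should re-check len(fitted)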
class Serializer(object):
@ -1796,12 +1785,13 @@ class MobiWriter(object):
self._oeb.log.debug('Index records dumped to', t)
def _clean_text_value(self, text):
if not text:
text = u'(none)'
text = text.strip()
if not isinstance(text, unicode):
text = text.decode('utf-8', 'replace')
text = text.encode('ascii','replace')
if text is not None and text.strip():
text = text.strip()
if not isinstance(text, unicode):
text = text.decode('utf-8', 'replace')
text = text.encode('utf-8')
else :
text = "(none)".encode('utf-8')
return text
def _add_to_ctoc(self, ctoc_str, record_offset):

View File

@ -654,8 +654,6 @@ class Metadata(object):
if predicate(x):
l.remove(x)
def __getitem__(self, key):
return self.items[key]

View File

@ -126,24 +126,29 @@ class OEBReader(object):
def _metadata_from_opf(self, opf):
from calibre.ebooks.metadata.opf2 import OPF
from calibre.ebooks.metadata import MetaInformation
from calibre.ebooks.oeb.transforms.metadata import meta_info_to_oeb_metadata
stream = cStringIO.StringIO(etree.tostring(opf))
mi = MetaInformation(OPF(stream))
mi = OPF(stream).to_book_metadata()
if not mi.language:
mi.language = get_lang().replace('_', '-')
self.oeb.metadata.add('language', mi.language)
if not mi.title:
mi.title = self.oeb.translate(__('Unknown'))
if not mi.authors:
mi.authors = [self.oeb.translate(__('Unknown'))]
if not mi.book_producer:
mi.book_producer = '%(a)s (%(v)s) [http://%(a)s.kovidgoyal.net]'%\
mi.book_producer = '%(a)s (%(v)s) [http://%(a)s-ebook.com]'%\
dict(a=__appname__, v=__version__)
meta_info_to_oeb_metadata(mi, self.oeb.metadata, self.logger)
self.oeb.metadata.add('identifier', str(uuid.uuid4()), id='uuid_id',
scheme='uuid')
m = self.oeb.metadata
m.add('identifier', str(uuid.uuid4()), id='uuid_id', scheme='uuid')
self.oeb.uid = self.oeb.metadata.identifier[-1]
if not m.title:
m.add('title', self.oeb.translate(__('Unknown')))
has_aut = False
for x in m.creator:
if getattr(x, 'role', '').lower() in ('', 'aut'):
has_aut = True
break
if not has_aut:
m.add('creator', self.oeb.translate(__('Unknown')), role='aut')
def _manifest_prune_invalid(self):
'''

Some files were not shown because too many files have changed in this diff.