Commit Graph

53530 Commits

Author SHA1 Message Date
Kovid Goyal 1ca7e3dd2d ... v9.9.0 2026-05-28 08:19:44 +05:30
Kovid Goyal 3b21cf80d2 version 9.9.0 2026-05-28 08:17:51 +05:30
Kovid Goyal 56e106cdf2 pep8 2026-05-27 22:38:02 +05:30
Kovid Goyal 91b1383331 Merge branch 'dependabot/github_actions/actions-bcb0c4251a' of https://github.com/kovidgoyal/calibre 2026-05-26 08:12:07 +05:30
dependabot[bot] dc8f141a22 build(deps): bump github/codeql-action in the actions group
Bumps the actions group with 1 update: [github/codeql-action](https://github.com/github/codeql-action).


Updates `github/codeql-action` from 4.35.4 to 4.35.5
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v4.35.4...v4.35.5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-25 22:10:08 +00:00
Kovid Goyal 5d6e23a02e Merge branch 'fix/archive-metadata-cover-paths' of https://github.com/M-Hassan-Raza/calibre 2026-05-24 09:01:00 +05:30
Kovid Goyal 754dc1bafa Merge branch 'fix/extra-file-glob-containment' of https://github.com/M-Hassan-Raza/calibre 2026-05-24 08:52:25 +05:30
Hassan Raza 812875083a Accept absolute archive-local cover paths 2026-05-23 23:22:02 +05:00
Hassan Raza 0319ed09f1 Test archive input path contracts 2026-05-23 23:18:04 +05:00
Hassan Raza 2eec10786e Honor TXTZ text file extensions 2026-05-23 23:17:21 +05:00
Hassan Raza 2811831fa8 Confine archive cover metadata paths 2026-05-23 23:16:42 +05:00
Hassan Raza c40aadc3ee Test confined extra file glob listing 2026-05-23 22:19:48 +05:00
Hassan Raza 956dd48191 Confine extra file glob patterns 2026-05-23 22:18:44 +05:00
Kovid Goyal 60906f300f Merge branch 'fix/rooted-path-containment' of https://github.com/M-Hassan-Raza/calibre 2026-05-23 22:27:31 +05:30
Hassan Raza fb2453a4cf Preserve path containment filesystem semantics 2026-05-23 20:02:21 +05:00
Hassan Raza 191acdf395 Confine extra file paths to book directories 2026-05-23 20:02:00 +05:00
Hassan Raza c2da43a91a Use rooted paths for served local files 2026-05-23 20:02:00 +05:00
Hassan Raza e1cdb70dc2 Add rooted path containment helper 2026-05-23 20:02:00 +05:00
Kovid Goyal 387f1d05fa Merge branch 'count-pages-fixed-layout' of https://github.com/un-pogaz/calibre 2026-05-23 07:02:39 +05:30
Kovid Goyal 7d2f1597ea Merge branch 'fix/http-connection-header-tokens' of https://github.com/M-Hassan-Raza/calibre 2026-05-23 06:52:37 +05:30
un-pogaz 332ccea5c8 pages count: support fixed-layout 2026-05-22 20:48:01 +02:00
Hassan Raza 289b77463a Parse Connection header tokens 2026-05-22 22:58:10 +05:00
Kovid Goyal cb2b1d195f Merge branch 'fix/http-content-length-framing' of https://github.com/M-Hassan-Raza/calibre 2026-05-22 22:58:51 +05:30
Hassan Raza 74d8ab0c1b Reject invalid HTTP Content-Length framing 2026-05-22 22:21:25 +05:00
Kovid Goyal f31ee236ce Merge branch 'fix/copy-to-library-move-duplicate' of https://github.com/M-Hassan-Raza/calibre 2026-05-22 22:33:02 +05:30
Hassan Raza 648343f888 Fix ignored duplicate moves in content server 2026-05-22 12:59:03 +05:00
Kovid Goyal a5dd4a47cd Merge branch 'fix-content-server-restrictions' of https://github.com/M-Hassan-Raza/calibre 2026-05-22 07:40:28 +05:30
Hassan Raza 5a15ed3d5a Respect content server book restrictions 2026-05-21 22:21:33 +05:00
Kovid Goyal 66501a6ae7 Merge branch 'master' of https://github.com/unkn0w7n/calibre 2026-05-21 14:51:46 +05:30
unkn0w7n b8b56c3607 Update indian_express.recipe 2026-05-21 14:47:34 +05:30
unkn0w7n b7b876196e Update business_standard_print.recipe 2026-05-21 14:46:58 +05:30
Kovid Goyal 92e0132d62 Bump dependency for CVE 2026-05-20 20:38:48 +05:30
Kovid Goyal 9cd210dad0 pep8 2026-05-19 07:29:13 +05:30
Kovid Goyal 6dbc00d054 Merge branch 'ap-filter-by-publish-date' of https://github.com/claybdavis/calibre 2026-05-19 07:28:41 +05:30
Kovid Goyal 80f379147d Merge branch 'newcriterion-wp-migration' of https://github.com/claybdavis/calibre 2026-05-19 07:27:51 +05:30
Kovid Goyal d8490c2208 Merge branch 'bbc-sport-headline-block-fix' of https://github.com/claybdavis/calibre 2026-05-19 07:26:53 +05:30
Kovid Goyal 19d2488ca5 Merge branch 'bbc-drop-dead-feeds' of https://github.com/claybdavis/calibre 2026-05-19 07:25:58 +05:30
Kovid Goyal 806a3a5bfc Merge branch 'dependabot/github_actions/actions-8abaa2cbc6' of https://github.com/kovidgoyal/calibre 2026-05-19 07:24:30 +05:30
dependabot[bot] b450e0e2ca Bump github/codeql-action from 4.35.3 to 4.35.4 in the actions group
Bumps the actions group with 1 update: [github/codeql-action](https://github.com/github/codeql-action).


Updates `github/codeql-action` from 4.35.3 to 4.35.4
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v4.35.3...v4.35.4)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-19 01:37:27 +00:00
claybdavis 821f82b730 bbc: drop three dead/stale feed entries
Audit (2026-05-18) of the BBC News feed list found three entries that
no longer produce content:

* Special Reports (https://feeds.bbci.co.uk/news/special_reports/rss.xml)
  HTTP/2 404. Wayback's last successful capture is 2024-07-23, so the
  URL has been dead for roughly two years.
* Also in the News (https://feeds.bbci.co.uk/news/also_in_the_news/rss.xml)
  HTTP/2 404. Wayback has no successful captures of this URL at all.
* Magazine (https://feeds.bbci.co.uk/news/magazine/rss.xml)
  301-redirects to /news/stories/rss.xml which still returns 200 OK,
  but the content there has been stale since December 2022. The
  endpoint is alive; the section is abandoned.

Because the recipes set remove_empty_feeds=True these three have been
silently swallowed on every fetch, costing four wasted HTTP calls per
run (the Magazine redirect doubles up). Dropping them cleans the
active feed list without changing what readers actually receive.

bbc.recipe had all three active entries; bbc_fast.recipe only carried
Magazine. Both files patched accordingly.

The ~50 commented-out legacy feed URLs in the same block are NOT
touched here -- that is a separate cleanup.
2026-05-18 16:10:22 -05:00
claybdavis 507feb15fa bbc: emit <h1> for Sport articles (handle new headline block-type)
Sport articles fetched via recipes/bbc.recipe and recipes/bbc_fast.recipe
were shipping with no <h1>, because BBC restructured the
window.__INITIAL_DATA__ JSON. article['headline'] is now None for Sport,
and the headline lives either in a new 'headline' block-type or — for
the 'high-impact' layout — nested under a 'topper' block's
model.heading.blocks list. The previous parse_article_json loop only
branched on 'image' and 'text', so neither variant produced anything.

Fix: prefer the plain-text article['metadata']['seoHeadline'] when the
legacy article['headline'] field is empty, and as a defensive fallback
extract the headline from a 'headline' or 'topper' block via a small
extract_text_block_plaintext helper. Verified against live Sport URLs
covering both block-type variants; legacy News articles that still
populate article['headline'] are unaffected.

bbc_fast.recipe carries an identical copy of parse_article_json, so the
same patch is applied to both files.
2026-05-18 16:00:45 -05:00
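The fallback order described in 507feb15fa can be sketched roughly as follows. This is an illustration only, not the code from the recipe: the function name `pick_headline`, the `blocks` argument, and the exact shape of each block (`type`, `model`, `text`, `blocks` keys) are assumptions made for the example, and the body of `extract_text_block_plaintext` is a guess at what such a helper might do.

```python
def extract_text_block_plaintext(block):
    """Return the plain text under a block (hypothetical block shape).

    Assumes a block looks like {'type': ..., 'model': {...}} where the model
    holds either a 'text' string or a nested 'blocks' list.
    """
    model = (block or {}).get('model') or {}
    text = model.get('text')
    if isinstance(text, str):
        return text
    return ''.join(extract_text_block_plaintext(b) for b in model.get('blocks') or [])


def pick_headline(article, blocks):
    """Choose a headline using the fallback order from the commit message."""
    # 1. Legacy field, still populated for News articles.
    if article.get('headline'):
        return article['headline']
    # 2. Plain-text SEO headline from the article metadata.
    seo = (article.get('metadata') or {}).get('seoHeadline')
    if seo:
        return seo
    # 3. Defensive fallback: a 'headline' block, or a 'topper' block whose
    #    text is nested under model.heading.blocks.
    for block in blocks:
        block_type = block.get('type')
        if block_type == 'headline':
            text = extract_text_block_plaintext(block)
        elif block_type == 'topper':
            heading = (block.get('model') or {}).get('heading') or {}
            text = extract_text_block_plaintext({'model': heading})
        else:
            continue
        if text.strip():
            return text.strip()
    return None
```

Returning `None` when nothing matches leaves it to the caller to decide whether to emit an `<h1>` at all.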
claybdavis a8797f05f1 newcriterion: update parse_index and login for WordPress migration
newcriterion.com moved from October CMS to WordPress. The old recipe
looked for <div id="main"> and an /issues/YYYY/M/ URL pattern, so
parse_index crashed with AttributeError: 'NoneType' object has no
attribute 'findAll' against the new layout.

Rewrites parse_index for the new markup: issue URLs of the form
/issues/<month>-<year>/, a <div class="issue-layout"> container, and
<article class="article-display"> blocks with <h2><a> for title+URL
and <p class="post-excerpt"> for the dek.

Also ports get_browser from the old October-CMS XHR signin endpoint to
standard wp-login.php form submission, and drops the now-unused
urlencode, mechanize.Request, and re imports.
2026-05-18 14:50:10 -05:00
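The markup that a8797f05f1 describes (`<article class="article-display">` blocks with an `<h2><a>` title link and a `<p class="post-excerpt">` dek) could be walked along these lines. This is a minimal sketch using only the standard library's `html.parser`; it is not the recipe's implementation, the class name `IssueParser` and the output dictionary keys are assumptions, and for brevity it does not check that the articles sit inside the `issue-layout` container.

```python
from html.parser import HTMLParser


class IssueParser(HTMLParser):
    """Collect title, URL and excerpt from article-display blocks (sketch)."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self.current = None   # article being assembled, or None
        self.in_h2 = False
        self.target = None    # field that character data is appended to

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get('class') or '').split()
        if tag == 'article' and 'article-display' in classes:
            self.current = {'title': '', 'url': '', 'description': ''}
        elif self.current is None:
            return
        elif tag == 'h2':
            self.in_h2 = True
        elif tag == 'a' and self.in_h2 and not self.current['url']:
            self.current['url'] = attr.get('href') or ''
            self.target = 'title'
        elif tag == 'p' and 'post-excerpt' in classes:
            self.target = 'description'

    def handle_endtag(self, tag):
        if self.current is None:
            return
        if tag == 'h2':
            self.in_h2 = False
        elif tag == 'a' and self.target == 'title':
            self.target = None
        elif tag == 'p' and self.target == 'description':
            self.target = None
        elif tag == 'article':
            # Keep only blocks that actually yielded a link.
            if self.current['url']:
                self.articles.append({k: v.strip() for k, v in self.current.items()})
            self.current = None

    def handle_data(self, data):
        if self.current is not None and self.target:
            self.current[self.target] += data


def parse_issue(html):
    parser = IssueParser()
    parser.feed(html)
    return parser.articles
```

Blocks without a link are dropped rather than raising, which avoids the kind of `AttributeError` on a missing element that the commit message reports for the old layout.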
claybdavis bdf0679ecf ap: filter articles by article:published_time meta tag
AP has no RSS feeds and parse_index had no date logic, so the framework's
oldest_article knob was a no-op and cross-day duplicates accumulated
indefinitely.

This fix fetches each candidate article during indexing and reads
<meta property="article:published_time"> to populate both the article's
timestamp (which oldest_article actually filters on) and a formatted date
string for the TOC. Cached per-URL across the front-page walk so duplicate
links are fetched once. Articles whose published_time can't be read are
skipped with a warning rather than kept dateless.

Sets oldest_article = 1 (AP publishes constantly). The trade-off is roughly
30-60s extra wall time and a doubling of HTTP volume per run, paid for by
dropping the ~24 stale articles per consecutive-day fetch.

Same per-URL-fetch idiom as #3132 (latimes og:description).
2026-05-18 13:27:20 -05:00
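The date filtering described in bdf0679ecf can be illustrated with a small standalone sketch. It is not the recipe's code: `read_published_time`, `filter_recent`, and the injected `fetch` callable are hypothetical names, the timestamp is assumed to be ISO 8601, and listing a repeated link only once is a choice made for the example.

```python
from datetime import datetime, timedelta, timezone
from html.parser import HTMLParser


class PublishedTimeParser(HTMLParser):
    """Capture the content of <meta property="article:published_time">."""

    def __init__(self):
        super().__init__()
        self.published = None

    def handle_starttag(self, tag, attrs):
        if tag != 'meta' or self.published is not None:
            return
        attr = dict(attrs)
        if attr.get('property') == 'article:published_time':
            self.published = attr.get('content')


def read_published_time(html):
    """Return an aware datetime, or None if the tag is missing or unparsable."""
    parser = PublishedTimeParser()
    parser.feed(html)
    if not parser.published:
        return None
    try:
        # Assumes an ISO 8601 value; a trailing 'Z' is normalised for older Pythons.
        stamp = datetime.fromisoformat(parser.published.replace('Z', '+00:00'))
    except ValueError:
        return None
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)
    return stamp


def filter_recent(urls, fetch, now, oldest_article=1.0):
    """Keep (url, published) pairs no older than oldest_article days."""
    cache = {}
    kept = []
    cutoff = now - timedelta(days=oldest_article)
    for url in urls:
        if url in cache:
            continue  # duplicate link: already fetched and handled
        cache[url] = read_published_time(fetch(url))
        published = cache[url]
        if published is None:
            continue  # skipped rather than kept dateless
        if published >= cutoff:
            kept.append((url, published))
    return kept
```

Passing `fetch` and `now` in as arguments keeps the sketch free of network access and makes the cut-off easy to check by hand.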
Kovid Goyal 8544091c82 Content server: Apply null metadata when serving book files. Matches behavior of save to disk. Fixes #2152879 [tags not deleted in ePub by content server](https://bugs.launchpad.net/calibre/+bug/2152879) 2026-05-18 14:27:32 +05:30
Kovid Goyal 68c567b372 Bump dependency for CVE 2026-05-16 13:32:20 +05:30
Kovid Goyal 111abb9a43 Merge branch 'propublica-drop-newsroom-blurb' of https://github.com/claybdavis/calibre 2026-05-14 11:16:44 +05:30
Kovid Goyal 23b27a71be Merge branch 'latimes-fetch-og-description' of https://github.com/claybdavis/calibre 2026-05-14 11:15:54 +05:30
Kovid Goyal af5f132bdf Merge branch 'latimes-drop-follow-link' of https://github.com/claybdavis/calibre 2026-05-14 11:15:35 +05:30
Kovid Goyal 3c800802d5 Merge branch 'latimes-narrow-date-and-fix-images' of https://github.com/claybdavis/calibre 2026-05-14 11:14:17 +05:30
Kovid Goyal b07017934c Merge branch 'wapo-print-tag-subhead-and-caption' of https://github.com/claybdavis/calibre 2026-05-14 11:13:20 +05:30