43481 Commits

Author SHA1 Message Date
Charles Haley
7e49b481e9
Bug 1932984: AttributeError on 'Add-subcategory to <main-category>' 2021-06-19 09:17:09 +01:00
Kovid Goyal
52a87af143
Bounds check access to byte_offsets 2021-06-19 13:34:29 +05:30
Kovid Goyal
d9c0da9ec3
... 2021-06-19 13:13:03 +05:30
Kovid Goyal
6e62ccab38
Forgot to test boolean operators in queries 2021-06-19 11:50:46 +05:30
Kovid Goyal
e0dad27caa
tests for fts query syntax 2021-06-19 11:47:52 +05:30
Kovid Goyal
310a1a7d2e
Add FTS tokenizer tests with Chinese 2021-06-19 10:54:34 +05:30
Kovid Goyal
ef78b19912
Also hold global lock when constructing a tokenizer and setting its current_ui_language 2021-06-18 21:40:14 +05:30
Kovid Goyal
d9b773bd19
Ensure tokenizer tests are run with a fixed UI language 2021-06-18 21:38:15 +05:30
Kovid Goyal
c86f439e64
... 2021-06-18 21:16:59 +05:30
Kovid Goyal
6ef1ec1656
Add currency and other symbols to allowed token characters 2021-06-18 21:04:31 +05:30
Kovid Goyal
2cf31be2ba
Use ICU Word BreakIterator for tokenization 2021-06-18 18:06:15 +05:30
Kovid Goyal
879262929e
Merge branch 'master' of https://github.com/MorganSeltzer000/calibre
E-book viewer: Fix scrolling backwards by screen-fulls not working
with very large page margins.
2021-06-18 07:50:31 +05:30
Kovid Goyal
febc066142
A function to ensure lang specific iterators 2021-06-18 07:43:10 +05:30
Morgan Seltzer
501d6d0cf2
Fixed Pageup Occasionally Failing
Before, pageup failed when the page margins were greater than half the
screen width, because previous_screen_location() went backward by
screen_inline, which did not account for the margins but worked most of
the time due to later rounding. Now this has been fixed.

Signed-off-by: Morgan Seltzer <MorganSeltzer000@gmail.com>
2021-06-17 12:42:18 -05:00
Kovid Goyal
87b85cac39
Start work on ICU word break iterator based tokenization 2021-06-17 15:56:12 +05:30
Kovid Goyal
0cb9637e8c
... 2021-06-17 14:38:00 +05:30
Kovid Goyal
d818bc17b8
... 2021-06-17 12:12:59 +05:30
Kovid Goyal
6302937c4f
Allow directly testing the tokenizer 2021-06-17 12:10:24 +05:30
Kovid Goyal
4127117e8a
Add a UI language based iterator 2021-06-17 09:53:02 +05:30
Kovid Goyal
06d34a2df9
Add a test for snippets 2021-06-17 08:31:16 +05:30
Kovid Goyal
53b8bed17a
Function to get available locales for break iteration 2021-06-17 07:25:15 +05:30
Kovid Goyal
f138d716a5
Merge branch 'python3.10' of https://github.com/swt2c/calibre 2021-06-17 06:16:25 +05:30
Scott Talbert
2e272a39d0
Fix building with Python 3.10 2021-06-16 14:19:40 -04:00
Kovid Goyal
6773b36a42
Forgot to add header to extension definition 2021-06-16 21:57:44 +05:30
Kovid Goyal
584eacdee4
E-book viewer: Fix font sizes specified in absolute units not being honored in locales where the decimal separator is not the period. Fixes #1932152 [The e-book viewer ignores font-size property when using some absolute lenght units](https://bugs.launchpad.net/calibre/+bug/1932152) 2021-06-16 21:55:51 +05:30
Kovid Goyal
12e9769b4b
Don't resize scratch unnecessarily 2021-06-16 21:40:17 +05:30
Kovid Goyal
22af8ab304
silence compiler warning 2021-06-16 21:38:32 +05:30
Kovid Goyal
9e77e2848e
... 2021-06-16 20:39:45 +05:30
Kovid Goyal
03b7feb507
Avoid repeated exception when IPython is not available 2021-06-16 19:47:54 +05:30
Kovid Goyal
a37c14499c
Fix building of sqlite_extension on ancient Linux 2021-06-16 17:14:31 +05:30
Kovid Goyal
d8595e5bf5
Fix ICU build on Windows 2021-06-16 17:02:07 +05:30
Kovid Goyal
ae25a1f425
Also add test without diacritics removal 2021-06-16 16:16:03 +05:30
Kovid Goyal
bbee5b0acb
Implement diacritics removal in the new tokenizer 2021-06-16 14:54:15 +05:30
Kovid Goyal
ab313c836f
Implement the unicode61 tokenizer with ICU
Still have to implement removal of diacritics
2021-06-16 12:51:43 +05:30
Kovid Goyal
c9c1029d02
Merge branch 'hindu-patch' of https://github.com/shivaprsd/calibre 2021-06-15 17:45:04 +05:30
Shiva Prasad
b0d4c388d6
Recipe: make Hindu better resemble print edition
* Remove redundant date and timestamps cluttering every article
* Place intro immediately beneath the heading, as in print edition
|- duplicate intro is now removed using CSS
* Visual styling
2021-06-15 17:24:59 +05:30
Kovid Goyal
adf810cae6
Parse tokenizer options 2021-06-15 13:12:24 +05:30
Kovid Goyal
79ea88ddb8
Basic test for tokenizer 2021-06-15 13:07:18 +05:30
Kovid Goyal
c819fcb870
A simple ASCII tokenizer to start with 2021-06-15 11:44:31 +05:30
Kovid Goyal
268d1d991c
Update People Daily 2021-06-15 08:13:16 +05:30
Kovid Goyal
ec3744c661
Merge branch 'patch-1' of https://github.com/justinuang/calibre 2021-06-15 08:09:00 +05:30
Justin Uang
5c5ac6e2a5
Fix haodoo reader to properly pick encoding in python3
It was broken before because the comparison on line 94 compared header.ident (a unicode string) to BPDB_IDENT (a byte string), and the two never compare equal in Python 3.
2021-06-14 21:21:33 -04:00
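The failure mode described in the commit above can be reproduced in a few lines. This is a minimal sketch, not the actual patch: `is_bpdb` is a hypothetical helper, and the value given to `BPDB_IDENT` here is an assumed placeholder rather than a value taken from the reader's source.

```python
# Hypothetical stand-in for the constant named in the commit message.
# The value is an assumed placeholder, used only for illustration.
BPDB_IDENT = b'BOOKMTIU'


def is_bpdb(ident):
    """Return True if ident matches BPDB_IDENT, accepting str or bytes."""
    if isinstance(ident, str):
        # Normalize text to bytes so both sides of the comparison share a type.
        ident = ident.encode('ascii')
    return ident == BPDB_IDENT


# In Python 3 a str and a bytes object never compare equal,
# so the naive check silently evaluates to False instead of raising.
naive = ('BOOKMTIU' == BPDB_IDENT)

# Normalizing the type first gives the intended result.
fixed = is_bpdb('BOOKMTIU')
```

Normalizing to one type at the comparison site is only one option; decoding the header field consistently when it is read would work as well.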
Kovid Goyal
0bf9cb12fe
Write boilerplate for tokenizer to connect it to sqlite 2021-06-14 16:23:08 +05:30
Kovid Goyal
395a052ff8
Bump sqlite version 2021-06-14 10:42:35 +05:30
Kovid Goyal
6d845cfa37
Generate db of compile commands for use by tools when building 2021-06-14 09:02:55 +05:30
Kovid Goyal
0b38d385e2
Don't use designated initializers 2021-06-14 08:47:08 +05:30
Kovid Goyal
a7da47b922
Possible fix for building sqlite_extension.cpp on windows 2021-06-14 08:34:50 +05:30
Kovid Goyal
e4b13d4ccb
Start work on ICU tokenizer for FTS 2021-06-14 08:23:10 +05:30
Kovid Goyal
4df402cacf
Merge branch 'hindu-cover' of https://github.com/shivaprsd/calibre 2021-06-13 21:05:56 +05:30
Shiva Prasad
5df3b224a6
Recipe: add cover_url to The Hindu
Gets today's front page image as displayed on the Hindu ePaper website.

Also add some styles to the lead image and caption.
2021-06-13 20:56:18 +05:30