Charles Haley
7e49b481e9
Bug 1932984: AttributeError on 'Add-subcategory to <main-category>'
2021-06-19 09:17:09 +01:00
Kovid Goyal
52a87af143
Bounds check access to byte_offsets
2021-06-19 13:34:29 +05:30
Kovid Goyal
d9c0da9ec3
...
2021-06-19 13:13:03 +05:30
Kovid Goyal
6e62ccab38
Forgot to test boolean operators in queries
2021-06-19 11:50:46 +05:30
Kovid Goyal
e0dad27caa
tests for fts query syntax
2021-06-19 11:47:52 +05:30
Kovid Goyal
310a1a7d2e
Add FTS tokenizer tests with Chinese
2021-06-19 10:54:34 +05:30
Kovid Goyal
ef78b19912
Also hold global lock when constructing a tokenizer and setting its current_ui_language
2021-06-18 21:40:14 +05:30
Kovid Goyal
d9b773bd19
Ensure tokenizer tests are run with a fixed UI language
2021-06-18 21:38:15 +05:30
Kovid Goyal
c86f439e64
...
2021-06-18 21:16:59 +05:30
Kovid Goyal
6ef1ec1656
Add currency and other symbols to allowed token characters
2021-06-18 21:04:31 +05:30
Kovid Goyal
2cf31be2ba
Use ICU Word BreakIterator for tokenization
2021-06-18 18:06:15 +05:30
Kovid Goyal
879262929e
Merge branch 'master' of https://github.com/MorganSeltzer000/calibre
E-book viewer: Fix scrolling backwards by screen-fulls not working
with very large page margins.
2021-06-18 07:50:31 +05:30
Kovid Goyal
febc066142
A function to ensure lang specific iterators
2021-06-18 07:43:10 +05:30
Morgan Seltzer
501d6d0cf2
Fixed Pageup Occasionally Failing
Before, pageup failed when the page margins were greater than half the
screen width, because previous_screen_location() went backward by
screen_inline, which did not account for the margins but worked most of
the time due to later rounding. Now this has been fixed.
Signed-off-by: Morgan Seltzer <MorganSeltzer000@gmail.com>
2021-06-17 12:42:18 -05:00
Kovid Goyal
87b85cac39
Start work on ICU word break iterator based tokenization
2021-06-17 15:56:12 +05:30
Kovid Goyal
0cb9637e8c
...
2021-06-17 14:38:00 +05:30
Kovid Goyal
d818bc17b8
...
2021-06-17 12:12:59 +05:30
Kovid Goyal
6302937c4f
Allow directly testing the tokenizer
2021-06-17 12:10:24 +05:30
Kovid Goyal
4127117e8a
Add a UI language based iterator
2021-06-17 09:53:02 +05:30
Kovid Goyal
06d34a2df9
Add a test for snippets
2021-06-17 08:31:16 +05:30
Kovid Goyal
53b8bed17a
Function to get available locales for break iteration
2021-06-17 07:25:15 +05:30
Kovid Goyal
f138d716a5
Merge branch 'python3.10' of https://github.com/swt2c/calibre
2021-06-17 06:16:25 +05:30
Scott Talbert
2e272a39d0
Fix building with Python 3.10
2021-06-16 14:19:40 -04:00
Kovid Goyal
6773b36a42
Forgot to add header to extension definition
2021-06-16 21:57:44 +05:30
Kovid Goyal
584eacdee4
E-book viewer: Fix font sizes specified in absolute units not being honored in locales where the decimal separator is not the period. Fixes #1932152 [The e-book viewer ignores font-size property when using some absolute lenght units](https://bugs.launchpad.net/calibre/+bug/1932152)
2021-06-16 21:55:51 +05:30
Kovid Goyal
12e9769b4b
Don't resize scratch unnecessarily
2021-06-16 21:40:17 +05:30
Kovid Goyal
22af8ab304
silence compiler warning
2021-06-16 21:38:32 +05:30
Kovid Goyal
9e77e2848e
...
2021-06-16 20:39:45 +05:30
Kovid Goyal
03b7feb507
Avoid ipython repeated exception when not available
2021-06-16 19:47:54 +05:30
Kovid Goyal
a37c14499c
Fix building of sqlite_extension on ancient Linux
2021-06-16 17:14:31 +05:30
Kovid Goyal
d8595e5bf5
Fix ICU build on Windows
2021-06-16 17:02:07 +05:30
Kovid Goyal
ae25a1f425
Also add test without diacritics removal
2021-06-16 16:16:03 +05:30
Kovid Goyal
bbee5b0acb
Implement diacritics removal in the new tokenizer
2021-06-16 14:54:15 +05:30
Kovid Goyal
ab313c836f
Implement the unicode61 tokenizer with ICU
Still have to implement removal of diacritics
2021-06-16 12:51:43 +05:30
Kovid Goyal
c9c1029d02
Merge branch 'hindu-patch' of https://github.com/shivaprsd/calibre
2021-06-15 17:45:04 +05:30
Shiva Prasad
b0d4c388d6
Recipe: make Hindu better resemble print edition
* Remove redundant date and timestamps cluttering every article
* Place intro immediately beneath the heading, as in print edition
|- duplicate intro is now removed using CSS
* Visual styling
2021-06-15 17:24:59 +05:30
Kovid Goyal
adf810cae6
Parse tokenizer options
2021-06-15 13:12:24 +05:30
Kovid Goyal
79ea88ddb8
Basic test for tokenizer
2021-06-15 13:07:18 +05:30
Kovid Goyal
c819fcb870
A simple ASCII tokenizer to start with
2021-06-15 11:44:31 +05:30
Kovid Goyal
268d1d991c
Update People Daily
2021-06-15 08:13:16 +05:30
Kovid Goyal
ec3744c661
Merge branch 'patch-1' of https://github.com/justinuang/calibre
2021-06-15 08:09:00 +05:30
Justin Uang
5c5ac6e2a5
Fix haodoo reader to properly pick encoding in python3
It was broken before because the comparison on line 94 would compare header.ident (a unicode string) to BPDB_IDENT (a byte string), and the two never compare equal in Python 3.
2021-06-14 21:21:33 -04:00
Kovid Goyal
0bf9cb12fe
Write boilerplate for tokenizer to connect it to sqlite
2021-06-14 16:23:08 +05:30
Kovid Goyal
395a052ff8
Bump sqlite version
2021-06-14 10:42:35 +05:30
Kovid Goyal
6d845cfa37
Generate db of compile commands for use by tools when building
2021-06-14 09:02:55 +05:30
Kovid Goyal
0b38d385e2
Don't use designated initializers
2021-06-14 08:47:08 +05:30
Kovid Goyal
a7da47b922
Possible fix for building sqlite_extension.cpp on windows
2021-06-14 08:34:50 +05:30
Kovid Goyal
e4b13d4ccb
Start work on ICU tokenizer for FTS
2021-06-14 08:23:10 +05:30
Kovid Goyal
4df402cacf
Merge branch 'hindu-cover' of https://github.com/shivaprsd/calibre
2021-06-13 21:05:56 +05:30
Shiva Prasad
5df3b224a6
Recipe: add cover_url to The Hindu
Gets today's front page image displayed in Hindu ePaper website.
Also add some styles to the lead image and caption.
2021-06-13 20:56:18 +05:30