Full text search: improve <ruby> parsing when indexing books

Following the spec of the <ruby> tag, it's better to ignore only the sub-tags
<rt>, <rp> and <rtc>, because the base text inside the <ruby> tag is what
we want to index.
un-pogaz 2024-03-09 08:01:28 +01:00
parent 25ad85a69c
commit 62916ee574


@@ -20,7 +20,7 @@ class SimpleContainer(ContainerBase):
     tweak_mode = True
-skipped_tags = frozenset({'style', 'title', 'script', 'head', 'img', 'svg', 'math', 'ruby'})
+skipped_tags = frozenset({'style', 'title', 'script', 'head', 'img', 'svg', 'math', 'rt', 'rp', 'rtc'})
 def tag_to_text(tag):
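
The effect of swapping 'ruby' for 'rt', 'rp' and 'rtc' can be illustrated with a minimal sketch. It is not the project's actual implementation: `extract_text` is a hypothetical helper, the sample markup is invented for illustration, and the standard-library `xml.etree.ElementTree` parser stands in for whatever parser the indexer really uses. Only the two tag sets are taken from the diff.

```python
import xml.etree.ElementTree as ET

# Tag sets copied from the diff (before and after the change).
OLD_SKIPPED = frozenset({'style', 'title', 'script', 'head', 'img', 'svg', 'math', 'ruby'})
NEW_SKIPPED = frozenset({'style', 'title', 'script', 'head', 'img', 'svg', 'math', 'rt', 'rp', 'rtc'})


def extract_text(elem, skipped_tags):
    """Hypothetical helper: collect indexable text, ignoring skipped tags."""
    parts = []
    if elem.tag not in skipped_tags:
        if elem.text:
            parts.append(elem.text)
        for child in elem:
            parts.append(extract_text(child, skipped_tags))
    # Text that follows an element belongs to its parent, so it is kept
    # even when the element itself is skipped.
    if elem.tail:
        parts.append(elem.tail)
    return ''.join(parts)


# Invented sample: base text 漢字 annotated with the reading "kanji".
markup = '<p>I read <ruby>漢字<rp>(</rp><rt>kanji</rt><rp>)</rp></ruby> daily.</p>'
root = ET.fromstring(markup)

old = extract_text(root, OLD_SKIPPED)  # the whole <ruby> element is dropped
new = extract_text(root, NEW_SKIPPED)  # only the annotation tags are dropped
```

With the old set the base text never reaches the index, so a search for 漢字 would miss this sentence. With the new set the base text is kept and only the reading and its fallback parentheses are discarded.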