String changes

Kovid Goyal 2021-11-23 21:12:52 +05:30
parent 2d66fa4e68
commit 1e12201376
7 changed files with 11 additions and 11 deletions

View File

@ -561,8 +561,8 @@ Now coming to author name sorting:
 * Authors in the Tag browser are sorted by the sort value for the **authors**. Remember that this is different from the Author sort field for a book.
 * By default, this sort algorithm assumes that the author name is in ``First name Last name`` format and generates a ``Last name, First name`` sort value.
 * You can change this algorithm by going to :guilabel:`Preferences->Advanced->Tweaks` and setting the :guilabel:`author_sort_copy_method` tweak.
-* You can force calibre to recalculate the author sort values for every author by right clicking on any author and selecting :guilabel:`Manage authors`, then pushing the `Recalculate all author sort values` button. Do this after you have set the author_sort_copy_method tweak to what you want.
-* You can force calibre to recalculate the author sort values for all books by using the bulk metadata edit dialog (select all books and click edit metadata, check the `Automatically set author sort` checkbox, then press OK.)
+* You can force calibre to recalculate the author sort values for every author by right clicking on any author and selecting :guilabel:`Manage authors`, then pushing the :guilabel:`Recalculate all author sort values` button. Do this after you have set the author_sort_copy_method tweak to what you want.
+* You can force calibre to recalculate the author sort values for all books by using the bulk metadata edit dialog (select all books and click edit metadata, check the :guilabel:`Automatically set author sort` checkbox, then press OK).
 * When recalculating the author sort values for books, calibre uses the author sort values for each individual author. Therefore, ensure that the individual author sort values are correct before recalculating the books' author sort values.
 * You can control whether the Tag browser display authors using their names or their sort values by setting the :guilabel:`categories_use_field_for_author_name` tweak in :guilabel:`Preferences->Advanced->Tweaks`

View File

@ -46,10 +46,10 @@ Next is the beginning of the really good stuff. Remember where I said that regul
 Hey, neat! This is starting to make sense!
 ---------------------------------------------
-I was hoping you'd say that. But brace yourself, now it gets even better! We just saw that using sets, we could match one of several characters at once. But you can even repeat a character or set, reducing the number of expressions needed to handle the above page number example to one. Yes, ONE! Excited? You should be! It works like this: Some so-called special characters, "+", "?" and "*", *repeat the single element preceding them*. (Element means either a single character, a character set, an escape sequence or a group (we'll learn about those last two later)- in short, any single entity in a regular expression.) These characters are called wildcards or quantifiers. To be more precise, "?" matches *0 or 1* of the preceding element, "*" matches *0 or more* of the preceding element and "+" matches *1 or more* of the preceding element. A few examples: The expression ``a?`` would match either "" (which is the empty string, not strictly useful in this case) or "a", the expression ``a*`` would match "", "a", "aa" or any number of a's in a row, and, finally, the expression ``a+`` would match "a", "aa" or any number of a's in a row (Note: it wouldn't match the empty string!). Same deal for sets: The expression ``[0-9]+`` would match *every integer number there is*! I know what you're thinking, and you're right: If you use that in the above case of matching page numbers, wouldn't that be the single one expression to match all the page numbers? Yes, the expression ``Page [0-9]+ of 423`` would match every page number in that book!
+I was hoping you'd say that. But brace yourself, now it gets even better! We just saw that using sets, we could match one of several characters at once. But you can even repeat a character or set, reducing the number of expressions needed to handle the above page number example to one. Yes, ONE! Excited? You should be! It works like this: Some so-called special characters, "+", "?" and "*", *repeat the single element preceding them*. (Element means either a single character, a character set, an escape sequence or a group (we'll learn about those last two later)- in short, any single entity in a regular expression). These characters are called wildcards or quantifiers. To be more precise, "?" matches *0 or 1* of the preceding element, "*" matches *0 or more* of the preceding element and "+" matches *1 or more* of the preceding element. A few examples: The expression ``a?`` would match either "" (which is the empty string, not strictly useful in this case) or "a", the expression ``a*`` would match "", "a", "aa" or any number of a's in a row, and, finally, the expression ``a+`` would match "a", "aa" or any number of a's in a row (Note: it wouldn't match the empty string!). Same deal for sets: The expression ``[0-9]+`` would match *every integer number there is*! I know what you're thinking, and you're right: If you use that in the above case of matching page numbers, wouldn't that be the single one expression to match all the page numbers? Yes, the expression ``Page [0-9]+ of 423`` would match every page number in that book!
 .. note::
-A note on these quantifiers: They generally try to match as much text as possible, so be careful when using them. This is called "greedy behaviour"- I'm sure you get why. It gets problematic when you, say, try to match a tag. Consider, for example, the string ``"<p class="calibre2">Title here</p>"`` and let's say you'd want to match the opening tag (the part between the first pair of angle brackets, a little more on tags later). You'd think that the expression ``<p.*>`` would match that tag, but actually, it matches the whole string! (The character "." is another special character. It matches anything *except* linebreaks, so, basically, the expression ``.*`` would match any single line you can think of.) Instead, try using ``<p.*?>`` which makes the quantifier ``"*"`` non-greedy. That expression would only match the first opening tag, as intended.
+A note on these quantifiers: They generally try to match as much text as possible, so be careful when using them. This is called "greedy behaviour"- I'm sure you get why. It gets problematic when you, say, try to match a tag. Consider, for example, the string ``"<p class="calibre2">Title here</p>"`` and let's say you'd want to match the opening tag (the part between the first pair of angle brackets, a little more on tags later). You'd think that the expression ``<p.*>`` would match that tag, but actually, it matches the whole string! (The character "." is another special character. It matches anything *except* linebreaks, so, basically, the expression ``.*`` would match any single line you can think of). Instead, try using ``<p.*?>`` which makes the quantifier ``"*"`` non-greedy. That expression would only match the first opening tag, as intended.
 There's actually another way to accomplish this: The expression ``<p[^>]*>`` will match that same opening tag- you'll see why after the next section. Just note that there quite frequently is more than one way to write a regular expression.
 Well, these special characters are very neat and all, but what if I wanted to match a dot or a question mark?
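
The quantifier and greediness behaviour described in the tutorial text of this hunk can be checked with Python's `re` module, which the tutorial itself points readers to. This is a minimal sketch: the sample page strings are made up for illustration, while the patterns and the HTML string are the ones quoted in the text.

```python
import re

# Quantifiers: "?" = 0 or 1, "*" = 0 or more, "+" = 1 or more of the preceding element.
page_re = re.compile(r"Page [0-9]+ of 423")

# Hypothetical sample lines; the last one has no digits, so "+" rejects it.
samples = ["Page 1 of 423", "Page 42 of 423", "Page 423 of 423", "Page  of 423"]
matched = [s for s in samples if page_re.fullmatch(s)]

html = '<p class="calibre2">Title here</p>'

greedy = re.search(r"<p.*>", html).group(0)      # runs on to the last ">" in the string
lazy = re.search(r"<p.*?>", html).group(0)       # stops at the first ">"
negated = re.search(r"<p[^>]*>", html).group(0)  # a set that cannot cross a ">"

print(matched)
print(greedy)
print(lazy)
print(negated)
```

Here `greedy` is the whole string, while `lazy` and `negated` are both just the opening tag.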
@ -113,7 +113,7 @@ Let's begin with the conversion settings, which is really neat. In the :guilabel
 <p class="calibre4"> It had only been two years since Addison v. Clark.
 The court case gave us a revised version of what life was
-(shamelessly ripped out of `this thread <https://www.mobileread.com/forums/showthread.php?t=75594">`_). You'd have to remove some of the tags as well. In this example, I'd recommend beginning with the tag ``<b class="calibre2">``, now you have to end with the corresponding closing tag (opening tags are ``<tag>``, closing tags are ``</tag>``), which is simply the next ``</b>`` in this case. (Refer to a good HTML manual or ask in the forum if you are unclear on this point.) The opening tag can be described using ``<b.*?>``, the closing tag using ``</b>``, thus we could remove everything between those tags using ``<b.*?>.*?</b>``. But using this expression would be a bad idea, because it removes everything enclosed by <b>- tags (which, by the way, render the enclosed text in bold print), and it's a fair bet that we'll remove portions of the book in this way. Instead, include the beginning of the enclosed string as well, making the regular expression ``<b.*?>\s*Generated\s+by\s+ABC\s+Amber\s+LIT.*?</b>`` The ``\s`` with quantifiers are included here instead of explicitly using the spaces as seen in the string to catch any variations of the string that might occur. Remember to check what calibre will remove to make sure you don't remove any portions you want to keep if you test a new expression. If you only check one occurrence, you might miss a mismatch somewhere else in the text. Also note that should you accidentally remove more or fewer tags than you actually wanted to, calibre tries to repair the damaged code after doing the removal.
+(shamelessly ripped out of `this thread <https://www.mobileread.com/forums/showthread.php?t=75594">`_). You'd have to remove some of the tags as well. In this example, I'd recommend beginning with the tag ``<b class="calibre2">``, now you have to end with the corresponding closing tag (opening tags are ``<tag>``, closing tags are ``</tag>``), which is simply the next ``</b>`` in this case. (Refer to a good HTML manual or ask in the forum if you are unclear on this point). The opening tag can be described using ``<b.*?>``, the closing tag using ``</b>``, thus we could remove everything between those tags using ``<b.*?>.*?</b>``. But using this expression would be a bad idea, because it removes everything enclosed by <b>- tags (which, by the way, render the enclosed text in bold print), and it's a fair bet that we'll remove portions of the book in this way. Instead, include the beginning of the enclosed string as well, making the regular expression ``<b.*?>\s*Generated\s+by\s+ABC\s+Amber\s+LIT.*?</b>`` The ``\s`` with quantifiers are included here instead of explicitly using the spaces as seen in the string to catch any variations of the string that might occur. Remember to check what calibre will remove to make sure you don't remove any portions you want to keep if you test a new expression. If you only check one occurrence, you might miss a mismatch somewhere else in the text. Also note that should you accidentally remove more or fewer tags than you actually wanted to, calibre tries to repair the damaged code after doing the removal.
 Adding books
 ^^^^^^^^^^^^^^^^
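
To see why the narrower expression in the paragraph changed by this hunk is the safer one, the two patterns can be compared with Python's `re.sub`. This is only a sketch: the `snippet` string is an invented stand-in for the book's markup, and calibre's own search and replace may differ in details such as flags.

```python
import re

# Invented sample markup: one converter credit plus one legitimate bold word.
snippet = (
    '<p class="calibre4"><b class="calibre2">Generated  by ABC Amber LIT Converter</b></p>'
    '<p class="calibre4">It had only been <b>two</b> years.</p>'
)

broad = r"<b.*?>.*?</b>"
targeted = r"<b.*?>\s*Generated\s+by\s+ABC\s+Amber\s+LIT.*?</b>"

removed_broad = re.sub(broad, "", snippet)        # strips every <b>...</b> pair
removed_targeted = re.sub(targeted, "", snippet)  # strips only the credit line

print(removed_broad)
print(removed_targeted)
```

The broad pattern also deletes the bold word that belongs to the book, which is exactly the risk the text warns about. The targeted one leaves it alone, and its `\s+` still matches the doubled space in the sample.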
@ -128,7 +128,7 @@ The last part is regular expression :guilabel:`Search and replace` in metadata f
 Well, that just about concludes the very short introduction to regular expressions. Hopefully I'll have shown you enough to at least get you started and to enable you to continue learning by yourself- a good starting point would be the `Python documentation for regexps <https://docs.python.org/library/re.html>`_.
-One last word of warning, though: Regexps are powerful, but also really easy to get wrong. calibre provides really great testing possibilities to see if your expressions behave as you expect them to. Use them. Try not to shoot yourself in the foot. (God, I love that expression...) But should you, despite the warning, injure your foot (or any other body parts), try to learn from it.
+One last word of warning, though: Regexps are powerful, but also really easy to get wrong. calibre provides really great testing possibilities to see if your expressions behave as you expect them to. Use them. Try not to shoot yourself in the foot. (God, I love that expression...). But should you, despite the warning, injure your foot (or any other body parts), try to learn from it.
 Quick reference

View File

@ -809,7 +809,7 @@ class Cache:
 will perform lossless compression, otherwise lossy compression.
 The progress callback will be called with the book_id and the old and new sizes
-for each book that has been processed. If an error occurs, the news size will
+for each book that has been processed. If an error occurs, the new size will
 be a string with the error details.
 '''
 jpeg_quality = max(10, min(jpeg_quality, 100))
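
The clamp on the last context line and the callback contract described in the docstring can be illustrated in isolation. This is a hedged sketch: `clamp_quality` and `report_progress` are hypothetical names, not part of calibre. Only the `max`/`min` expression, the callback arguments, and the rule that the new size is a string on error come from the hunk.

```python
def clamp_quality(jpeg_quality):
    # Same expression as in the hunk: keep the value within 10..100.
    return max(10, min(jpeg_quality, 100))


def report_progress(book_id, old_size, new_size):
    # Hypothetical callback: per the docstring, new_size is a string with
    # the error details when processing failed for that book.
    if isinstance(new_size, str):
        return f"book {book_id}: failed ({new_size})"
    return f"book {book_id}: {old_size} -> {new_size}"


print(clamp_quality(5), clamp_quality(80), clamp_quality(250))
print(report_progress(1, 2048, 1024))
print(report_progress(2, 2048, "cannot read cover"))
```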

View File

@ -383,7 +383,7 @@ class KINDLE2(KINDLE):
 ' Since APNX files are usually deleted when a book is removed from'
 ' the Kindle, this is mostly useful when resending a book to the'
 ' device which is already on the device (e.g. after making a'
-' modification.)'),
+' modification).'),
 ]
 EXTRA_CUSTOMIZATION_DEFAULT = [

View File

@ -63,7 +63,7 @@ class ComicInput(InputFormatPlugin):
 OptionRecommendation(name='dont_grayscale', recommended_value=False,
 help=_('Do not convert the image to grayscale (black and white)')),
 OptionRecommendation(name='comic_image_size', recommended_value=None,
-help=_('Specify the image size as widthxheight pixels. Normally,'
+help=_('Specify the image size as width x height pixels, for example: 123x321. Normally,'
 ' an image size is automatically calculated from the output '
 'profile, this option overrides it.')),
 OptionRecommendation(name='dont_add_comic_pages_to_toc', recommended_value=False,

View File

@ -26,7 +26,7 @@ class DOCXOutput(OutputFormatPlugin):
 'are %s') % PAGE_SIZES),
 OptionRecommendation(name='docx_custom_page_size', recommended_value=None,
-help=_('Custom size of the document. Use the form widthxheight '
+help=_('Custom size of the document. Use the form width x height '
 'EG. `123x321` to specify the width and height (in pts). '
 'This overrides any specified page-size.')),

View File

@ -47,7 +47,7 @@ class PDFOutput(OutputFormatPlugin):
 'non default output profile is used. Default is letter. Choices '
 'are {}').format(', '.join(PAPER_SIZES))),
 OptionRecommendation(name='custom_size', recommended_value=None,
-help=_('Custom size of the document. Use the form widthxheight '
+help=_('Custom size of the document. Use the form width x height '
 'e.g. `123x321` to specify the width and height. '
 'This overrides any specified paper-size.')),
 OptionRecommendation(name='preserve_cover_aspect_ratio',
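
The three help strings changed in this commit all describe the same `width x height` form, with `123x321` as the example. As an illustration of what such a value encodes, here is a small standalone parser. It is a sketch only: `parse_size` is a hypothetical helper, and nothing here claims to be how calibre itself reads these options.

```python
def parse_size(spec):
    """Split a 'WIDTHxHEIGHT' string such as '123x321' into two floats."""
    # Hypothetical helper: lower-case first so an upper-case "X" also separates.
    width, sep, height = spec.lower().partition("x")
    if not sep:
        raise ValueError(f"expected WIDTHxHEIGHT, got {spec!r}")
    # float() tolerates surrounding spaces and raises ValueError on junk.
    return float(width), float(height)


print(parse_size("123x321"))
```

Splitting on the first `x` keeps the sketch short. A real option parser would probably also want to reject zero or negative sizes.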