New template operator 'field_inlist'. It operates directly on data in a multi-valued field instead of converting the field to a string then splitting it.

This commit is contained in:
Charles Haley 2024-10-21 16:47:43 +01:00
parent ea498af93c
commit b938f2500f
3 changed files with 30 additions and 8 deletions

View File

@ -245,7 +245,8 @@ General Program Mode
not_expression ::= [ '!' not_expression ]* | concatenate_expr
concatenate_expr::= compare_expr [ '&' compare_expr ]*
compare_expr ::= add_sub_expr [ compare_op add_sub_expr ]
compare_op ::= '==' | '!=' | '>=' | '>' | '<=' | '<' | 'in' | 'inlist' |
compare_op ::= '==' | '!=' | '>=' | '>' | '<=' | '<' |
'in' | 'inlist' | 'field_inlist' |
'==#' | '!=#' | '>=#' | '>#' | '<=#' | '<#'
add_sub_expr ::= times_div_expr [ add_sub_op times_div_expr ]*
add_sub_op ::= '+' | '-'
@ -406,8 +407,10 @@ Relational operators return ``'1'`` if the comparison is true, otherwise the emp
There are two forms of relational operators: string comparisons and numeric comparisons.
String comparisons do case-insensitive string comparison using lexical order. The supported string comparison operators are ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``, ``in``, and ``inlist``.
For the ``in`` operator, the result of the left hand expression is interpreted as a regular expression pattern. The ``in`` operator is True if the value of left-hand regular expression matches the value of the right hand expression. The ``inlist`` operator is true if the left hand regular expression matches any one of the items in the right hand list where the items in the list are separated by commas. The matches are case-insensitive.
String comparisons do case-insensitive string comparison using lexical order. The supported string comparison operators are ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``, ``in``, ``inlist``, and ``field_inlist``.
For the ``in`` operator, the result of the left hand expression is interpreted as a regular expression pattern. The ``in`` operator is True if the value of left-hand regular expression matches the value of the right hand expression.
The ``inlist`` operator is true if the left hand regular expression matches any one of the items in the right hand list where the items in the list are separated by commas. The ``field_inlist`` operator is true if the left hand regular expression matches any of the items in the field (column) named by the right hand expression, using the separator defined for the field. NB: the ``field_inlist`` operator requires the right hand expression to evaluate to a field name, while the ``inlist`` operator requires the right hand expression to evaluate to a string containing a comma-separated list. Because of this difference, ``field_inlist`` is substantially faster than ``inlist`` because no string conversions or list constructions are done. The regular expressions are case-insensitive.
The numeric comparison operators are ``==#``, ``!=#``, ``<#``, ``<=#``, ``>#``, ``>=#``. The left and right expressions must evaluate to numeric values with two exceptions: both the string value "None" (undefined field) and the empty string evaluate to the value zero.
@ -415,8 +418,10 @@ Examples:
* ``program: field('series') == 'foo'`` returns ``'1'`` if the book's series is 'foo', otherwise ``''``.
* ``program: 'f.o' in field('series')`` returns ``'1'`` if the book's series matches the regular expression ``f.o`` (e.g., `foo`, `Off Onyx`, etc.), otherwise ``''``.
* ``program: 'science' inlist field('#genre')`` returns ``'1'`` if any of the book's genres match the regular expression ``science``, e.g., `Science`, `History of Science`, `Science Fiction` etc., otherwise ``''``.
* ``program: '^science$' inlist field('#genre')`` returns ``'1'`` if any of the book's genres exactly match the regular expression ``^science$``, e.g., `Science`. The genres `History of Science` and `Science Fiction` don't match. If there isn't a match then returns ``''``.
* ``program: 'science' inlist field('#genre')`` returns ``'1'`` if any of the values retrieved from the book's genres match the regular expression ``science``, e.g., `Science`, `History of Science`, `Science Fiction` etc., otherwise ``''``.
* ``program: '^science$' inlist $#genre`` returns ``'1'`` if any of the book's genres exactly match the regular expression ``^science$``, e.g., `Science`. The genres `History of Science` and `Science Fiction` don't match. If there isn't a match then it returns ``''``.
* ``program: 'asimov' field_inlist 'authors'`` returns ``'1'`` if any author matches the regular expression ``asimov``, e.g., `Asimov, Isaac` or `Isaac Asimov`, otherwise ``''``.
* ``program: 'asimov$' field_inlist 'authors'`` returns ``'1'`` if any author matches the regular expression ``asimov$``, e.g., `Isaac Asimov`, otherwise ``''``. It doesn't match `Asimov, Isaac` because of the ``$`` anchor in the regular expression.
* ``program: if field('series') != 'foo' then 'bar' else 'mumble' fi`` returns ``'bar'`` if the book's series is not ``foo``. Otherwise it returns ``'mumble'``.
* ``program: if field('series') == 'foo' || field('series') == '1632' then 'yes' else 'no' fi`` returns ``'yes'`` if series is either ``'foo'`` or ``'1632'``, otherwise ``'no'``.
* ``program: if '^(foo|1632)$' in field('series') then 'yes' else 'no' fi`` returns ``'yes'`` if series is either ``'foo'`` or ``'1632'``, otherwise ``'no'``.

View File

@ -75,7 +75,7 @@ class TemplateHighlighter(QSyntaxHighlighter):
KEYWORDS_GPM = ['if', 'then', 'else', 'elif', 'fi', 'for', 'rof',
'separator', 'break', 'continue', 'return', 'in', 'inlist',
'def', 'fed', 'limit']
'field_inlist', 'def', 'fed', 'limit']
KEYWORDS_PYTHON = ["and", "as", "assert", "break", "class", "continue", "def",
"del", "elif", "else", "except", "exec", "finally", "for", "from",

View File

@ -652,7 +652,7 @@ class _Parser:
def compare_expr(self):
left = self.add_subtract_expr()
if (self.token_op_is_string_infix_compare() or
self.token_is('in') or self.token_is('inlist')):
self.token_is('in') or self.token_is('inlist') or self.token_is('field_inlist')):
operator = self.token()
return StringCompareNode(self.line_number, operator, left, self.add_subtract_expr())
if self.token_op_is_numeric_infix_compare():
@ -1398,11 +1398,27 @@ class _Interpreter:
[v.strip() for v in y.split(',') if v.strip()]))
}
def do_field_inlist(self, left, right, prog):
res = getattr(self.parent_book, right, None)
if res is None or not isinstance(res, (list, tuple, set, dict)):
self.error(_("Field '{0}' is either not a field or not a list").format(right), prog.line_number)
pat = re.compile(left, flags=re.I)
for v in res:
if re.search(pat, v):
return '1'
return ''
def do_node_string_infix(self, prog):
try:
left = self.expr(prog.left)
right = self.expr(prog.right)
res = '1' if self.INFIX_STRING_COMPARE_OPS[prog.operator](left, right) else ''
try:
res = '1' if self.INFIX_STRING_COMPARE_OPS[prog.operator](left, right) else ''
except KeyError:
if prog.operator == 'field_inlist':
res = self.do_field_inlist(left, right, prog)
else:
raise
if (self.break_reporter):
self.break_reporter(prog.node_name, res, prog.line_number)
return res
@ -1684,6 +1700,7 @@ class TemplateFormatter(string.Formatter):
(r'(separator|limit)\b', lambda x,t: (_Parser.LEX_KEYWORD, t)), # noqa
(r'(def|fed|continue)\b', lambda x,t: (_Parser.LEX_KEYWORD, t)), # noqa
(r'(return|inlist|break)\b', lambda x,t: (_Parser.LEX_KEYWORD, t)), # noqa
(r'(field_inlist)\b', lambda x,t: (_Parser.LEX_KEYWORD, t)), # noqa
(r'(\|\||&&|!|{|})', lambda x,t: (_Parser.LEX_OP, t)), # noqa
(r'[(),=;:\+\-*/&]', lambda x,t: (_Parser.LEX_OP, t)), # noqa
(r'-?[\d\.]+', lambda x,t: (_Parser.LEX_CONST, t)), # noqa