5788 Commits

Author SHA1 Message Date
Markus Heiser
4fb6105d69
[fix] revision of utils.HTMLTextExtractor (#5125)
Related:

- https://github.com/searxng/searxng/pull/5073#issuecomment-3196282632
2025-08-18 16:30:51 +02:00
Ishbir Singh
b606103352
[fix] reuters: published date not parsed correctly in some cases
Fixes the publishedDate format in the reuters engine to cover ISO 8601 times both with and without milliseconds.
Why is this change important?

Previously, the engine would sometimes fail saying:

2025-08-12 21:13:23,091 ERROR:searx.engines.reuters: exception : time data '2024-04-15T19:08:30.833Z' does not match format '%Y-%m-%dT%H:%M:%SZ'

Traceback (most recent call last):
...
  File "/usr/local/searxng/searx/engines/reuters.py", line 87, in response
    publishedDate=datetime.strptime(result["display_time"], "%Y-%m-%dT%H:%M:%SZ"),
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...

Note that most queries seem to work with Reuters, but there are some results that have the additional milliseconds and fail. Regardless, the change is backwards compatible as both the formats (with and without the ms) should now parse correctly.
2025-08-16 15:50:38 +00:00
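
A minimal sketch of the parsing approach described in the reuters fix above, not the engine's actual code. The helper name `parse_display_time` and the tuple `ISO_FORMATS` are hypothetical; only `datetime.strptime` and the two format strings are taken as given.

```python
from datetime import datetime

# Hypothetical helper for illustration; the real engine code may differ.
# Try the format with fractional seconds first, then the one without.
ISO_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_display_time(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp with or without milliseconds."""
    for fmt in ISO_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            # This format did not match, try the next one.
            continue
    raise ValueError(f"unsupported time format: {value!r}")
```

Both `"2024-04-15T19:08:30.833Z"` and `"2024-04-15T19:08:30Z"` are accepted by this helper, which is what makes such a change backwards compatible.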
Zhijie He
6b1516d6ad
[fix] baidu captcha detection (#5111)
Add Baidu Captcha detection to reduce `JSONDecodeError` error

Baidu will redirect to `wappass.baidu.com` and return a captcha challenge.
Current behavior will get the data from `wappass.baidu.com` then return a
`json.decoder.JSONDecodeError` error.
2025-08-12 15:18:46 +02:00
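
The detection idea from the Baidu fix above could be sketched as follows. This assumes the engine can see the final URL after redirects; the function name `is_captcha_redirect` and the constant `CAPTCHA_HOST` are hypothetical and not taken from the engine.

```python
from urllib.parse import urlparse

# Host that Baidu redirects to when it serves a captcha challenge
# (taken from the commit message above).
CAPTCHA_HOST = "wappass.baidu.com"


def is_captcha_redirect(final_url: str) -> bool:
    """Return True if the response URL points at the captcha host."""
    return urlparse(final_url).hostname == CAPTCHA_HOST
```

A caller would check this before trying to decode the body as JSON, and report a captcha instead of failing with `json.decoder.JSONDecodeError`.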
Markus Heiser
6cccb46f2b
[fix] replace X-Scheme by X-Forwarded-Proto header (#5107)
The HTTP X-Forwarded-Proto (XFP) request header is a *de-facto* standard header
for identifying the protocol (HTTP or HTTPS) that a client used to connect to a
proxy or load balancer.[1]

The ``X-Scheme`` header was added 10 years ago.[2] Why ``X-Scheme`` was used back
then and not ``X-Forwarded-Proto``, nobody knows today; possibly because
``X-Forwarded-Proto`` wasn't a *de-facto* standard back then.

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/X-Forwarded-Proto
[2] https://github.com/searx/searx/commit/6ef7c3276
2025-08-10 13:05:40 +02:00
Markus Heiser
a0dd416e8a
[fix] use X-Forwarded-Proto header if the URL scheme is unknown (#5106)
The HTTP X-Forwarded-Proto (XFP) request header is a de-facto standard header
for identifying the protocol (HTTP or HTTPS) that a client used to connect to a
proxy or load balancer.[1]

In our documentation[2] we recommend setting the `X-Scheme` header. This header
is not required if the `server.base_url` is set correctly.[3]

If none of these URL scheme details exist, then the header X-Forwarded-Proto is
evaluated as a third alternative.

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/X-Forwarded-Proto
[2] https://docs.searxng.org/admin/installation-apache.html#apache-s-searxng-site
[3] https://docs.searxng.org/admin/settings/settings_server.html

Closes: https://github.com/searxng/searxng/issues/5105
2025-08-10 11:08:57 +02:00
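
The fallback order described in the commit above can be illustrated with a small sketch. It is an assumption about how such a lookup might look, not the actual SearXNG implementation: `resolve_scheme` is a hypothetical name, and `headers` is treated as a plain dict with exact-case keys.

```python
from urllib.parse import urlparse


def resolve_scheme(base_url, headers, default="http"):
    """Pick the URL scheme: base_url first, then X-Scheme, then X-Forwarded-Proto."""
    # 1. A correctly set server.base_url already carries the scheme.
    if base_url:
        scheme = urlparse(base_url).scheme
        if scheme:
            return scheme
    # 2. and 3. Fall back to the proxy headers, in this order.
    for name in ("X-Scheme", "X-Forwarded-Proto"):
        value = headers.get(name)
        if value:
            return value.strip().lower()
    # Nothing known about the scheme (the default is an assumption).
    return default
```

With an empty `base_url` and only `X-Forwarded-Proto: https` set, the third alternative is the one that applies.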
Markus Heiser
935f3fe332
[fix] limiter: trusted proxies doc-string (#5104) 2025-08-09 23:30:48 +02:00
Ivan Gabaldon
ce8929cabe
[mod] limiter: trusted proxies (#4911)
Replaces `x_for` functionality with `trusted_proxies`. This allows defining
which IPs / ranges to trust when extracting the client IP address from the
X-Forwarded-For and X-Real-IP headers.

We don't know if the proxy chain will give us the proper client
address (REMOTE_ADDR in the WSGI environment), so we rely on reading the headers
of the proxy before SearXNG (if there is one, in that case it must be added to
trusted_proxies) hoping it has done the proper checks. In case a proxy in the
chain does not check the client address correctly, integrity is compromised and
this should be fixed by whoever manages the proxy, not us.

Closes:

- https://github.com/searxng/searxng/issues/4940
- https://github.com/searxng/searxng/issues/4939
- https://github.com/searxng/searxng/issues/4907
- https://github.com/searxng/searxng/issues/3632
- https://github.com/searxng/searxng/issues/3191
- https://github.com/searxng/searxng/issues/1237

Related:

- https://github.com/searxng/searxng-docker/issues/386
- https://github.com/inetol-infrastructure/searxng-container/issues/81
2025-08-09 23:03:30 +02:00
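
As a rough illustration of the trusted proxies idea above: the forwarding headers are only read when the direct peer is a trusted proxy. The function `client_ip` and the right-to-left walk over X-Forwarded-For are assumptions made for this sketch; the limiter's real logic may differ.

```python
from ipaddress import ip_address, ip_network


def client_ip(remote_addr, headers, trusted_proxies):
    """Return the client address, trusting headers only from known proxies."""
    networks = [ip_network(p, strict=False) for p in trusted_proxies]

    def is_trusted(addr):
        # Raises ValueError for strings that are not IP addresses.
        return any(ip_address(addr) in net for net in networks)

    # The direct peer is not a trusted proxy: ignore all forwarding headers.
    if not is_trusted(remote_addr):
        return remote_addr

    # Walk X-Forwarded-For from right to left and skip trusted hops.
    hops = [h.strip() for h in headers.get("X-Forwarded-For", "").split(",")]
    for hop in reversed([h for h in hops if h]):
        if not is_trusted(hop):
            return hop

    # Fall back to X-Real-IP, then to the peer address itself.
    return headers.get("X-Real-IP") or remote_addr
```

The sketch trusts whatever a listed proxy reports, which mirrors the caveat in the commit message: if a proxy in the chain does not check the client address correctly, the result is only as reliable as that proxy.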
Markus Heiser
341d718c7f
[fix] duckduckgo weather: add type hints and fix WEATHERKIT_TO_CONDITION (#5101) 2025-08-09 12:24:19 +02:00
Austin-Olacsi
cf5061dc70
[feat] engines: add Marginalia (#5087)
To get an API key follow instructions at [1].

[1] https://about.marginalia-search.com/article/api/

Related (in historical order):
- https://github.com/searxng/searxng/issues/1620
- https://github.com/searxng/searxng/issues/1673
- https://github.com/searxng/searxng/pull/1627
- https://github.com/searxng/searxng/pull/2489

Closes:
- https://github.com/searxng/searxng/issues/3034

Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2025-08-09 08:38:11 +02:00
github-actions[bot]
5e7109cd26
[l10n] update translations from Weblate (#5096)
0fbf5aa2d - 2025-08-07 - alexgabi <alexgabi@noreply.codeberg.org>
d18d3ed1c - 2025-08-07 - return42 <return42@noreply.codeberg.org>
7927a63a0 - 2025-08-06 - pikzim <pikzim@noreply.codeberg.org>
27c8b4013 - 2025-08-05 - nhthinh <nhthinh@noreply.codeberg.org>
83262e748 - 2025-08-04 - IcewindX <icewindx@noreply.codeberg.org>
2025-08-08 17:19:22 +02:00
Bnyro
761b74e8c9
[fix] legacy results: published date missing (#5093)
Before this change, `publishedDate` was always `None`, so no
`publishedDate` was visible for any result.
2025-08-08 12:22:00 +02:00
Bnyro
25c327904a [fix] tagesschau: crash if there's no video stream available
Sometimes, there's only an `adaptivestreaming` field in `streams`, which
is usually an m3u8 file. That's however not supported by the video player
of any browser, so we can't use it and must build a different URL instead.
2025-08-08 07:19:12 +00:00
Bnyro
612b76b75e
[fix] webapp.py: default_http_headers not parsed as strings (#5094)
WhiteNoise requires all headers to be strings, however it's common to use
other primitive types (e.g. numbers) in the header, e.g. `X-XSS-Protection: 0`.
Thus, we must convert all types of values (i.e. numbers) to strings.

- closes https://github.com/searxng/searxng/issues/5091
2025-08-07 20:50:31 +02:00
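
The conversion described in the commit above amounts to something like the following sketch. The function name `stringify_headers` is hypothetical; the input is assumed to be the mapping loaded from `default_http_headers`.

```python
def stringify_headers(headers):
    """Convert every header name and value to str, as WhiteNoise expects."""
    return {str(name): str(value) for name, value in headers.items()}
```

For example, a YAML value such as `X-XSS-Protection: 0` is loaded as the integer `0` and becomes the string `"0"`.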
Ivan Gabaldon
3de7a6da2d
[enh] container: tidy builds (#5086)
Building the container currently does not work properly.
When rebuilding several times with `make container`, `version_frozen.py`
is recreated, which wouldn't be an issue if the file’s timestamp was constant.
Now, when creating `version_frozen.py`, it will have the same timestamp as the
commit when it was created. (`version_frozen.py` is moved to a dedicated layer).

Reusing "builder" cache when building "dist" could be slow
(CD reports 2 seconds, but locally I've seen it take up to 10 seconds),
so the Dockerfile is now split and we save a couple steps
by importing the "builder" image directly.

The last changes made it possible to remove the layer cache in "builder",
since the overhead is now greater than building the layers from scratch.

Until now, all "dist" layers were squashed into a single layer,
which in most cases is a good idea
(except for storage/delivery pricing/overhead), but in our case,
since we manage the entire pipeline, we can ignore this
and share layers between builds.
This means (for example) that if we change files unrelated to the container
in several consecutive commits (documentation changes), we don't have to push
the entire image to registry, but only the different layers
(`version_frozen.py` in this example).
The same applies when pulling, as only the layers that have changed
compared to the local layers will be downloaded (that's the theory,
we'll see if this works as expected or if we need to tweak something else).
2025-08-07 10:46:26 +02:00
Bnyro
94256e3383 [feat] duckduckgo weather: migrate to new weather engine template
- not 100% sure about the condition code mapping, there are no real matches for most of the codes from Apple WeatherKit to the weather codes we have in SearXNG
- related: https://github.com/searxng/searxng/issues/4885
2025-08-06 14:09:23 +02:00
Markus Heiser
2e62eb5d68
[fix] engine yummly: website was taken offline in December 2024 (#5080)
The app and website were taken offline in December 2024, with the latter
pointing to KitchenAid's US website. [1]

[1] https://en.wikipedia.org/wiki/Yummly

Closes: https://github.com/searxng/searxng/issues/5079
2025-08-03 10:49:14 +02:00
Markus Heiser
664aab0ec9
[fix] CI task "update_engine_traits.py" fails (#5069)
* [fix] CI task "update_engine_traits.py" fails

To catch all problems with an HTTP request, the more general class
``httpx.HTTPError`` must be caught. To test, use::

    $ ./manage dev.env
    $ python ./searxng_extra/update/update_engine_traits.py

Closes: https://github.com/searxng/searxng/issues/5068

* [data] update searx.data - update_engine_traits.py
2025-08-01 12:08:27 +02:00
github-actions[bot]
c2d4e3c49a
[l10n] update translations from Weblate (#5076)
17e9fcd68 - 2025-07-29 - musabustun <musabustun@noreply.codeberg.org>
90b302e3e - 2025-07-29 - return42 <return42@noreply.codeberg.org>
023a22292 - 2025-07-29 - return42 <return42@noreply.codeberg.org>
17d37ede6 - 2025-07-30 - gkalathas <gkalathas@noreply.codeberg.org>
3c64c165f - 2025-07-29 - return42 <return42@noreply.codeberg.org>
d8f65cdc7 - 2025-07-26 - IcewindX <icewindx@noreply.codeberg.org>
2025-08-01 10:02:49 +02:00
benpiano800
46f41d2138 [feat] statistics answerer: add the ability to calculate the range of a set 2025-07-31 20:13:24 +02:00
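
For reference, the range of a set of numbers is the difference between its largest and smallest value. A minimal sketch, with the hypothetical name `value_range` (the answerer's actual function is not shown in this log):

```python
def value_range(values):
    """Return max(values) - min(values) for a non-empty iterable of numbers."""
    numbers = list(values)
    if not numbers:
        raise ValueError("range of an empty set is undefined")
    return max(numbers) - min(numbers)
```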
github-actions[bot]
40b78ad06c
[data] update searx.data - update_wikidata_units.py (#5062) 2025-07-29 07:26:01 +02:00
github-actions[bot]
db83a39544
[data] update searx.data - update_external_bangs.py (#5061) 2025-07-29 07:25:26 +02:00
github-actions[bot]
bb3bea829d
[data] update searx.data - update_ahmia_blacklist.py (#5064) 2025-07-29 07:24:09 +02:00
github-actions[bot]
dc9ad0a493
[data] update searx.data - update_currencies.py (#5065) 2025-07-29 07:23:38 +02:00
github-actions[bot]
5db7b70dc7
[data] update searx.data - update_engine_descriptions.py (#5066) 2025-07-29 07:22:58 +02:00
github-actions[bot]
2ad35421d7
[data] update searx.data - update_firefox_version.py (#5063)
Co-authored-by: searxng-bot <searxng-bot@users.noreply.github.com>
2025-07-29 07:22:21 +02:00
Markus Heiser
f32e91e51a
[fix] duckduckgo engine: logger.error / missing argument (#5057)
The error message in case the vqd value could not be determined was incorrect
and triggered an exception::

     File "/usr/local/searxng/searxng-src/searx/engines/duckduckgo.py", line 132, in get_vqd
       logger.error("vqd value from duckduckgo.com ", resp.status_code)
     Message: 'vqd value from duckduckgo.com '
     Arguments: (202,)
2025-07-28 15:36:52 +02:00
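
The problem in the log call above is that an argument is passed without a matching placeholder in the message, so `logging` cannot format the record. A hedged sketch of a corrected call follows; the logger name, the function `log_vqd_failure`, and the exact message wording are assumptions, not the text used in the engine.

```python
import logging

logger = logging.getLogger("example.duckduckgo")


def log_vqd_failure(status_code):
    """Log the HTTP status with a %s placeholder matching the argument."""
    # Broken variant from the traceback above (no placeholder for the argument):
    #   logger.error("vqd value from duckduckgo.com ", status_code)
    logger.error("vqd value from duckduckgo.com: HTTP %s", status_code)
```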
Markus Heiser
17f2027c4f
[fix] NotImplementedError raised by ResultContainer (#5058)
If the ``on_result`` handle returns False, then the ``else`` was always jumped
to, which throws the NotImplementedError exception::

    File "/usr/local/searxng/searxng-src/searx/results.py", line 99, in extend
      raise NotImplementedError(f"no handler implemented to process the result of type {result}")
    NotImplementedError: no handler implemented to process the result of type MainResult(title=...
2025-07-28 15:36:26 +02:00
mggh0139
54a2b553f4
[fix] tracker pattern: let startup continue if url fetch fails (#5055)
Use Python exception handling to prevent a startup crash when fetching
ClearURL fails. Also add some logs.

Closes: https://github.com/searxng/searxng/issues/5054
2025-07-28 07:03:01 +02:00
Fjara
f04c273732
[fix] correct comment in settings.yml for value to disable scheduling (#5052)
settings.yml correct value to disable scheduling
2025-07-27 17:36:39 +02:00
Bnyro
1baf3dcd1c
[fix] webapp.py: info (and other) page(s) don't load properly (#5051) 2025-07-26 17:58:53 +02:00
Markus Heiser
649a8dd577
[fix] cleanup: rename searx leftovers to SearXNG (#5049)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-07-26 06:22:46 +02:00
SeriousConcept1134
02cbdf468b [fix] google video: refactor broken engine to work again
The current google_videos.py in the master branch is completely non-functional because it does not parse the returned video search results correctly. As a result, SearXNG reports that no results were found. This commit is an updated google_videos.py that is designed to fix that and is confirmed to be working.

Implementing the suggestions by Bnyro.

Re-formatted with `black` for compatibility. After failing automated checks, ran the command:
black --line-length 120 --skip-string-normalization --target-version py311 google_videos.py
2025-07-25 21:40:53 +02:00
Markus Heiser
168fa9b09b
[mod] make run: start granian server and versioning by Dependabot (#5037)
The new ``requirements-server.txt`` (granian) is installed into the virtualenv
of Dockerfile.

When ``make run`` is called, a granian server is started that auto-reloads when
the application's files change. This requires the granian[reload] extra, see
``requirements-dev.txt``.

Dependabot supports updates to any ``.txt`` file [1].

[1] https://docs.github.com/en/code-security/dependabot/ecosystems-supported-by-dependabot/supported-ecosystems-and-repositories#pip-and-pip-compile

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-07-25 17:40:33 +02:00
github-actions[bot]
83adda8567
[l10n] update translations from Weblate (#5047)
19de4a735 - 2025-07-23 - eudemo <eudemo@noreply.codeberg.org>
4504f8600 - 2025-07-22 - IcewindX <icewindx@noreply.codeberg.org>
2b4ec6d2c - 2025-07-22 - lucasmz.dev <lucasmz.dev@noreply.codeberg.org>
69b3590de - 2025-07-22 - Fjuro <fjuro@alius.cz>
f48c7b9ac - 2025-07-21 - alexgabi <alexgabi@noreply.codeberg.org>
f8bc97254 - 2025-07-21 - alexgabi <alexgabi@noreply.codeberg.org>
7bd4b6441 - 2025-07-21 - Juno Takano <jutty@noreply.codeberg.org>
527bc690c - 2025-07-20 - 0ko <0ko@noreply.codeberg.org>
dd242f579 - 2025-07-20 - IcewindX <icewindx@noreply.codeberg.org>
f2a3cdb26 - 2025-07-19 - ledjfou <ledjfou@noreply.codeberg.org>
6781d5611 - 2025-07-20 - alexgabi <alexgabi@noreply.codeberg.org>
df82ea943 - 2025-07-18 - zbbhzdaajc <zbbhzdaajc@noreply.codeberg.org>
7892aac02 - 2025-07-18 - Priit Jõerüüt <jrtcdbrg@noreply.codeberg.org>
2025-07-25 12:20:58 +02:00
Markus Heiser
84c3a832a4
[fix] false is an invalid value for wiki_url in settings.yml (#5046)
Closes: https://github.com/searxng/searxng/issues/5045
2025-07-24 17:47:09 +02:00
Ivan Gabaldon
802bf4f9e7
[fix] py: absolute static path (#5043)
The path to static should be relative. Otherwise, if SearXNG is served under "/sxng", the static route passed to the client is "/static/..." instead of the expected "/sxng/static/...".

Closes https://github.com/searxng/searxng/issues/5042
2025-07-24 14:55:04 +02:00
Bnyro
6b16a04e7e
[mod] wordnik: convert to answerer (#4980)
Wordnik is now an answerer and not in the infobox anymore: it uses the
translations answerer, because it provides all the features needed. By default,
only its first result is shown.

Additionally a new "define" category is added - I know, it's the same as the
"dictionaries" category, but I don't think we can alias categories.  This allows
to search e.g. for `!define tree`, the idea is to allow easy searches for
definitions of words.

Related:

- https://github.com/searxng/searxng/issues/4111
2025-07-24 07:42:31 +02:00
Ivan Gabaldon
b01d32d69d
[fix] py: restore application for uWSGI (#5040)
Was removed on https://github.com/searxng/searxng/pull/5032
2025-07-23 23:55:50 +02:00
Ivan Gabaldon
f7c8e4c353
[fix] py: overwrite version_frozen on explicit freeze (#5020)
Once version_frozen.py has been created, it will never be updated again unless the file is manually deleted.
2025-07-23 18:17:58 +02:00
Ivan Gabaldon
42f102ce1b
[enh] py: whitenoise for static handling (#5032)
While looking at ways to better handle static files, I saw a package that replaces Flask `static_folder` functionality. Not only is it considerably faster, but it already includes the capability to serve sidecars without having to intercept. This also replaces the uWSGI folder mapping functionality.

Closes https://github.com/searxng/searxng/issues/4977
2025-07-23 18:16:10 +02:00
Bnyro
5cbf422621 [fix] tracker url remover + external bangs: use standard network config
Using plain `httpx` directly doesn't use SearXNG's additional network config, including proxies, http2 config, ...

Related issues:
- https://github.com/searxng/searxng/issues/5027
2025-07-22 10:25:33 +02:00
github-actions[bot]
be392a45fc
[l10n] update translations from Weblate (#5023)
fce853a65 - 2025-07-16 - return42 <return42@noreply.codeberg.org>
234a91155 - 2025-07-16 - return42 <return42@noreply.codeberg.org>
162ff0369 - 2025-07-16 - return42 <return42@noreply.codeberg.org>
3307e81ab - 2025-07-16 - return42 <return42@noreply.codeberg.org>
7948181fb - 2025-07-15 - Juno Takano <jutty@noreply.codeberg.org>
e88a0b264 - 2025-07-15 - muha7a <muha7a@noreply.codeberg.org>
7b37b944e - 2025-07-14 - Cookie_Monster <cookie_monster@noreply.codeberg.org>
d6c61f1ff - 2025-07-14 - kolegacik <kolegacik@noreply.codeberg.org>
5bd662542 - 2025-07-15 - lucasmz.dev <lucasmz.dev@noreply.codeberg.org>
4ddad097c - 2025-07-14 - yoonhahwang <yoonhahwang@noreply.codeberg.org>
a8d319c18 - 2025-07-13 - norizou <norizou@noreply.codeberg.org>
e7e471f65 - 2025-07-13 - Hēphaistos <hephaistos@noreply.codeberg.org>
b6b198f0a - 2025-07-12 - return42 <return42@noreply.codeberg.org>
9da60d355 - 2025-07-11 - sourdragon <sourdragon@noreply.codeberg.org>
632b879ba - 2025-07-12 - return42 <return42@noreply.codeberg.org>
a543b2b87 - 2025-07-12 - return42 <return42@noreply.codeberg.org>
7e418d9cc - 2025-07-12 - return42 <return42@noreply.codeberg.org>
6e78fbd5c - 2025-07-12 - return42 <return42@noreply.codeberg.org>
917b27bad - 2025-07-12 - return42 <return42@noreply.codeberg.org>
82e69afbf - 2025-07-12 - return42 <return42@noreply.codeberg.org>
096c36ef7 - 2025-07-12 - return42 <return42@noreply.codeberg.org>
2048ef8e2 - 2025-07-12 - return42 <return42@noreply.codeberg.org>

Co-authored-by: searxng-bot <searxng-bot@users.noreply.github.com>
2025-07-19 07:14:26 +02:00
Ivan Gabaldon
ff2e0ea278
[mod] py: don't append "-dirty" to DOCKER_TAG (#5021)
We don't expect tags to have "-dirty", just the GIT_VERSION regardless of how the container is built.
2025-07-18 10:42:44 +02:00
Markus Heiser
e851bc1269
[fix] calculator plugin: filtering real calculation tasks (#5016)
Whether the query is a real calculation task is currently only detected in the
AST, resulting in unnecessary creation of subprocesses. This problem is
mitigated with this patch: if the query contains letters, it is obviously not a
math problem, and the plugin can return without further action.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-07-17 19:50:02 +02:00
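
The early exit described in the calculator fix above can be sketched as a cheap pre-check that runs before any subprocess is created. The function name `looks_like_calculation` is hypothetical, and the empty-query check is an addition made for this sketch.

```python
def looks_like_calculation(query: str) -> bool:
    """Cheap pre-filter: a query containing letters is not treated as math."""
    if not query.strip():
        return False
    return not any(ch.isalpha() for ch in query)
```

Only queries that pass this check would go on to the more expensive AST evaluation.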
Markus Heiser
62fac1c6a9 [fix] custom plugins: settings must not be merged.
In customizing it should be decided which plugin modules should be loaded and
which should not.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-07-16 20:55:39 +02:00
mrpaulblack
a0a2f0fd42 [build] /static 2025-07-16 11:02:12 +02:00
Markus Heiser
574b285efa
[mod] remove option ui.static_use_hash (cache busting) (#5004)
Cache busting has caused serious problems for users in the past, here are two
examples:

- https://github.com/searxng/searxng/issues/4419
- https://github.com/searxng/searxng/issues/4481

And it makes development and deployment significantly more complex because it
binds the client side to the server side:

- https://github.com/searxng/searxng/pull/4466

In the light of a decoupled development of the WEB clients from the server side:

- https://github.com/searxng/searxng/pull/4988

it is appropriate to abandon this feature. In fact, it has been ineffective
since #4436 anyway.

However, the benefit has always been questionable, since at best only a few kB
of data are saved (at least in the context of an image_proxy, the effect is below
the detection limit). Ultimately, the client is responsible for caching.

Related: https://github.com/searxng/searxng/issues?q=label%3A%22clear%20browser%20cache%22

Closes: https://github.com/searxng/searxng/pull/4466
Closes: https://github.com/searxng/searxng/issues/1326
Closes: https://github.com/searxng/searxng/issues/964

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-07-11 16:53:36 +02:00
github-actions[bot]
b5ae3a047d
[l10n] update translations from Weblate (#4998)
6c74fe951 - 2025-07-08 - janindu-t <janindu-t@noreply.codeberg.org>
a17afd1fd - 2025-07-06 - ajiou <ajiou@noreply.codeberg.org>
6424a07ea - 2025-07-05 - aindriu80 <aindriu80@noreply.codeberg.org>
e62b0059e - 2025-07-05 - kratos <makesocialfoss32@keemail.me>

Co-authored-by: searxng-bot <searxng-bot@users.noreply.github.com>
2025-07-11 11:15:07 +02:00
Bnyro
a48ec8a4d5
[chore] engines: remove redundant usages of utils#gen_useragent (#4993)
These engines override the user agent manually using `gen_useragent`, although that's already done in the online preprocessor that runs before the actual `request(query, params)` method is called. Hence, this call is duplicated.

Related:
- https://github.com/searxng/searxng/pull/4990#discussion_r2195142838
2025-07-11 08:42:39 +02:00
Bnyro
4b9644eb27 [fix] public domain image archive: cloud provider changed Algolia -> AWS
- apparently, PDIA switched from Algolia to AWS :/
- we no longer require an API key, but the AWS node might change, so we still have to extract the API url of the node
- the response format is still the same, so no changes needed in that regard

- closes #4989
2025-07-10 15:12:26 +02:00