5519 Commits

Author SHA1 Message Date
Markus Heiser
230215c250
[fix] preferences: description not localized for all UI languages (#4844)
The previous implementation for determining the description of an engine did not
take into account that the UI languages ​​can also have a region tag and/or a
script tag:

    el-GR:      Ελληνικά, Ελλάδα (Greek, Greece)
    fa-IR:      فارسی, ایران (Persian, Iran)
    nb-NO:      Norsk bokmål, Norge (Norwegian bokmål, Norway)
    nl-BE:      Nederlands, België (Dutch, Belgium)
    pt-BR:      Português, Brasil (Portuguese, Brazil)
    zh-HK:      中文, 中國香港特別行政區 (Chinese, Hong Kong SAR China)
    zh-Hans-CN: 中文, 中国 (Chinese, China)
    zh-Hant-TW: 中文, 台灣 (Chinese, Taiwan)

Closes: https://github.com/searxng/searxng/issues/4842

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-23 17:11:41 +02:00
Markus Heiser
1ef5c03962
[fix] ddg engine: IndexError exception is raised on empty contend (#4843)
Sometimes (e.g. when ddg does not have a result item) there is no content and
the engine will fail with an IndexError:

  * Error: IndexError
  * Percentage: 10
  * Parameters: `()`
  * File name: `searx/engines/duckduckgo.py:375`
  * Function: `response`
  * Code: `item["content"] = extract_text(eval_xpath(div_result, './/a[contains(@class, "result__snippet")]')[0])`

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-23 14:55:22 +02:00
useralias
4fa7de8033
[refactor] duckduckgo engine: improve request logic and code structure (#4837)
Changes:
- Add trailing slash to base URL to prevent potential redirects
- Remove advanced search syntax filtering (no longer guarantees a CAPTCHA)
- Correct pagination offset calculation: Page 2 now starts at offset 10,
  subsequent pages use 10 + (n-2)*15 formula instead of the previous
  broken 20 + (n-2)*50 calculation that caused CAPTCHAs
- Restructure request parameter building to better match a real request
- "kt" cookie is no longer an empty string if the language/region is "all"
- Group related parameter assignments together
- Add header logging to debugging output

Related:

- https://github.com/searxng/searxng/issues/4824
2025-05-23 13:01:10 +02:00
Markus Heiser
98badc9cd0
[fix] searx.data: fetch-traits - z-library (httpx.ConnectError) (#4835)
There is currently no known z-library, and all known URLs are dead [1]. To avoid
interrupting automated updates, a connection error to a z-library is treated as
a *known error*, and the old properties of the z-library are retained.

[1] https://github.com/searxng/searxng/issues/3610

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-22 17:43:11 +02:00
Markus Heiser
d29cf64ce4
[mod] searx.data: lazy load of the data objects (databases) (#4834)
In the previous implementation, all databases were loaded into memory when
importing the searx.data package, regardless of whether they were ever needed.

Regardless of this, it is an antipattern to load entire databases into memory
when importing a package or module; databases should be loaded when needed.

Lazy loading is a first step toward improving memory usage and also improves
performance when setting up the runtime environment.  Building on this,
subsequent PRs will be able to further optimize memory behavior, e.g., by using
a real database application such as the one already available via

    searx.cache.ExpireCache

Related:

- https://github.com/searxng/searxng/discussions/1892
- https://github.com/searxng/searxng/pull/3458
- https://github.com/searxng/searxng/pull/4650

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-22 16:23:30 +02:00
Markus Heiser
861f9c4be5 [build] /static 2025-05-22 15:16:54 +02:00
Bnyro
32823ecb69 [refactor] search.js: use custom auto completion implementation
The previously used library is unmaintained for 6 years now [1] and the solution
had know issues [2][3]

[1] https://github.com/searxng/searxng/pull/4284#discussion_r1954493434
[2] https://github.com/searxng/searxng/pull/4318#issuecomment-2731576657
[3] https://github.com/privau/searxng/issues/56
2025-05-22 15:16:54 +02:00
Zhijie He
156d1eb8c8
[feat] engines: add Naver engine (#4573)
Refactor Naver engine (Web, News, Images, Videos, Autocomplete)

- ref: https://search.naver.com/
- lang: `ko`
- Wikidata: https://www.wikidata.org/wiki/Q485639

Co-authored-by: Bnyro <bnyro@tutanota.com>
2025-05-21 18:25:02 +02:00
Markus Heiser
365b9426f1
[fix] engines: disable those with known issues (#4813)
- z-library https://github.com/searxng/searxng/issues/3610
- library of congress: https://github.com/searxng/searxng/issues/4810
- qwant: https://github.com/searxng/searxng/issues/3929

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-21 15:50:29 +02:00
Bnyro
502017b901
[fix] pinterest: engine broken due to API changes (#4816)
- apparently the API now requires a `X-Pinterest-PWS-Handler` in order to
  properly function (extracted from their web UI)

- the other `X-Pinterest` headers here are added in case they become mandatory
  too

Closes: https://github.com/searxng/searxng/issues/4812
2025-05-21 15:22:42 +02:00
Bnyro
88973f5431
[feat] engines: add uxwing engine for icons (#4819)
- uxwing provides attribution-free icons to use for design projects
- svgrepo was my go-to before, but it's ratelimiting a lot recently
2025-05-21 15:10:29 +02:00
Bnyro
8bff73c9b6
[refactor] icon engines: add new icon category (#4817)
Icons category makes sense because it allows to quickly search for free SVG
icons to use for websites / other designs with a quick `!icons` query

Icons don't seem to fit into the normal images category that well because icons
are quite a special type of images
2025-05-21 14:52:16 +02:00
Jost Alemann
7420706a50
[chore] fix some docstring typos (#4815) 2025-05-20 21:03:54 +02:00
useralias
6ec554cb5b [fix] yahoo: url and title xpath 2025-05-20 21:02:40 +02:00
Alexandre Flament
7a3742ae56
[mod] upgrade to httpx 0.28.1 (#4674) 2025-05-20 18:18:07 +02:00
Markus Heiser
ca67f1dffe
[fix] duckduckgo engines: issue when get_vqd() is used by ddg-images and ddg-videos (#4809)
The global variable CACHE is not initialized when DDG images or DDG videos
import the get_vqd() function (please remember: the engine modules are imported
using the importlib method and not via the `import` keyword).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-20 16:10:10 +02:00
Markus Heiser
b8b857d24c
[mod] engine invidious: commented out / no public API available nowadays (#4800)
Reported-by: @unifox https://github.com/searxng/searxng/issues/2722#issuecomment-2884993248

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-18 13:20:27 +02:00
github-actions[bot]
1b08324f26
[l10n] update translations from Weblate (#4788) 2025-05-16 09:40:45 +02:00
Ivan Gabaldon
2cfd3fc44b
[enh] tidy: clean old morty, filtron, searx references
Everyone should have already switched from legacy methods
2025-05-13 10:37:02 +02:00
Markus Heiser
9006866019
[fix] engine archlinux: avoid Anubis challenge by User-Agent "SearXNG" (#4779)
Of the archlinux wikis only wiki.archlinux.org has a has Anubis challenge.

About Anubis[1]:

> Anubis decides to present a challenge using this logic:
>
> - User-Agent contains "Mozilla"
> ...
> This should ensure that git clients, RSS readers, and other low-harm clients
> can get through without issue ..

[1] 6c0ff3f4d5/docs/docs/design/how-anubis-works.mdx (challenge-presentation)


Suggested-by: @unixfox https://github.com/searxng/searxng/issues/4646#issuecomment-2855322406
Closes: https://github.com/searxng/searxng/issues/4646

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-13 10:18:28 +02:00
Ivan Gabaldon
d16854e67a
[mod] rework container deployment (#4764)
container.yml will run after integration.yml COMPLETES successfully and in master branch.

Style changes, cleanup and improved integration with CI by leveraging the use of
shared cache between all workflows.

* Podman is now supported to build the container images (Docker also received a refactor, merging both build and buildx)
* Container images are being built by Buildah instead of Docker BuildKit.
* Container images are tested before release.
* Splitting "modern" (amd64 & arm64) and "legacy" (armv7) arches on different Dockerfiles allowing future optimizations.
2025-05-11 18:12:51 +02:00
Markus Heiser
ef158ce1f4 [build] /static 2025-05-09 12:40:34 +02:00
SearXNG Bot
76ebad0b21
[l10n] update translations from Weblate (#4744)
6f8c520f2 - 2025-05-08 - polskiecus <polskiecus@noreply.codeberg.org>
05dd91d5b - 2025-05-08 - return42 <return42@noreply.codeberg.org>
686b8e5fb - 2025-05-08 - return42 <return42@noreply.codeberg.org>
f40b42bd8 - 2025-05-05 - ehsanrs2 <ehsanrs2@noreply.codeberg.org>
b8013bc99 - 2025-05-03 - polskiecus <polskiecus@noreply.codeberg.org>
5affaa104 - 2025-05-02 - SomeTr <sometr@noreply.codeberg.org>

Co-authored-by: searxng-bot <searxng-bot@users.noreply.github.com>
2025-05-09 09:31:50 +02:00
github-actions[bot]
d76f030cb3
[data] update searx.data - update_wikidata_units.py (#4738) 2025-05-09 07:09:58 +02:00
github-actions[bot]
b3b15ecc72
[data] update searx.data - update_ahmia_blacklist.py (#4739)
Co-authored-by: inetol <inetol@users.noreply.github.com>
2025-05-09 07:09:00 +02:00
github-actions[bot]
1319b250af
[data] update searx.data - update_currencies.py (#4740)
Co-authored-by: inetol <inetol@users.noreply.github.com>
2025-05-09 07:08:26 +02:00
github-actions[bot]
198928de05
[data] update searx.data - update_engine_traits.py (#4741)
Co-authored-by: inetol <inetol@users.noreply.github.com>
2025-05-09 07:07:33 +02:00
github-actions[bot]
11d9c830b8
[data] update searx.data - update_engine_descriptions.py (#4742)
Co-authored-by: inetol <inetol@users.noreply.github.com>
2025-05-09 07:06:52 +02:00
benpiano800
bc06b1aece
[enh] plugins: tor_check: Add more keywords (#4726)
Previously, there was only one usable keyword for the tor_check plugin. Adding more keywords eliminates confusion.
2025-05-07 10:39:46 +02:00
Brock Vojkovic
ff60fe635f
[fix] sec-fetch-* blocking infinite scroll (#4728) 2025-05-07 10:38:21 +02:00
Markus Heiser
6e7119fa4e
[fix] references from searx.botdetection.http_sec_fetch (#4723) 2025-05-07 10:25:47 +02:00
Émilien (perso)
19b116f1d7
fix: check if the browser supports Sec-Fetch headers (#4696) 2025-05-04 10:12:25 +02:00
Markus Heiser
fe08bb1d90 [mod] botdetection: HTTP Fetch Metadata Request Headers
HTTP Fetch Metadata Request Headers [1][2] are used to detect bot requests. Bots
with invalid *Fetch Metadata* will be redirected to the intro (`index`)  page.

[1] https://www.w3.org/TR/fetch-metadata/
[2] https://developer.mozilla.org/en-US/docs/Glossary/Fetch_metadata_request_header

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-04 02:07:26 +02:00
Markus Heiser
8ef5fbca4e [fix] cache.ExpireCache: definition of a context name for the key
The definition of a context name belongs in the abstract base class (was
previously only in the concrete implementation for the SQLite adapter).

Suggested-by: @dalf https://github.com/searxng/searxng/pull/4650#discussion_r2069873853
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-03 08:39:12 +02:00
Markus Heiser
7351c38e6c [fix] (armv7) cache.ExpireCache: remove option ENCRYPT_VALUE
Prophylactic encryption of the value currently makes no sense; on the contrary,
since the ``cryptography`` package is not available on armv7, it would cause
further problems.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/4650#issuecomment-2830786661
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-03 08:39:12 +02:00
Markus Heiser
bdfe1c2a15 [mod] engines: migration of the individual cache solutions to EngineCache
The EngineCache class replaces all previously individual solutions for caches in
the context of the engines.

- demo_offline.py
- duckduckgo.py
- radio_browser.py
- soundcloud.py
- startpage.py
- wolframalpha_api.py
- wolframalpha_noapi.py

Search term to test most of the modified engines::

    !ddg !rb !sc !sp !wa test

    !ddg !rb !sc !sp !wa foo

For introspection of the DB, jump into developer environment and run command to
show cache state::

    $ ./manage pyenv.cmd bash --norc --noprofile
    (py3) python -m searx.enginelib cache state

    cache tables and key/values
    ===========================
    [demo_offline        ] 2025-04-22 11:32:50 count        --> (int) 4
    [startpage           ] 2025-04-22 12:32:30 SC_CODE      --> (str) fSOBnhEMlDfE20
    [duckduckgo          ] 2025-04-22 12:32:31 4dff493e.... --> (str) 4-128634958369380006627592672385352473325
    [duckduckgo          ] 2025-04-22 12:40:06 3e2583e2.... --> (str) 4-263126175288871260472289814259666848451
    [radio_browser       ] 2025-04-23 11:33:08 servers      --> (list) ['https://de2.api.radio-browser.info',  ...]
    [soundcloud          ] 2025-04-29 11:40:06 guest_client_id --> (str) EjkRJG0BLNEZquRiPZYdNtJdyGtTuHdp
    [wolframalpha        ] 2025-04-22 12:40:06 code         --> (str) 5aa79f86205ad26188e0e26e28fb7ae7
    number of tables: 6
    number of key/value pairs: 7

In the "cache tables and key/values" section, the table name (engine name) is at
first position on the second there is the calculated expire date and on the
third and fourth position the key/value is shown.

About duckduckgo: The *vqd coode* of ddg depends on the query term and therefore
the key is a hash value of the query term (to not to store the raw query term).

In the "properties of ENGINES_CACHE" section all properties of the SQLiteAppl /
ExpireCache and their last modification date are shown::

    properties of ENGINES_CACHE
    ===========================
    [last modified: 2025-04-22 11:32:27] DB_SCHEMA           : 1
    [last modified: 2025-04-22 11:32:27] LAST_MAINTENANCE    :
    [last modified: 2025-04-22 11:32:27] crypt_hash          : ca612e3566fdfd7cf7efe2b1c9349f461158d07cb78a3750e5c5be686aa8ebdc
    [last modified: 2025-04-22 11:32:30] CACHE-TABLE--demo_offline: demo_offline
    [last modified: 2025-04-22 11:32:30] CACHE-TABLE--startpage: startpage
    [last modified: 2025-04-22 11:32:31] CACHE-TABLE--duckduckgo: duckduckgo
    [last modified: 2025-04-22 11:33:08] CACHE-TABLE--radio_browser: radio_browser
    [last modified: 2025-04-22 11:40:06] CACHE-TABLE--soundcloud: soundcloud
    [last modified: 2025-04-22 11:40:06] CACHE-TABLE--wolframalpha: wolframalpha

These properties provide information about the state of the ExpireCache and
control the behavior.  For example, the maintenance intervals are controlled by
the last modification date of the LAST_MAINTENANCE property and the hash value
of the password can be used to detect whether the password has been changed (in
this case the DB entries can no longer be decrypted and the entire cache must be
discarded).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-03 08:39:12 +02:00
Markus Heiser
4cbfba9d7b [mod] ExpireCache - sqlite based key/value cache with expire time 2025-05-03 08:39:12 +02:00
Markus Heiser
4a594f1b53 [fix] ResourceWarning: unclosed database in sqlite3
Reported:

- https://github.com/inetol-infrastructure/searxng-container/issues/5

Related:

- https://github.com/searxng/searxng/issues/4405#issuecomment-2692352352

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-03 08:39:12 +02:00
Bnyro
590b211652 [fix] semantic scholar: method not allowed / engine doesn't work
Fixes the semantic scholar engine by extracting a ui version token.

BTW: remove html tags from the content.

Author's checklist:

- they are ratelimiting very fast, if you do approx more than 2 requests per
  minute, you have to wait some time again...

- they also have an official api at api.semanticscholar.org, but it's ratelimits
  are even harder

Closes: https://github.com/searxng/searxng/issues/4685
2025-05-02 16:46:38 +02:00
return42
41e3a0baa7 [data] update searx.data - update_engine_descriptions.py 2025-05-02 16:45:47 +02:00
BrandonStudio
d47cf9db24 [feat] engine ChinaSo: support source filter for ChinaSo-News
* filtering ChinaSo-News results by source, option ``chinaso_news_source``
* add ChinaSo engine to the online docs https://docs.searxng.org/dev/engines/online/chinaso.html
* fix SearXNG categories in the settings.yml
* deactivate ChinaSo engines ``inactive: true`` until [1] is fixed
* configure network of the ChinaSo engines

[1] https://github.com/searxng/searxng/issues/4694

Signed-off-by: @BrandonStudio
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2025-05-02 14:22:51 +02:00
searxng-bot
abcf8ca073 [l10n] update translations from Weblate
244e1f3f2 - 2025-04-30 - return42 <return42@noreply.codeberg.org>
2025-05-02 13:29:05 +02:00
Bnyro
fd33559cfb [fix] brave: fix images and videos engines 2025-04-30 08:28:04 +02:00
Denperidge
60e31eacfc [fix] pdia: dynamically fetch API key config file location
As suggested by @Bnyro at
https://github.com/searxng/searxng/pull/4652#discussion_r2055760390 !
2025-04-29 20:45:08 +02:00
return42
c45b970546 [data] update searx.data - update_currencies.py 2025-04-29 09:12:06 +02:00
Markus Heiser
a4251be397 [data] update searx.data - make data.traits
Related:

- https://github.com/searxng/searxng/pull/4687

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-04-29 09:11:33 +02:00
Markus Heiser
c20038e7c3 [fix] engine yahoo: replace fetch_traits by a list of languages
The Yahoo engine's fetch_traits function has been encountering an error in CI
jobs for several months [1], thus aborting the process for all other engines as
well.

The language selection dialog (which fetch_traits calls) requires an `EuConsent`
cookie. Strangely, the cookie is not needed for searching, which is why the
engine itself still works.

Since Yahoo won't be conquering any new marketplaces in the foreseeable future,
it should be sufficient to hard-implement the list of currently available
languages ​​(`yahoo_languages`).

[1] https://github.com/searxng/searxng/actions/runs/14720458830/job/41313149268

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-04-29 08:48:56 +02:00
return42
20e40ded6d [data] update searx.data - update_wikidata_units.py 2025-04-29 07:07:15 +02:00
return42
4de0766b76 [data] update searx.data - update_ahmia_blacklist.py 2025-04-29 07:06:49 +02:00
return42
6de83ca47f [data] update searx.data - update_firefox_version.py 2025-04-29 07:06:26 +02:00