This has been a regression introduced with the removal of
the unmaintained autocomplete.js library.
We should only focus the search bar on the main search page at `/`
and not at the results page located at `/search`.
I'm not sure if there's a better way to figure out if
we're on the results page than checking if the id of the
main element is `#main_results`, checking the path
obviously isn't a better solution because it can differ
depending on the instance / reverse proxy / ....
- related to 32823ecb69
- closes https://github.com/searxng/searxng/issues/4846
The previous implementation for determining the description of an engine did not
take into account that the UI languages can also have a region tag and/or a
script tag:
el-GR: Ελληνικά, Ελλάδα (Greek, Greece)
fa-IR: فارسی, ایران (Persian, Iran)
nb-NO: Norsk bokmål, Norge (Norwegian bokmål, Norway)
nl-BE: Nederlands, België (Dutch, Belgium)
pt-BR: Português, Brasil (Portuguese, Brazil)
zh-HK: 中文, 中國香港特別行政區 (Chinese, Hong Kong SAR China)
zh-Hans-CN: 中文, 中国 (Chinese, China)
zh-Hant-TW: 中文, 台灣 (Chinese, Taiwan)
Closes: https://github.com/searxng/searxng/issues/4842
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Sometimes (e.g. when ddg does not have a result item) there is no content and
the engine will fail with an IndexError:
* Error: IndexError
* Percentage: 10
* Parameters: `()`
* File name: `searx/engines/duckduckgo.py:375`
* Function: `response`
* Code: `item["content"] = extract_text(eval_xpath(div_result, './/a[contains(@class, "result__snippet")]')[0])`
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Changes:
- Add trailing slash to base URL to prevent potential redirects
- Remove advanced search syntax filtering (no longer guarantees a CAPTCHA)
- Correct pagination offset calculation: Page 2 now starts at offset 10,
subsequent pages use 10 + (n-2)*15 formula instead of the previous
broken 20 + (n-2)*50 calculation that caused CAPTCHAs
- Restructure request parameter building to better match a real request
- "kt" cookie is no longer an empty string if the language/region is "all"
- Group related parameter assignments together
- Add header logging to debugging output
Related:
- https://github.com/searxng/searxng/issues/4824
There is currently no known z-library, and all known URLs are dead [1]. To avoid
interrupting automated updates, a connection error to a z-library is treated as
a *known error*, and the old properties of the z-library are retained.
[1] https://github.com/searxng/searxng/issues/3610
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In the previous implementation, all databases were loaded into memory when
importing the searx.data package, regardless of whether they were ever needed.
Regardless of this, it is an antipattern to load entire databases into memory
when importing a package or module; databases should be loaded when needed.
Lazy loading is a first step toward improving memory usage and also improves
performance when setting up the runtime environment. Building on this,
subsequent PRs will be able to further optimize memory behavior, e.g., by using
a real database application such as the one already available via
searx.cache.ExpireCache
Related:
- https://github.com/searxng/searxng/discussions/1892
- https://github.com/searxng/searxng/pull/3458
- https://github.com/searxng/searxng/pull/4650
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Temporarily remove the -e flag from set to prevent entrypoint.sh from stopping execution if any command returns a non-zero status. This doesn't solve anything but relaxes the script checks.
Related https://github.com/searxng/searxng/issues/4818
- apparently the API now requires a `X-Pinterest-PWS-Handler` in order to
properly function (extracted from their web UI)
- the other `X-Pinterest` headers here are added in case they become mandatory
too
Closes: https://github.com/searxng/searxng/issues/4812
Icons category makes sense because it allows to quickly search for free SVG
icons to use for websites / other designs with a quick `!icons` query
Icons don't seem to fit into the normal images category that well because icons
are quite a special type of images
The global variable CACHE is not initialized when DDG images or DDG videos
import the get_vqd() function (please remember: the engine modules are imported
using the importlib method and not via the `import` keyword).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
That entrypoint is prone to screw things up, especially with permission handling. The new script handles initialization better and fixes some issues like delayed settings update via ENVs and timestamp overwriting, also adjusts what should be copied into the container.
Related https://github.com/searxng/searxng/pull/4721#issuecomment-2850272129
The wolfi-base metapackage includes busybox, ca-certificates-bundle and the package manager. The change is to make the use of base-builder image more flexible.
Instead of using Wolfi base images from cgr.dev and making that mess on the Dockerfile, why don't we build the base images ourselves from Wolfi repos with apko? The intention of this is to simplify the main Dockerfile and avoid having to patch the base image every time, it also simplifies some steps like image ownership management and provides extremely fast builds.
Wolfi OS images are specifically designed for container use. Using a specially designed base image for containers not only reduces maintenance burdens, but improves overall experience for developers (fewer packages we have to track) and end users (smaller images).
Discussion here: https://github.com/searxng/searxng/issues/4753
If the workflow is executed with the "workflow_dispatch" trigger, the user who executed the workflow becomes the author of the commit on the PR, this is not intended.
It also reverts the body param so that the default text of the action does not appear.
`checker.yml` and `integration.yml` are the only workflows that are currently safe to be executed simultaneously, the others present a risk that the order of completion may not be expected. The ones that are chained from `integration.yml` can be called as many times as `integration.yml` workflows are running at that moment, the same with the trigger "workflow_dispatch".
This can be fatal for workflows like `container.yml` that use a centralized cache to store and load the candidate images in a common tag called "searxng-<arch>".
* For example, a `container.yml` workflow is executed after being chained from `integration.yml` (called "~1"), and seconds later it may be triggered again because another PR merged some breaking changes (called "~2"). While "~1" has already passed the test job successfully and is about to start the release job, "~2" finishes building the container and overwrites the references on the common tag. When "~1" in the release job loads the images using the common tag, it will load the container of "~2" instead of "~1" having skipped the whole test job process.
The example is only set for the container workflow, but the other workflows might occur in a similar way.
Currently, we have 1100~ cache images uploaded to GHCR that weigh more than 300 MB each (most of them are layers from the second phase of the Dockerfile that were uploaded by mistake, read below). To avoid problems, I have set up a new job in a new workflow to be run weekly purging all images older than 1 week, but leaving always the 100 most recent ones.
Only the builder images should be uploaded to cache, the actual behaviour not only slows down the time for building the container, but also wastes lots of space by saving large and useless layers to GHCR that will never be used again.