The bootload time is now measured and recorded. Keeping an eye on the bootload
time is motivated by [1]: there, the bootload of the CURRENCIES cache takes
about 2/3 of the time required to initialize SearXNG::
6.068 <module> searx/webapp.py:1
└─ 6.068 wrapper pyinstrument/context_manager.py:52
├─ 5.538 init searx/webapp.py:1373
│ ├─ 4.822 initialize searx/search/__init__.py:34
│ │ ├─ 4.631 ProcessorMap.init searx/search/processors/__init__.py:47
│ │ │ ├─ 4.607 OnlineCurrencyProcessor.initialize searx/search/processors/online_currency.py:55
│ │ │ │ └─ 4.607 CurrenciesDB.init searx/data/currencies.py:25
│ │ │ │ ├─ 4.601 CurrenciesDB.load searx/data/currencies.py:34
│ │ │ │ │ ├─ 4.572 ExpireCacheSQLite.set searx/cache.py:334
In the example, the CurrenciesDB.init call takes 4.6 seconds; on my laptop,
CurrenciesDB.init takes only 0.7 seconds. The absolute values depend on
external conditions, but I already consider 4-5 seconds very long. To test::
$ rm /tmp/sxng_cache_*.db*
$ make run 2>&1 | grep "searx.data.CURRENCIES"
DEBUG searx.data : init searx.data.CURRENCIES
DEBUG searx.data : init searx.data.CURRENCIES added 9089 items in 0.7623255252838135 sec.
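For illustration, a minimal sketch of such a measurement; the cache_set
callable and the function name are assumptions for this example, not taken
from the SearXNG sources::

    import logging
    import time

    logger = logging.getLogger("searx.data")

    def timed_load(cache_set, rows):
        # measure and log how long the initial cache load takes, comparable
        # to the DEBUG line shown above (cache_set is a hypothetical setter)
        start = time.time()
        count = 0
        for key, value in rows:
            cache_set(key, value)
            count += 1
        logger.debug("added %s items in %s sec.", count, time.time() - start)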
[1] https://github.com/searxng/searxng/issues/5223#issuecomment-3323083411
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
* [fix] CI task "update_engine_traits.py" fails
To catch all problems with an HTTP request, the more general exception class
``httpx.HTTPError`` must be caught. To test, use::
$ ./manage dev.env
$ python ./searxng_extra/update/update_engine_traits.py
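A minimal sketch of the pattern, using a hypothetical fetch helper rather than
the actual update script::

    import httpx

    def fetch(url: str) -> str | None:
        try:
            resp = httpx.get(url, timeout=10)
            resp.raise_for_status()
            return resp.text
        except httpx.HTTPError as exc:
            # httpx.HTTPError covers connection errors, timeouts and
            # HTTP status errors alike
            print(f"request to {url} failed: {exc!r}")
            return None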
Closes: https://github.com/searxng/searxng/issues/5068
* [data] update searx.data - update_engine_traits.py
Depending on runtime behavior, it could happen that the initial loading of the
DB tables into the cache was performed multiple times and in parallel. The
concurrent accesses then led to the `sqlite3.OperationalError: database is
locked` exception reported in #4951.
Since this problem depends heavily on timing (e.g., how long it takes to
retrieve the content for a table), the error could not be observed in all
installations.
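One way to serialize the initial load is sketched below; the helper names and
the cache.set(key, value, ctx=...) call are assumptions for this sketch, not
the actual implementation in searx.cache::

    import threading

    _init_lock = threading.Lock()
    _loaded_tables: set[str] = set()

    def init_table_once(cache, table: str, fetch_rows):
        # only one thread performs the initial load of a table; concurrent
        # callers wait instead of writing to the SQLite file in parallel
        # ("database is locked")
        with _init_lock:
            if table in _loaded_tables:
                return
            for key, value in fetch_rows():
                cache.set(key, value, ctx=table)  # hypothetical cache API
            _loaded_tables.add(table)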
Closes: https://github.com/searxng/searxng/issues/4951
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The tracker data is now loaded on demand directly into the cache, so that
maintaining this data via PRs is no longer necessary.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In the previous implementation, all databases were loaded into memory when the
searx.data package was imported, regardless of whether they were ever needed.
Apart from that, loading entire databases into memory when importing a package
or module is an antipattern; databases should be loaded when they are needed.
Lazy loading is a first step toward improving memory usage and also improves
performance when setting up the runtime environment. Building on this,
subsequent PRs will be able to further optimize memory behavior, e.g., by using
a real database application such as the one already available via
searx.cache.ExpireCache.
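As a rough sketch of the lazy-loading idea (the file layout and function name
are assumptions for this example, not the searx.data API)::

    import json
    import pathlib
    from functools import lru_cache

    # assumption: the JSON data files ship next to the module
    DATA_DIR = pathlib.Path(__file__).parent

    @lru_cache(maxsize=None)
    def get_dataset(name: str) -> dict:
        # the JSON file is read on first access and cached afterwards,
        # instead of being loaded at import time
        return json.loads((DATA_DIR / f"{name}.json").read_text(encoding="utf-8"))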
Related:
- https://github.com/searxng/searxng/discussions/1892
- https://github.com/searxng/searxng/pull/3458
- https://github.com/searxng/searxng/pull/4650
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>