Commit Graph

145 Commits

Author SHA1 Message Date
Don-Swanson 65326e37b4 Refactor User Agent handling in request.py and ua_generator.py
- Removed hardcoded User Agent strings and replaced them with a fallback mechanism using DEFAULT_FALLBACK_UA.
- Updated gen_user_agent function to ensure compatibility with older configurations.
- Bumped version to 1.1.1 to reflect changes in User Agent management.
2025-11-23 22:16:53 -06:00
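The fallback behaviour described in this commit might look roughly like the sketch below. Only the name DEFAULT_FALLBACK_UA comes from the commit message; its value, the `pick_user_agent` helper, and the `pool`/`configured` parameters are hypothetical stand-ins, not the project's actual code.

```python
import random
from typing import Optional, Sequence

# Placeholder value: the real constant's contents are not shown in this log.
DEFAULT_FALLBACK_UA = "Mozilla/5.0 (hypothetical fallback)"


def pick_user_agent(pool: Optional[Sequence[str]] = None,
                    configured: Optional[str] = None) -> str:
    """Return a configured UA, else a random pool entry, else the fallback."""
    if configured:
        return configured
    if pool:
        return random.choice(list(pool))
    # Nothing configured and no pool available: use the single fallback.
    return DEFAULT_FALLBACK_UA
```

The point of a single fallback constant is that no caller needs its own hardcoded string when configuration is missing.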
Don-Swanson 490fc6c4f9 UA: update ua_generator and generate_uas.py 2025-11-23 21:10:50 -06:00
Don-Swanson 9b3a6ce550 Update README and codebase to enhance User Agent handling
- Revised README to reflect changes in Google search behavior and Whoogle's response strategies.
- Implemented a User Agent pool for improved request handling, including fallback mechanisms.
- Added configuration options for displaying the User Agent in search results.
- Introduced a command-line tool for generating custom User Agent strings.
- Enhanced request headers to include additional parameters for better compatibility with Google services.
2025-11-23 20:35:08 -06:00
Don-Swanson 65c0c99dad Fix test_prefs_url 2025-10-01 22:21:17 -05:00
Don e0a4a5f2cb Merge pull request #1251 from rstefko/images-with-links
Re-enable view_image functionality
2025-09-30 21:09:11 -05:00
rstefko ca214cb563 Allow view_image on mobile too, to be able to see origin 2025-09-28 08:46:18 +02:00
rstefko 9dd33de91a Seems to be working again with new UA 2025-09-28 08:32:23 +02:00
Don-Swanson be83605c77 Update dependencies in requirements.txt and refactor file handling in app initialization and utility functions to use context managers for better resource management.
Adjust filter logic to utilize 'string' instead of 'text' for BeautifulSoup queries, enhancing compatibility with future versions.
2025-09-23 22:14:41 -05:00
Don-Swanson ffdeeb5f44 Enhance autocomplete functionality by adding environment variable check to enable/disable it globally.
Improve error handling in HTTP client for closed connections and add client recreation logic.
Refactor link extraction to avoid details elements in search results.
2025-09-23 21:37:21 -05:00
Don-Swanson 7f80eb1e51 feat(beta): httpx migration, Tor/proxy refactor, JSON results, alt-link fixes, tests, optional static bundling, HTTP/2 env toggle, cleanup 2025-09-21 00:11:54 -05:00
Ben Busby 1339c49dc5 Temporarily disable full size image search
The reliance on non-mobile user agents breaks the full size image search
config option, leading to empty image search result pages
2025-01-29 10:50:27 -07:00
Althior 88e2dda151 Update ad filter keyword for French language (#1208) 2025-01-16 17:31:50 -07:00
Ben Busby a016a1bcf4 Use raw string for matching regex in results.py
Fixes #1144
2024-09-30 11:41:13 -06:00
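As a small illustration of why a raw string matters for regex patterns: without the `r` prefix, a sequence such as `\d` is an invalid string escape that newer Python versions warn about. The pattern and the `first_number` helper below are hypothetical and are not taken from results.py.

```python
import re

# The r prefix hands the backslash to the regex engine unchanged.
DIGITS = re.compile(r"\d+")


def first_number(text: str):
    """Return the first run of digits in `text`, or None if there is none."""
    match = DIGITS.search(text)
    return match.group(0) if match else None
```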
Ben Busby 6924f5ce0d Update ad filter keyword for Dutch language
Fixes #1172
2024-09-30 11:33:50 -06:00
Ben Busby 6abe5511f4 Update ad filter keyword for Korean language
Fixes #1162
2024-09-30 11:32:29 -06:00
Andrew 58d54c6384 Update ad filter for Czech language (#1141) 2024-09-30 11:12:37 -06:00
Akopov 91112f1b7b Refactor utils/misc.py (#1165)
* Improving readability, removing unnecessary elements, etc.

* Minor changes to comment style and favicon response code

---------

Co-authored-by: Ben Busby <contact@benbusby.com>
2024-09-30 10:42:12 -06:00
Matt Burns a509169110 Add SO as a default site alternative (#1168)
SO -> farside.link/anonymousoverflow
2024-09-30 10:34:54 -06:00
David Shen f18bf07ac3 Fix feeling lucky (#1130)
* Fix feeling lucky, fall through to display results if it doesn't work

* Allow lucky bang anywhere

* Update feeling lucky test
2024-04-19 12:40:06 -06:00
David Shen fd20135af0 Add support for custom bangs (#1132)
Add the possibility for user-defined bangs, stored in app/static/bangs. 

These are parsed in alphabetical order, with the DDG bangs parsed first.
2024-04-19 12:26:42 -06:00
Ben Busby 2395bb7a6a Remove version from DDG bangs url
Including the version portion of the URL now redirects to search results
for the name of the bang file, rather than returning the bang file
itself. Removing the version from the URL returns the correct bang file.
2024-03-06 09:35:48 -07:00
Ben Busby cdbe550737 Add env vars for hiding favicons and removing daily update check
- WHOOGLE_SHOW_FAVICONS: Default on, can be set to 0 to hide favicons
  and skip the request for fetching them
- WHOOGLE_UPDATE_CHECK: Default on, can be set to 0 to disable the
  daily check for new versions released on GitHub

Closes #1098
Closes #1059
2023-12-20 11:28:00 -07:00
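A "default on, set to 0 to disable" variable could be read with something like the sketch below. The `env_flag` helper and its `environ` parameter are hypothetical; treating every value other than "0" as on is an assumption, since the commit message only documents the "0" case.

```python
import os


def env_flag(name: str, default: bool = True, environ=None) -> bool:
    """Treat an unset variable as `default`; the string "0" turns it off."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if value is None:
        return default
    # Assumption: only "0" disables; any other value leaves the flag on.
    return value.strip() != "0"
```

For example, `env_flag("WHOOGLE_SHOW_FAVICONS")` would return True unless the variable is set to 0.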
Gautam Korlam 9cc1004fb8 fix: correctly handle skip_prefix logic for site_alts (#1092)
Fixes #1091
2023-11-01 14:07:45 -06:00
Ben Busby 2950aa869b Redirect POST search -> enc GET request
This should fix the annoyance with browsers like Firefox not caching
POST request responses. By redirecting a POST search to be a GET request
instead (with an encrypted query string), the page can be cached and
successfully navigated back to after visiting a result.
2023-10-16 16:28:36 -06:00
Ben Busby 7bda165ca3 Fetch fallback site icons from DDG
DDG provides favicons using the url format
icons.duckduckgo.com/ip2/{site}.ico

This can be used to fetch favicons in the event that the default
"/favicon.ico" path does not work.
2023-10-11 17:26:12 -06:00
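Building the fallback icon URL from the format quoted above could look like this minimal sketch. The `ddg_icon_url` helper is hypothetical, and the `https://` scheme is an assumption, since the commit message gives the URL format without one.

```python
from urllib.parse import urlparse


def ddg_icon_url(result_url: str) -> str:
    """Build the DDG favicon URL for the host of `result_url`."""
    # Bare hostnames have no scheme, so urlparse finds no hostname;
    # in that case the input itself is used as the site.
    host = urlparse(result_url).hostname or result_url
    return f"https://icons.duckduckgo.com/ip2/{host}.ico"
```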
Ben Busby c2873190c9 Display audio controls, refactor site icon placement
Audio controls are now always shown by default (mostly found in searches
that contain word pronunciation guides).

Site icons were moved to the left side of the results.
2023-10-11 15:41:48 -06:00
MoistCat 693ca3a9a8 Fix invalid calculator widget path (#1064)
When starting whoogle from another directory, the path to the calculator
widget was previously invalid. It now specifies the path relative to the widget
loader file.
2023-09-13 14:13:21 -06:00
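Resolving a resource relative to the loader file, rather than the current working directory, is commonly done along these lines. The `widget_path` helper and the `calculator.html` filename are hypothetical and only illustrate the technique.

```python
from pathlib import Path


def widget_path(loader_file: str, filename: str) -> Path:
    """Resolve `filename` next to the loader module, not the working dir."""
    return Path(loader_file).resolve().parent / filename
```

Inside the loader module this would be called as `widget_path(__file__, "calculator.html")`, so the result no longer depends on the directory the app was started from.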
Ben Busby a35b1dabbc Use filtered query for map tab
The map tab should only ever pass the raw query (i.e. no "-site:..."
strings), otherwise the maps page will return an error.

Fixes #1048
2023-09-08 16:44:04 -06:00
Ben Busby a623210244 Match exact words to trigger calculator widget
The calculator was previously triggered for partial matches with words
like "calc", which meant searches containing the word "calcium" would
cause the calculator widget to appear.
2023-09-08 16:19:39 -06:00
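One way to require exact word matches is a regex with word boundaries, sketched below. The trigger words and the `wants_calculator` helper are hypothetical; the commit message does not show the actual matching code.

```python
import re

# Hypothetical trigger words; \b anchors each one to whole-word boundaries.
CALC_TRIGGER = re.compile(r"\b(calc|calculator)\b", re.IGNORECASE)


def wants_calculator(query: str) -> bool:
    """True only if a trigger word appears as a whole word in the query."""
    return CALC_TRIGGER.search(query) is not None
```

With this pattern, "calc 2+2" triggers the widget while "calcium supplements" does not.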
Ben Busby f65529f328 Allow defining custom redirects with WHOOGLE_REDIRECTS
Redirects to alternative frontends can now be defined using the
WHOOGLE_REDIRECTS environment variable. Usage is documented in the
readme, but is basically defined as <parent>:<new>.

Closes #988
2023-05-19 12:15:15 -06:00
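Parsing a value in the `<parent>:<new>` form could be sketched as follows. The `parse_redirects` helper is hypothetical, and separating multiple pairs with commas is an assumption here; the readme referenced above is the authority on the exact syntax.

```python
def parse_redirects(raw: str) -> dict:
    """Parse "<parent>:<new>" pairs into a {parent: new} mapping."""
    redirects = {}
    # Assumption: multiple pairs are comma separated.
    for entry in raw.split(","):
        entry = entry.strip()
        if ":" not in entry:
            continue  # skip empty or malformed entries
        parent, new = (part.strip() for part in entry.split(":", 1))
        if parent and new:
            redirects[parent] = new
    return redirects
```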
Roman Štefko 2cb4b9e3ca Allow setting mobile/desktop UAs using env vars (#1003)
Defines separate environment variables for setting mobile vs desktop user
agents

Defines an environment variable for using the client's User-Agent

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-05-19 11:32:05 -06:00
Ben Busby b39ba0533a Suppress spurious warnings from bs4
More MarkupResemblesLocatorWarning warnings have been appearing. This
seems to be caused by parsing HTML content that contains a URL.

This new change suppresses the warning at the root level of the app
before any content has been parsed, so this warning shouldn't appear
again.

Fixes #968
2023-03-22 12:29:05 -06:00
Ben Busby 8c426ab180 Suppress invalid warning from bs4, add 404 handler
An invalid parsing warning was being thrown by the latest version of the
bs4 library. This suppresses that warning from being shown in the
console.

A 404 handler was added to move logging from the console to the error
template, since a lot of users assumed that 404 errors from the result
page were problems with Whoogle itself.

Fixes #967
2023-03-07 11:28:55 -07:00
João baa8bd0eb4 Add auth to cookie (#964)
When authenticated, the cookie that is set allows the user to stay logged in even
if the browser is restarted.

Fixes #951
2023-03-01 09:58:59 -07:00
Ben Busby 6b56dab4c1 Remove ig->bibliogram redirects
Bibliogram has been discontinued, and the remaining instances aren't
very reliable. As a result, all Instagram redirects have been removed.

Fixes #955
2023-02-21 09:42:42 -07:00
elliot 7ca69e752d Add calculator widget (#956)
This adds a simple calculator widget, somewhat similar to the one presented
when searching "calculator" on Google.

It also adds a rough template, via the app/utils/widgets.py file, that makes
adding new widgets easier. My eventual plan is to use this to create more
widgets that appear in Google, such as a color picker, timer, etc.

---------

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-02-21 09:36:38 -07:00
Ben Busby 991fe6d910 Exclude subdomain in Medium->Scribe redirects
Medium redirects needed further cleanup to account for instances where a
link contains a subdomain that would not make sense in a Farside
redirect link.

Fixes #947
2023-02-04 16:36:16 -07:00
Ben Busby 12ce174b9a Include url prefix for reverse proxied instances
The url prefix was not included when reconstructing the root url using
X-Forwarded-* headers, causing some elements to fail to load properly.

Fixes #937
2023-01-30 12:13:46 -07:00
Ahmad Alkadri e5a5aad997 Always bold CN/JA/KO search terms (#928)
Add a function to check if target_word contains CJK characters

If a search term contains Chinese, Japanese, or Korean characters,
the term is bolded in search results regardless of whitespace.

CJK characters: Chinese, Japanese (hiragana, katakana, kanji), 
and Korean (hangul syllables, hangul jamo)

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-01-09 12:54:41 -07:00
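A check for the character classes listed in this commit can be sketched with Unicode block ranges. The `has_cjk` name and the exact set of ranges are assumptions made for illustration; the project's function may cover additional blocks.

```python
# Unicode blocks for the scripts named in the commit message.
CJK_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (Chinese, kanji)
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0xAC00, 0xD7AF),  # Hangul Syllables
    (0x1100, 0x11FF),  # Hangul Jamo
)


def has_cjk(word: str) -> bool:
    """True if any character of `word` falls inside one of the ranges."""
    return any(lo <= ord(ch) <= hi for ch in word for lo, hi in CJK_RANGES)
```

Because these scripts are usually written without spaces, a term that passes this check can be bolded wherever it occurs instead of only at whitespace boundaries.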
Charles Zawacki cec10e81d3 Don't prepend to services that have schemes with '//' (#925) 2023-01-04 10:10:32 -07:00
Charles Zawacki a760476d1b Omit 'mobile.' and 'm.' in site alt replacements (#922)
Resolves #921
2023-01-03 10:19:39 -07:00
Ben Busby c24caceb03 Omit "www." in site alt replacements
Fixes #913
2022-12-29 16:16:29 -07:00
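Dropping these host prefixes before a site alt replacement might be sketched as below. The `strip_prefix` helper is hypothetical; the prefix list combines "www." from this commit with 'mobile.' and 'm.' from the later commit listed above.

```python
# Prefixes named in the two commit messages.
PREFIXES = ("www.", "mobile.", "m.")


def strip_prefix(hostname: str) -> str:
    """Remove one leading www./mobile./m. label, if present."""
    for prefix in PREFIXES:
        if hostname.startswith(prefix):
            return hostname[len(prefix):]
    return hostname
```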
Ahmad Alkadri 3dda8b25ef Escape html text in result body (#912)
Moved the cleaner functions to app/utils/escaper.py

Removed unused import 're'

Moved the cleaner functionalities to the "search.py" and "routes.py"

Making sure escaped chars stay escaped during process

Replaced "&lt;" and "&gt;" with "andlt;" and "andgt;", respectively. This way,
when the 'response' object gets loaded into bsoup (which happens several times
throughout the process between search.py and routes.py), bsoup will not
unescape them.
2022-12-29 15:19:28 -07:00
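The placeholder substitution described above can be sketched as a pair of string replacements. Both helper names are hypothetical, and the restore step is an assumption: the commit message only describes the replacement itself.

```python
def protect_entities(html: str) -> str:
    """Swap escaped angle brackets for placeholders a parser leaves alone."""
    return html.replace("&lt;", "andlt;").replace("&gt;", "andgt;")


def restore_entities(html: str) -> str:
    """Assumed inverse step: turn the placeholders back into entities."""
    return html.replace("andlt;", "&lt;").replace("andgt;", "&gt;")
```

One caveat of this approach is that text which already contains the literal string "andlt;" would also be rewritten by the restore step.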
Ben Busby 3dc6d14377 Only extract domain+ext when using site alts
Parent sites using a 'www' subdomain or something similar were not
redirecting properly. This updates the hostname check to only validate
against the primary domain, except for Wikipedia since the subdomain is
used for interface translation in that case.

Fixes #901
2022-12-08 10:54:21 -07:00
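Reducing a hostname to domain plus extension, with the Wikipedia exception described above, could look like this rough sketch. The `primary_domain` helper is hypothetical, and keeping only the last two labels is a simplification that would mishandle suffixes such as "co.uk".

```python
from urllib.parse import urlparse


def primary_domain(url: str) -> str:
    """Return domain+ext for a URL, keeping Wikipedia's subdomain."""
    host = urlparse(url).hostname or ""
    if host.endswith("wikipedia.org"):
        return host  # the subdomain selects the interface language
    # Simplification: keep only the last two dot-separated labels.
    return ".".join(host.split(".")[-2:])
```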
Ben Busby fd85f1573a Refactor site alt link replacement
Replacing result links and text when site alts are enabled is now part
of its own function, and handles replacement of link location and link
description separately.

Fixes #880
2022-12-05 13:28:29 -07:00
Ben Busby 0310f0f542 Use app init enc key by default for all queries
This can be updated later to allow users with cookies enabled to use a
key that is unique to their session (if they want, not mandatory), but
for now it makes more sense to just use a single key for all queries
from all users. This should eliminate a lot of issues that users have
reported where they are unable to decrypt queries or page elements due
to an expired/renewed session key.
2022-12-05 12:14:14 -07:00
Ben Busby 3bd785b9b7 Update sponsored result filter for german results
Adds 'gesponsert' to ad keyword blacklist

Fixes #892
2022-11-28 10:18:10 -07:00
Ben Busby 09a90ec46a Match only "//medium" and ".medium.com" for scribe links
Closes #885
2022-11-22 17:34:25 -07:00
Xabi 6bd48e40a7 Include new ad filter keyword (#879)
Adds "sponsored" result keyword for Spanish language
2022-11-07 20:50:27 -07:00
Ben Busby 06fd29f663 Update ad filter keywords
Recent changes to Google Search now include ads prefixed with the keyword
"sponsored". This update should keep these from appearing in search
results.

Fixes #871
2022-10-31 13:02:20 -06:00