Merge branch 'benbusby:main' into main
commit
2378500d2f
14
README.md
@@ -85,7 +85,7 @@ Provides:
 - Free HTTPS url (https://\<your app name\>.herokuapp.com)
 - Downtime after periods of inactivity \([solution](https://github.com/benbusby/whoogle-search#prevent-downtime-heroku-only)\)
 
 Notes:
 - Requires a (free) Heroku account
 - Sometimes has issues with auto-redirecting to `https`. Make sure to navigate to the `https` version of your app before adding as a default search engine.
 
@@ -196,7 +196,7 @@ Description=Whoogle
 #Environment=WHOOGLE_PROXY_LOC=<proxy host/ip>
 # Site alternative configurations, uncomment to enable
 # Note: If not set, the feature will still be available
 # with default values.
 #Environment=WHOOGLE_ALT_TW=nitter.net
 #Environment=WHOOGLE_ALT_YT=invidious.snopyta.org
 #Environment=WHOOGLE_ALT_IG=bibliogram.art/u
@@ -422,7 +422,7 @@ Note: You should have your own domain name and [an https certificate](https://le
 - Docker image: Set the environment variable HTTPS_ONLY=1
 - Pip/Pipx: Add the `--https-only` flag to the end of the `whoogle-search` command
 - Default `run` script: Modify the script locally to include the `--https-only` flag at the end of the python run command
 
 ### Using with Firefox Containers
 Unfortunately, Firefox Containers do not currently pass through `POST` requests (the default) to the engine, and Firefox caches the opensearch template on initial page load. To get around this, you can take the following steps to get it working as expected:
 
@@ -457,7 +457,7 @@ Under the hood, Whoogle is a basic Flask app with the following structure:
 - CSS/Javascript files, should be self-explanatory
 - `static/settings`
 - Key-value JSON files for establishing valid configuration values
 
 
 If you're new to the project, the easiest way to get started would be to try fixing [an open bug report](https://github.com/benbusby/whoogle-search/issues?q=is%3Aissue+is%3Aopen+label%3Abug). If there aren't any open, or if the open ones are too stale, try taking on a [feature request](https://github.com/benbusby/whoogle-search/issues?q=is%3Aissue+is%3Aopen+label%3Aenhancement). Generally speaking, if you can write something that has any potential of breaking down in the future, you should write a test for it.
 
@@ -476,7 +476,7 @@ def contains(x: list, y: int) -> bool:
     """
 
     return y in x
 ```
 
 #### Translating
 
@@ -509,6 +509,7 @@ A lot of the app currently piggybacks on Google's existing support for fetching
 | [https://search.exonip.de](https://search.exonip.de) | 🇳🇱 NL | Multi-choice | |
 | [https://s.alefvanoon.xyz](https://s.alefvanoon.xyz) | 🇺🇸 US | Multi-choice | ✅ |
 | [https://www.whooglesearch.ml](https://www.whooglesearch.ml) | 🇺🇸 US | English | |
+| [https://search.sethforprivacy.com](https://search.sethforprivacy.com) | 🇩🇪 DE | English | |
 
 * A checkmark in the "Cloudflare" category here refers to the use of the reverse proxy, [Cloudflare](https://cloudflare.com). The checkmark will not be listed for a site which uses Cloudflare DNS but rather the proxying service which grants Cloudflare the ability to monitor traffic to the website.
 
@@ -516,7 +517,8 @@ A lot of the app currently piggybacks on Google's existing support for fetching
 
 | Website | Country | Language |
 |-|-|-|
 | [http://whoglqjdkgt2an4tdepberwqz3hk7tjo4kqgdnuj77rt7nshw2xqhqad.onion](http://whoglqjdkgt2an4tdepberwqz3hk7tjo4kqgdnuj77rt7nshw2xqhqad.onion) | 🇺🇸 US | Multi-choice
+| [http://nuifgsnbb2mcyza74o7illtqmuaqbwu4flam3cdmsrnudwcmkqur37qd.onion](http://nuifgsnbb2mcyza74o7illtqmuaqbwu4flam3cdmsrnudwcmkqur37qd.onion) | 🇩🇪 DE | English
 
 ## Screenshots
 #### Desktop
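Reviewer note on the README hunks above: the systemd comments state that the site-alternative feature still works with default values when the `WHOOGLE_ALT_*` variables are unset. A minimal sketch of that fallback pattern is below. The helper name `site_alt` and the `DEFAULT_ALTS` mapping are hypothetical, the default hosts are simply the values from the commented-out `Environment=` lines, and none of this is claimed to be Whoogle's actual lookup code.

```python
import os

# Hypothetical defaults, copied from the commented-out Environment= lines
DEFAULT_ALTS = {
    'WHOOGLE_ALT_TW': 'nitter.net',
    'WHOOGLE_ALT_YT': 'invidious.snopyta.org',
    'WHOOGLE_ALT_IG': 'bibliogram.art/u',
}


def site_alt(name: str, env=None) -> str:
    """Return the configured alternative host, or the default if unset."""
    # Allow a plain dict to be injected so the lookup is easy to test
    env = os.environ if env is None else env
    return env.get(name) or DEFAULT_ALTS[name]


# Unset variable: falls back to the default host
print(site_alt('WHOOGLE_ALT_TW', env={}))
# Set variable: the configured host wins
print(site_alt('WHOOGLE_ALT_YT', env={'WHOOGLE_ALT_YT': 'yt.example.org'}))
```

Passing `env` explicitly keeps the sketch independent of the process environment; a real service would read `os.environ` directly.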
@@ -1,3 +1,4 @@
+from app.models.config import Config
 from app.models.endpoint import Endpoint
 from app.request import VALID_PARAMS, MAPS_URL
 from app.utils.misc import read_config_bool
@@ -45,18 +46,8 @@ class Filter:
     # type result (such as "people also asked", "related searches", etc)
     RESULT_CHILD_LIMIT = 7
 
-    def __init__(self, user_key: str, mobile=False, config=None) -> None:
-        if config is None:
-            config = {}
-        self.near = config['near'] if 'near' in config else ''
-        self.dark = config['dark'] if 'dark' in config else False
-        self.nojs = config['nojs'] if 'nojs' in config else False
-        self.new_tab = config['new_tab'] if 'new_tab' in config else False
-        self.alt_redirect = config['alts'] if 'alts' in config else False
-        self.block_title = (
-            config['block_title'] if 'block_title' in config else '')
-        self.block_url = (
-            config['block_url'] if 'block_url' in config else '')
+    def __init__(self, user_key: str, config: Config, mobile=False) -> None:
+        self.config = config
         self.mobile = mobile
         self.user_key = user_key
         self.main_divs = ResultSet('')
@@ -69,16 +60,6 @@ class Filter:
     def elements(self):
         return self._elements
 
-    def reskin(self, page: str) -> str:
-        # Aesthetic only re-skinning
-        if self.dark:
-            page = page.replace(
-                'fff', '000').replace(
-                '202124', 'ddd').replace(
-                '1967D2', '3b85ea')
-
-        return page
-
     def encrypt_path(self, path, is_element=False) -> str:
         # Encrypts path to avoid plaintext results in logs
         if is_element:
@@ -109,7 +90,7 @@ class Filter:
 
         input_form = soup.find('form')
         if input_form is not None:
-            input_form['method'] = 'POST'
+            input_form['method'] = 'GET' if self.config.get_only else 'POST'
 
         # Ensure no extra scripts passed through
         for script in soup('script'):
@@ -143,9 +124,7 @@ class Filter:
             _ = div.decompose() if len(div_ads) else None
 
     def remove_block_titles(self) -> None:
-        if not self.main_divs:
-            return
-        if self.block_title == '':
+        if not self.main_divs or not self.config.block_title:
             return
         block_title = re.compile(self.block_title)
         for div in [_ for _ in self.main_divs.find_all('div', recursive=True)]:
@@ -154,9 +133,7 @@ class Filter:
             _ = div.decompose() if len(block_divs) else None
 
     def remove_block_url(self) -> None:
-        if not self.main_divs:
-            return
-        if self.block_url == '':
+        if not self.main_divs or not self.config.block_url:
             return
         block_url = re.compile(self.block_url)
         for div in [_ for _ in self.main_divs.find_all('div', recursive=True)]:
@@ -244,7 +221,7 @@ class Filter:
         if src.startswith(LOGO_URL):
             # Re-brand with Whoogle logo
             element.replace_with(BeautifulSoup(
-                render_template('logo.html', dark=self.dark),
+                render_template('logo.html'),
                 features='html.parser'))
             return
         elif src.startswith(GOOG_IMG) or GOOG_STATIC in src:
@@ -323,10 +300,10 @@ class Filter:
             link['href'] = filter_link_args(q)
 
             # Add no-js option
-            if self.nojs:
+            if self.config.nojs:
                 append_nojs(link)
 
-            if self.new_tab:
+            if self.config.new_tab:
                 link['target'] = '_blank'
         else:
             if href.startswith(MAPS_URL):
@@ -336,7 +313,7 @@ class Filter:
             link['href'] = href
 
         # Replace link location if "alts" config is enabled
-        if self.alt_redirect:
+        if self.config.alts:
             # Search and replace all link descriptions
             # with alternative location
             link['href'] = get_site_alt(link['href'])
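Reviewer note on the `Filter` hunks above: the change replaces per-field copies of a config dict with reads from a single config object, and folds the two early returns in `remove_block_titles` / `remove_block_url` into one guard. A minimal, self-contained sketch of that pattern follows. `DemoConfig` and `remove_blocked` are hypothetical stand-ins operating on plain strings; the real methods work on BeautifulSoup divs and the real `Config` model is not shown in this diff.

```python
import re
from dataclasses import dataclass


@dataclass
class DemoConfig:
    """Hypothetical stand-in for the Config model; not the real class."""
    block_title: str = ''
    nojs: bool = False


def remove_blocked(titles, config):
    """Drop titles matching config.block_title, read straight from config."""
    # Single guard: nothing to filter, or no pattern configured
    if not titles or not config.block_title:
        return list(titles)
    pattern = re.compile(config.block_title)
    return [title for title in titles if not pattern.search(title)]


print(remove_blocked(['Pinterest board', 'Docs'],
                     DemoConfig(block_title='Pinterest')))
print(remove_blocked(['Pinterest board'], DemoConfig()))
```

Reading the pattern from `config` at call time means there is only one place where the setting lives, which is the apparent motivation for dropping the copied attributes.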
@@ -59,7 +59,7 @@ def gen_user_agent(is_mobile) -> str:
     return DESKTOP_UA.format("Mozilla", linux, firefox)
 
 
-def gen_query(query, args, config, near_city=None) -> str:
+def gen_query(query, args, config) -> str:
     param_dict = {key: '' for key in VALID_PARAMS}
 
     # Use :past(hour/day/week/month/year) if available
@@ -96,8 +96,8 @@ def gen_query(query, args, config, near_city=None) -> str:
         param_dict['start'] = '&start=' + args.get('start')
 
     # Search for results near a particular city, if available
-    if near_city:
-        param_dict['near'] = '&near=' + urlparse.quote(near_city)
+    if config.near:
+        param_dict['near'] = '&near=' + urlparse.quote(config.near)
 
     # Set language for results (lr) if source isn't set, otherwise use the
     # result language param provided in the results
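Reviewer note on the `gen_query` hunks above: the city is now read from `config.near` instead of a separate `near_city` argument, and is still percent-encoded before being appended. A small sketch of that step, assuming `urlparse` in the diff is an alias for the standard library's `urllib.parse`; the helper `near_param` and the `SimpleNamespace` config are hypothetical and only illustrate the encoding.

```python
from types import SimpleNamespace
from urllib.parse import quote


def near_param(config) -> str:
    """Build the '&near=' query fragment from a config-like object."""
    # An empty or missing city contributes nothing to the query string
    if not config.near:
        return ''
    return '&near=' + quote(config.near)


# Hypothetical config objects standing in for the real Config model
print(near_param(SimpleNamespace(near='New York')))  # &near=New%20York
print(repr(near_param(SimpleNamespace(near=''))))    # ''
```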
@@ -119,8 +119,7 @@ class Search:
                                 config=self.config)
         full_query = gen_query(self.query,
                                self.request_params,
-                               self.config,
-                               content_filter.near)
+                               self.config)
 
         # force mobile search when view image is true and
         # the request is not already made by a mobile
@@ -132,7 +131,7 @@ class Search:
                                          force_mobile=view_image)
 
         # Produce cleanable html soup from response
-        html_soup = bsoup(content_filter.reskin(get_body.text), 'html.parser')
+        html_soup = bsoup(get_body.text, 'html.parser')
 
         # Replace current soup if view_image is active
         if view_image:
@@ -1,5 +1,6 @@
 from bs4 import BeautifulSoup
 from app.filter import Filter
+from app.models.config import Config
 from app.models.endpoint import Endpoint
 from app.utils.session import generate_user_key
 from datetime import datetime
@@ -11,7 +12,7 @@ from test.conftest import demo_config
 
 def get_search_results(data):
     secret_key = generate_user_key()
-    soup = Filter(user_key=secret_key).clean(
+    soup = Filter(user_key=secret_key, config=Config(**demo_config)).clean(
         BeautifulSoup(data, 'html.parser'))
 
     main_divs = soup.find('div', {'id': 'main'})
@@ -74,7 +75,7 @@ def test_block_results(client):
 
     assert has_pinterest
 
-    demo_config['block'] = 'pinterest.com'
+    demo_config['block'] = 'pinterest.com,help.pinterest.com'
     rv = client.post(f'/{Endpoint.config}', data=demo_config)
     assert rv._status_code == 302
 
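Reviewer note on the test hunks above: the test now builds the filter's settings with `Config(**demo_config)`, and the `block` value becomes a comma-separated pair of sites. A rough sketch of how keyword unpacking with defaults can work is below. `DemoConfig` is a hypothetical class, not the real `Config`, and splitting `block` on commas is an assumption about how the value is meant to be read.

```python
class DemoConfig:
    """Hypothetical config built from keyword arguments with defaults."""

    def __init__(self, **kwargs):
        # Missing keys fall back to neutral defaults instead of raising
        self.block = kwargs.get('block', '')
        self.near = kwargs.get('near', '')
        self.get_only = bool(kwargs.get('get_only', False))


# Hypothetical stand-in for the demo_config dict used by the tests
demo = {'near': 'Berlin', 'block': 'pinterest.com,help.pinterest.com'}
config = DemoConfig(**demo)

# Assumption: the comma-separated value is a list of sites to block
blocked_sites = [site for site in config.block.split(',') if site]
print(config.near, blocked_sites, config.get_only)
```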