Spoon is a library for building a distributed proxy pool for each site you assign.
It runs only on Python 3.
Install it with `pip install spoonproxy`,
or clone the repo and add it to your `PYTHONPATH`.
Make sure Redis is running before you start. The default configuration is host `localhost`, port `6379`. You can also change the Redis connection settings.
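Before starting Spoon, it can help to confirm that something is listening on the Redis port. The sketch below is a plain TCP check using only the standard library; the function name `tcp_port_open` is illustrative and not part of Spoon, and a successful connection shows only that the port is open, not that the service behind it is Redis.

```python
import socket


def tcp_port_open(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Defaults match the Redis defaults mentioned above.
    print("Port reachable:", tcp_port_open())
```

If you run Redis on another host or port, pass those values instead of the defaults.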
As in `example.py` in `spoon_server/example`, you can assign many different proxy providers.
```python
from spoon_server.proxy.fetcher import Fetcher
from spoon_server.main.proxy_pipe import ProxyPipe
from spoon_server.proxy.kuai_provider import KuaiProvider
from spoon_server.proxy.xici_provider import XiciProvider
from spoon_server.database.redis_config import RedisConfig
from spoon_server.main.checker import CheckerBaidu


def main_run():
    redis = RedisConfig("127.0.0.1", 21009)
    p1 = ProxyPipe(url_prefix="https://www.baidu.com",
                   fetcher=Fetcher(use_default=False),
                   database=redis,
                   checker=CheckerBaidu()).set_fetcher([KuaiProvider()]).add_fetcher([XiciProvider()])
    p1.start()


if __name__ == '__main__':
    main_run()
```
With a different checker for each site, you can validate the fetched result more precisely.
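```python
import re

# Checker is assumed to be defined alongside CheckerBaidu in spoon_server.main.checker.
from spoon_server.main.checker import Checker


class CheckerBaidu(Checker):
    def checker_func(self, html=None):
        # Decode raw bytes so the regex can match against text.
        if isinstance(html, bytes):
            html = html.decode('utf-8')
        # The proxy passes only if the Baidu title text is present.
        if re.search(r".*百度一下,你就知道.*", html):
            return True
        else:
            return False
```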
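The checker's decode-then-search logic can be tried on its own, without Spoon installed. The sketch below is a standalone function, not Spoon's API: the name `page_matches` and the `pattern` parameter are illustrative, and the `None` guard and `errors="replace"` are additions that the `CheckerBaidu` example does not have.

```python
import re


def page_matches(html, pattern):
    """Return True if the fetched page body contains the expected pattern."""
    if html is None:
        # Nothing was fetched, so the proxy cannot be considered valid.
        return False
    if isinstance(html, bytes):
        # errors="replace" keeps a badly encoded page from raising.
        html = html.decode("utf-8", errors="replace")
    return re.search(pattern, html) is not None


if __name__ == "__main__":
    body = "<title>Example Domain</title>".encode("utf-8")
    print(page_matches(body, r"Example Domain"))
```

Picking a pattern that appears only on the real target page helps reject proxies that return a block page or an injected advertisement with a 200 status.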
As the code in `spoon_server/example/example_multi.py` shows, you can use multiprocessing to run several queues that fetch and validate proxies in parallel.
You can also assign different providers to different URLs.
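The one-process-per-URL idea can be sketched with the standard `multiprocessing` module. This is not the content of `example_multi.py`: `PIPE_SPECS`, `describe_pipe`, `run_pipe`, and `run_all` are hypothetical names, the provider names are plain strings borrowed from this README's provider list, and the worker is a stand-in that returns immediately where a real worker would build a `ProxyPipe` and call `start()`.

```python
import multiprocessing

# Hypothetical mapping from target URL to the providers assigned to it.
PIPE_SPECS = {
    "https://www.baidu.com": ["KuaiProvider", "XiciProvider"],
    "https://www.google.com": ["UsProvider", "PlpProvider"],
}


def describe_pipe(url, provider_names):
    """Stand-in for building one pipe; returns a label instead of fetching."""
    return "{} <- {}".format(url, ", ".join(provider_names))


def run_pipe(url, provider_names, results):
    # In real use, this is where a ProxyPipe would be built and started.
    results.put(describe_pipe(url, provider_names))


def run_all(specs):
    """Start one process per target URL and collect one result from each."""
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=run_pipe, args=(url, names, results))
        for url, names in specs.items()
    ]
    for worker in workers:
        worker.start()
    # Drain the queue before joining so no worker blocks on a full pipe.
    collected = [results.get() for _ in workers]
    for worker in workers:
        worker.join()
    return sorted(collected)


if __name__ == "__main__":
    for line in run_all(PIPE_SPECS):
        print(line)
```

Because each URL gets its own process, a slow or blocked provider for one site does not hold up fetching and validation for the others.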
The default proxy providers are listed below. You can also write your own providers.
name | description |
---|---|
WebProvider | Get proxy from http api |
FileProvider | Get proxy from file |
GouProvider | http://www.goubanjia.com |
KuaiProvider | http://www.kuaidaili.com |
SixProvider | http://m.66ip.cn |
UsProvider | https://www.us-proxy.org |
WuyouProvider | http://www.data5u.com |
XiciProvider | http://www.xicidaili.com |
IP181Provider | http://www.ip181.com |
XunProvider | http://www.xdaili.cn |
PlpProvider | https://list.proxylistplus.com |
IP3366Provider | http://www.ip3366.net |
BusyProvider | https://proxy.coderbusy.com |
NianProvider | http://www.nianshao.me |
PdbProvider | http://proxydb.net |
ZdayeProvider | http://ip.zdaye.com |
YaoProvider | http://www.httpsdaili.com/ |
FeilongProvider | http://www.feilongip.com/ |
IP31Provider | https://31f.cn/http-proxy/ |
XiaohexiaProvider | http://www.xiaohexia.cn/ |
CoolProvider | https://www.cool-proxy.net/ |
NNtimeProvider | http://nntime.com/ |
ListendeProvider | https://www.proxy-listen.de/ |
IhuanProvider | https://ip.ihuan.me/ |
IphaiProvider | http://www.iphai.com/ |
MimvpProvider(@NeedCaptcha) | https://proxy.mimvp.com/ |
GPProvider(@NeedProxy if you're in China) | http://www.gatherproxy.com |
FPLProvider(@NeedProxy if you're in China) | https://free-proxy-list.net |
SSLProvider(@NeedProxy if you're in China) | https://www.sslproxies.org |
NordProvider(@NeedProxy if you're in China) | https://nordvpn.com |
PremProvider(@NeedProxy if you're in China) | https://premproxy.com |
YouProvider(@Deprecated) | http://www.youdaili.net |
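Writing your own provider mostly comes down to pulling `ip:port` pairs out of whatever a source returns. Spoon's provider base class is not shown in this README, so the sketch below covers only that parsing step as a standalone function; `PROXY_RE` and `extract_proxies` are hypothetical names, and wiring the result into a provider class is left to the actual base class in the repo.

```python
import re

# Matches "ip:port" pairs such as 1.2.3.4:8080; value ranges are checked below.
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")


def extract_proxies(text):
    """Return unique, well-formed "ip:port" strings in order of first appearance."""
    seen = []
    for ip, port in PROXY_RE.findall(text):
        octets_ok = all(0 <= int(part) <= 255 for part in ip.split("."))
        port_ok = 0 < int(port) <= 65535
        candidate = "{}:{}".format(ip, port)
        # Skip malformed addresses and duplicates.
        if octets_ok and port_ok and candidate not in seen:
            seen.append(candidate)
    return seen


if __name__ == "__main__":
    sample = "1.2.3.4:8080 junk 999.1.1.1:80 1.2.3.4:8080 10.0.0.1:3128"
    print(extract_proxies(sample))
```

The same function works on an HTTP API response or on the contents of a file, which are the two input kinds that `WebProvider` and `FileProvider` are described as handling.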
A simple Django web API demo is included. You can use any web server and write your own API.
To start the demo, run `python manager.py runserver **.**.**.**:*****`, giving your own host and port.
The demo API includes:
name | description |
---|---|
http://127.0.0.1:21010/api/v1/get_keys | Get all keys from Redis |
http://127.0.0.1:21010/api/v1/fetchone_from?target=www.google.com&filter=65 | Get one useful proxy. `target`: the specific URL; `filter`: number of successful revalidations |
http://127.0.0.1:21010/api/v1/fetchall_from?target=www.google.com&filter=65 | Get all useful proxies. |
http://127.0.0.1:21010/api/v1/fetch_hundred_recent?target=www.baidu.com&filter=5 | Get recently joined full-scored proxies. `target`: the specific URL; `filter`: time in seconds |
http://127.0.0.1:21010/api/v1/fetch_stale?num=100 | Get recent proxies that have not been checked. `num`: the number of proxies you want |
http://127.0.0.1:21010/api/v1/fetch_recent?target=www.baidu.com | Get recent proxies that were successfully validated. `target`: the specific URL |
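A client for these endpoints can be sketched with the standard library alone. The base URL below is taken from the table; `build_url` and `call_api` are hypothetical helper names, and because the response format is not documented here, `call_api` returns the raw body as text without parsing it.

```python
import urllib.parse
import urllib.request

# Host and port as used in the endpoint table; adjust to your runserver address.
BASE_URL = "http://127.0.0.1:21010/api/v1"


def build_url(endpoint, **params):
    """Join an endpoint name and its query parameters into a full URL."""
    query = urllib.parse.urlencode(params)
    url = "{}/{}".format(BASE_URL, endpoint)
    return "{}?{}".format(url, query) if query else url


def call_api(endpoint, timeout=5, **params):
    """Send a GET request and return the raw response body as text."""
    url = build_url(endpoint, **params)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds the URL; call_api needs the demo server to be running.
    print(build_url("fetchone_from", target="www.google.com", filter=65))
```

With the demo server running, `call_api("fetchone_from", target="www.google.com", filter=65)` would issue the second request in the table.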