0 tweets #296

Open
etemiz opened this issue Jun 2, 2020 · 104 comments

Comments
@etemiz

etemiz commented Jun 2, 2020

INFO: Retrying... (Attempts left: 1)
INFO: Scraping tweets from https://twitter.com/search?f=tweets&vertical=default&q=bitcoin&l=
INFO: Using proxy 181.211.38.62:47911
INFO: Got 0 tweets for bitcoin.

Parsing may be an issue.
Both twitterscraper (0.9.3) and (1.4.0) are failing.

@mickyscreggs

I have also been facing this issue. Queries that were returning tweets yesterday are not returning tweets today.

@ravishankarramakrishnan

I'm also facing the same issue! Yesterday it was parsing well, but today it returns 0 tweets.

@xtr32

xtr32 commented Jun 2, 2020

Same here, 0 tweets.

@yiw0104

yiw0104 commented Jun 2, 2020

Same here, 0 tweets.

@tengfei7890

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

@panoptikum

+1. That's bad.

@hakanyusufoglu

INFO: Retrying... (Attempts left: 1)
INFO: Scraping tweets from https://twitter.com/search?f=tweets&vertical=default&q=bitcoin&l=
INFO: Using proxy 181.211.38.62:47911
INFO: Got 0 tweets for bitcoin.

Parsing may be an issue.
Both twitterscraper (0.9.3) and (1.4.0) are failing.

Sir, I had also developed a project, and the main part of my project depends on this. How can we fix this issue?

@hakanyusufoglu

I need help

@toscanopedro

Same here... does anyone have a clue what's going on?

@hakanyusufoglu

Not yet. I used it for a university project. What will I do during the presentation?

@rubengoeminne

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.
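
For anyone unsure what the edit looks like in place, here is a minimal sketch of the relevant part of twitterscraper/query.py after the change (the two user-agent strings are just entries from the module's existing HEADERS_LIST, and the comment about why the header helps is an assumption, not something Twitter documents):

import random

HEADERS_LIST = [
    'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201',
    'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
]

# 'X-Requested-With': 'XMLHttpRequest' marks the request as an AJAX call,
# which appears to make Twitter return the legacy search results instead of
# the "JavaScript is disabled" page.
HEADER = {'User-Agent': random.choice(HEADERS_LIST),
          'X-Requested-With': 'XMLHttpRequest'}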

@locchipinti

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

It works for me!
Thanks @rubengoeminne, genius!

@hakanyusufoglu

Thanks

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

Thank you very much. It works.

@xtr32

xtr32 commented Jun 2, 2020

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

It works... thanks!

@toscanopedro

Hi guys, I'm kind of a noob and do not have a HEADER in my code... can someone tell me how I can implement it?

@GivenToFlyCoder

GivenToFlyCoder commented Jun 2, 2020

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

Thanks a lot, my friend! This worked for me! You are a genius!
Let me buy you a beer, @rubengoeminne! Paulaner German beer? Or Negra Modelo Mexican beer?

@GivenToFlyCoder

GivenToFlyCoder commented Jun 2, 2020

Hi guys, I'm kind of a noob and do not have a HEADER in my code... can someone tell me how I can implement it?

@toscanopedro The header dictionary HEADER = {'User-Agent': random.choice(HEADERS_LIST)} is not in your own code; instead, it is a line inside the file query.py.

Just open the file as text and change the lines as @rubengoeminne said. You can search for the file on your PC; it may be found at a path like: C:\ProgramData\Anaconda3\Lib\site-packages\twitterscraper
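
If the file is not at that exact path, a quick way to find the copy that your Python actually imports (assuming twitterscraper is installed in the active environment) is:

import os
import twitterscraper

# Print the full path of the query.py that gets imported.
print(os.path.join(os.path.dirname(twitterscraper.__file__), "query.py"))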

@toscanopedro

Hi guys, I'm kind of a noob and do not have a HEADER in my code... can someone tell me how I can implement it?

@toscanopedro The header dictionary HEADER = {'User-Agent': random.choice(HEADERS_LIST)} is not in your own code; instead, it is a line inside the file query.py.

Just open the file as text and change the lines as @rubengoeminne said. You can search for the file on your PC; it may be found at a path like: C:\ProgramData\Anaconda3\Lib\site-packages\twitterscraper

THX MAN!!!!

@yiw0104

yiw0104 commented Jun 2, 2020

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

The modification no longer works for query_user_info. I changed the header dictionary in query.py and still got no information on my list of users.

@AlexBietrix

I faced the same issue. Retrieving tweets seems to work now. However, I get this error when I try to get user info using query_user_info: local variable 'user_info' referenced before assignment

@mardiaz353

Yeah, it is not working for me. I changed that line in query.py and the same issue occurs.

@wal-iston

Hi. I have implemented the modification suggested by pumpkinw and the algorithm made progress. It was not scraping anything before the modification. After the modification it is scraping, but not everything: it seems to be scraping only the last few hours. For example, when I issued:

twitterscraper fascismo --lang pt -p 1 -bd 2020-05-31 -ed 2020-06-01 -o file_name.json

I received tweets corresponding only to hours 20 through 23 of 2020-05-31:

In [12]: df.groupby(df['timestamp'].dt.hour).count()


Out[12]:
has_media hashtags img_urls is_replied ... tweet_url user_id username video_url
timestamp ...
20 956 956 956 956 ... 956 956 956 956
21 2384 2384 2384 2384 ... 2384 2384 2384 2384
22 2100 2100 2100 2100 ... 2100 2100 2100 2100
23 2147 2147 2147 2147 ... 2147 2147 2147 2147

[4 rows x 21 columns]


Does somebody know what is going on?

atharva-lipare referenced this issue in atharva-lipare/twitterscraper Jun 3, 2020
@Frickson

Frickson commented Jun 3, 2020

I already changed the header from HEADER = {'User-Agent': random.choice(HEADERS_LIST)} to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'} but I still have the same issue: 'NoneType' object has no attribute 'user'.

@javad94

javad94 commented Jun 3, 2020

I don't like modifying a module's files directly, so instead, based on @rubengoeminne's great answer, you can fix this issue by adding these lines of code to the top of your Python script:

import twitterscraper
import random
HEADERS_LIST = [
    'Mozilla/5.0 (Windows; U; Windows NT 6.1; x64; fr; rv:1.9.2.13) Gecko/20101203 Firebird/3.6.13',
    'Mozilla/5.0 (compatible, MSIE 11, Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201',
    'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
    'Mozilla/5.0 (Windows NT 5.2; RW; rv:7.0a1) Gecko/20091211 SeaMonkey/9.23a1pre'
]
twitterscraper.query.HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}

And do your stuff normally:

from twitterscraper import query_tweets
query_tweets("github", 100)

@Marlowe97

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

This solution seems not to work for me now.

@javad94

javad94 commented Jun 4, 2020

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

This solution seems not to work for me now.

Yeah, unfortunately they closed it down.

@toscanopedro

Guys, are you sure you replaced the correct file? This is still working for me.

@zhicheng0501

zhicheng0501 commented Jun 24, 2020

@zhicheng0501 perhaps you have a conflicting dependency version? Try

python3 -m venv .venv
source .venv/bin/activate
python3 setup.py install

This is what it shows (see the screenshot below).
I tried it, but it still failed.
May I ask how to use the selenium method?
I followed the steps in #302 and installed selenium, geckodriver and firefox.
What should I do next?
Would you please show me a sample of the code you use to run selenium in this case?
(screenshot attached)

@zhicheng0501

zhicheng0501 commented Jun 24, 2020

@zhicheng0501 perhaps you have a conflicting dependency version? Try

python3 -m venv .venv
source .venv/bin/activate
python3 setup.py install

Thanks, man.
That really works.
I now know what problem I ran into these days.
Twitter blocked the keyword nCoV. That's why I kept getting the retrying feedback.
At least on my end, that is what I saw.
I do not know why, but that is what happened and it confused me these days.
I am writing to Twitter to ask why they block it.
Once I get a clue, I will let the others know, which will help improve your project.

More than that, would you please tell me how to run the javascript method with selenium?
I created a new environment using the method you mentioned above and tried the following command, but it failed:
twitterscraper trump --javascript -bd 2020-04-01 -ed 2020-04-02 -o trump.json

@lapp0
Collaborator

lapp0 commented Jun 24, 2020

@Frickson please share your command and output

@jassena

jassena commented Jun 24, 2020

@jassena what is the command you're running and full output? (please text, no screenshot)

Hey, I just ran the code that you mention above... actually I also got 0 tweets... then I changed the query.py header as you said, and at that point I got this error:
(screenshot attached)
This is my code.

@lapp0
Collaborator

lapp0 commented Jun 24, 2020

@jassena What is "this error"? Please share your code and the full error here: http://gist.github.com/

@Toby-masuku

Tried everything, still getting 0

@lapp0
Collaborator

lapp0 commented Jun 26, 2020

@Toby-masuku are you using origin/master? The latest version on pypi doesn't work.

@Toby-masuku

@lapp0 yes, I'm using origin/master.

@zhicheng0501

Tried everything, still getting 0

What is the keyword of your query? I am searching for Trump and it works fine, but it fails when I search for nCoV and Wuhancoronavirus.

@Toby-masuku

@zhicheng0501 The keyword is "climate change".

@lapp0
Collaborator

lapp0 commented Jun 26, 2020

What is your exact command and what is your output? Please paste it.

@zhicheng0501

zhicheng0501 commented Jun 26, 2020

@Toby-masuku
I used pip install twitterscraper and ran this command searching for "climate change".
It works just fine.
Just to let you know, it is okay on my end.

Last login: Fri Feb 28 18:20:18 on ttys000
bogon:~ zhaoningning$ twitterscraper "climate change" --lang de --limit 10000000000000000 -bd 2020-04-27 -ed 2020-04-28 -o wuhan04270428.json
INFO: {'User-Agent': 'Mozilla/5.0 (Windows NT 5.2; RW; rv:7.0a1) Gecko/20091211 SeaMonkey/9.23a1pre', 'X-Requested-With': 'XMLHttpRequest'}
INFO: queries: ['climate change since:2020-04-27 until:2020-04-28']
INFO: Querying climate change since:2020-04-27 until:2020-04-28
INFO: Scraping tweets from https://twitter.com/search?f=tweets&vertical=default&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 103.102.15.90:10714
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=TWEET-1254687772931129344-1254876447442964480&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 113.11.156.42:31935
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=thGAVUV0VFVBaAgLv9ydfF6SIWgMC87d7em-oiEjUAFQAlAFUAFQAA&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 118.174.196.112:36314
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=thGAVUV0VFVBaCwL3t67Kf6SIWgMC87d7em-oiEjUAFQAlAFUAFQAA&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 41.63.170.142:8080
INFO: Twitter returned : 'has_more_items'
INFO: Got 38 tweets for climate%20change%20since%3A2020-04-27%20until%3A2020-04-28.
INFO: Got 38 tweets (38 new).

@chanhee-kang

@zhicheng0501
Hi, I tried "covid19" as the search query but it failed... do you know the reason?

@zhicheng0501

@zhicheng0501
Hi, I tried "covid19" as the search query but it failed... do you know the reason?

@chanhee-kang
It seems that searching for covid19 works fine on my end at this moment. Could you please try ncov with a search date ranging from 5-31 to 6-1 and tell me if it works on your end? It fails here on my end.
When I first searched ncov it worked well, but it failed after I had parsed a lot of data.
I guess it might trigger some mechanism of Twitter.

This is the command and the result I am getting for covid19:
bogon:ncov zhaoningning$ twitterscraper COVID-19 --lang de --limit 100000000 -bd 2020-05-31 -ed 2020-06-01 -o wuhan05310601.json
INFO: {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201', 'X-Requested-With': 'XMLHttpRequest'}
INFO: queries: ['COVID-19 since:2020-05-31 until:2020-06-01']
INFO: Querying COVID-19 since:2020-05-31 until:2020-06-01
INFO: Scraping tweets from https://twitter.com/search?f=tweets&vertical=default&q=COVID-19%20since%3A2020-05-31%20until%3A2020-06-01&l=de
INFO: Using proxy 118.175.93.148:55169
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=TWEET-1267229970235043842-1267243855260205056&q=COVID-19%20since%3A2020-05-31%20until%3A2020-06-01&l=de
INFO: Using proxy 201.55.160.133:3128
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=thGAVUV0VFVBaEwLWJtLyNliMWgICmyc_kk5YjEjUAFQAlAFUAFQAA&q=COVID-19%20since%3A2020-05-31%20until%3A2020-06-01&l=de

@Toby-masuku

@Toby-masuku
I used pip install twitterscraper and ran this command searching for "climate change".
It works just fine.
Just to let you know, it is okay on my end.

Last login: Fri Feb 28 18:20:18 on ttys000
bogon:~ zhaoningning$ twitterscraper "climate change" --lang de --limit 10000000000000000 -bd 2020-04-27 -ed 2020-04-28 -o wuhan04270428.json
INFO: {'User-Agent': 'Mozilla/5.0 (Windows NT 5.2; RW; rv:7.0a1) Gecko/20091211 SeaMonkey/9.23a1pre', 'X-Requested-With': 'XMLHttpRequest'}
INFO: queries: ['climate change since:2020-04-27 until:2020-04-28']
INFO: Querying climate change since:2020-04-27 until:2020-04-28
INFO: Scraping tweets from https://twitter.com/search?f=tweets&vertical=default&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 103.102.15.90:10714
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=TWEET-1254687772931129344-1254876447442964480&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 113.11.156.42:31935
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=thGAVUV0VFVBaAgLv9ydfF6SIWgMC87d7em-oiEjUAFQAlAFUAFQAA&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 118.174.196.112:36314
INFO: Scraping tweets from https://twitter.com/i/search/timeline?f=tweets&vertical=default&include_available_features=1&include_entities=1&reset_error_state=false&src=typd&max_position=thGAVUV0VFVBaCwL3t67Kf6SIWgMC87d7em-oiEjUAFQAlAFUAFQAA&q=climate%20change%20since%3A2020-04-27%20until%3A2020-04-28&l=de
INFO: Using proxy 41.63.170.142:8080
INFO: Twitter returned : 'has_more_items'
INFO: Got 38 tweets for climate%20change%20since%3A2020-04-27%20until%3A2020-04-28.
INFO: Got 38 tweets (38 new).

Can you please share your code? Maybe I made a mistake.

@lapp0
Collaborator

lapp0 commented Jun 26, 2020

I used pip install twitterscraper and ran this command searching for "climate change".

@zhicheng0501 the pip version worked for you? It doesn't have the headers fix in it. How did you get it to work?

@lapp0
Collaborator

lapp0 commented Jun 26, 2020

@Toby-masuku looks like you got 38 results. What is the problem?

@Toby-masuku

@lapp0 that's not me, I got 0.

(screenshot attached)

@Frickson

@Frickson please share your command and output

Hi lapp0, I ran get_twitter_user_data.py from origin/master and just changed the list of names.

Here is my code:

start = time.time()
users = ['Ms_MeiChing']

pool = Pool(8)
for user in pool.map(get_user_info, users):
    twitter_user_info.append(user)
Traceback (most recent call last):
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\twitterscraper-1.4.0-py3.7.egg\twitterscraper\query.py", line 323, in query_user_info       
    user_info = query_user_page(INIT_URL_USER.format(u=user))
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\twitterscraper-1.4.0-py3.7.egg\twitterscraper\query.py", line 292, in query_user_page       
    user_info = User.from_html(html)
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\twitterscraper-1.4.0-py3.7.egg\twitterscraper\user.py", line 101, in from_html
    return self.from_soup(user_profile_header, user_profile_canopy)
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\twitterscraper-1.4.0-py3.7.egg\twitterscraper\user.py", line 57, in from_soup
    tweets = tag_prof_nav.find('span', {'class':"ProfileNav-value"})['data-count']
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\site-packages\bs4\element.py", line 1321, in __getitem__
    return self.attrs[key]
KeyError: 'data-count'
INFO: Got user information from username Ms_MeiChing
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "c:\Users\Asus\Desktop\twitterscraper\examples\get_twitter_user_data.py", line 20, in get_user_info
    twitter_user_data["user"] = user_info.user
AttributeError: 'NoneType' object has no attribute 'user'
"""
user_data.py", line 53, in <module>
    main()
  File "c:/Users/Asus/Desktop/twitterscraper/examples/get_twitter_user_data.py", line 40, in main
    for user in pool.map(get_user_info,users):
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
AttributeError: 'NoneType' object has no attribute 'user'
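
Until the parsing itself is fixed, here is a minimal defensive sketch of the driver loop (adapted loosely from examples/get_twitter_user_data.py; the None check is the only change, and it merely skips profiles that failed to parse rather than recovering the missing data):

from multiprocessing import Pool
from twitterscraper.query import query_user_info

def get_user_info(twitter_user):
    # query_user_info returns None when the profile page cannot be parsed,
    # e.g. when Twitter serves the "JavaScript is disabled" page.
    return query_user_info(twitter_user)

if __name__ == '__main__':
    users = ['Ms_MeiChing']
    twitter_user_info = []
    with Pool(8) as pool:
        for user in pool.map(get_user_info, users):
            if user is not None:  # skip users whose profile could not be parsed
                twitter_user_info.append(user)
    print('Got info for %d of %d users' % (len(twitter_user_info), len(users)))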

@erb13020

@javad94

What exactly does this header list do?

HEADERS_LIST = [
'Mozilla/5.0 (Windows; U; Windows NT 6.1; x64; fr; rv:1.9.2.13) Gecko/20101203 Firebird/3.6.13',
'Mozilla/5.0 (compatible, MSIE 11, Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko',
'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201',
'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
'Mozilla/5.0 (Windows NT 5.2; RW; rv:7.0a1) Gecko/20091211 SeaMonkey/9.23a1pre'
]

@zhicheng0501

@javad94

What exactly does this header list do?

HEADERS_LIST = [
'Mozilla/5.0 (Windows; U; Windows NT 6.1; x64; fr; rv:1.9.2.13) Gecko/20101203 Firebird/3.6.13',
'Mozilla/5.0 (compatible, MSIE 11, Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko',
'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201',
'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
'Mozilla/5.0 (Windows NT 5.2; RW; rv:7.0a1) Gecko/20091211 SeaMonkey/9.23a1pre'
]

It creates a random header for each request sent to the server. It lets the server see the requests as coming from different clients, which helps avoid being blocked by the server.
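
In other words, it just rotates the User-Agent per request. A minimal illustration (example.com stands in for the real search URL):

import random
import requests

HEADERS_LIST = [
    'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201',
    'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
]

# Each request picks a random User-Agent, so successive requests do not all
# present the identical browser signature to the server.
headers = {'User-Agent': random.choice(HEADERS_LIST),
           'X-Requested-With': 'XMLHttpRequest'}
response = requests.get('https://example.com', headers=headers)
print(response.status_code, headers['User-Agent'])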

@MaximAbdulatif

Just did that, same error 😑

Seems Twitter has restricted the connection so that all requests return a page with
"We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"

Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.

@Altimis

Altimis commented Dec 18, 2020

All Twitter scrapers don't seem to work anymore. I tried using selenium to simply scrape the maximum number of tweets between two chosen dates for given queries. Check my work at: Scweet. Let me know if you need any clarification.
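
For anyone curious, the general selenium pattern such a scraper follows looks roughly like this (a sketch only, not Scweet's actual code; the search URL operators are standard Twitter search syntax, but the article[data-testid="tweet"] selector is an assumption about Twitter's web markup and breaks whenever the markup changes):

import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # requires geckodriver on PATH
driver.get('https://twitter.com/search?q=bitcoin%20since%3A2020-05-31%20until%3A2020-06-01&f=live')
time.sleep(5)  # wait for the JavaScript timeline to render

tweets = set()
for _ in range(10):  # scroll a few times to load more results
    cards = driver.find_elements(By.CSS_SELECTOR, 'article[data-testid="tweet"]')
    tweets.update(card.text for card in cards)
    driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')
    time.sleep(2)

driver.quit()
print('collected', len(tweets), 'tweets')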

@SulimanLab

All Twitter scrapers don't seem to work anymore. I tried using selenium to simply scrape the maximum number of tweets between two chosen dates for given queries. Check my work at: Scweet. Let me know if you need any clarification.

That is bad

@Altimis

Altimis commented Dec 18, 2020

All Twitter scrapers don't seem to work anymore. I tried using selenium to simply scrape the maximum number of tweets between two chosen dates for given queries. Check my work at: Scweet. Let me know if you need any clarification.

That is bad

Yeah... but I will do my best to add more features to the resulting CSV. So far I have been able to scrape all the important information about tweets, like username, handle, tweet text, emojis, number of likes... I'll try to extract more useful information like image links...

@scherbakovdmitri

All Twitter scrapers don't seem to work anymore. I tried using selenium to simply scrape the maximum number of tweets between two chosen dates for given queries. Check my work at: Scweet. Let me know if you need any clarification.

snscrape still works
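
For reference, the snscrape command line can be used roughly like this (a sketch; the flags shown are assumed, so check snscrape's own documentation):

snscrape --jsonl --max-results 100 twitter-search "bitcoin since:2020-05-31 until:2020-06-01" > bitcoin.jsonl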

meticulousfan added a commit to meticulousfan/scraping-site that referenced this issue Aug 19, 2022
There was a bug in query.py taspinar/twitterscraper#296 that caused query_tweets() to fetch 0 tweets. The solution was in one of the answers in the thread.