
Not able to properly find files to download #24

Open
blues-clues-security opened this issue Jun 3, 2024 · 0 comments

Comments

@blues-clues-security

In search.py, the results_handler function does not correctly extract links from the Google search results (I haven't tested Bing yet). I've been able to partially fix the issue by changing it to:

# Imports needed at the top of search.py
import logging
import urllib.parse

def results_handler(self, link):
    href = link.get('href')
    if not href:  # Skip anchors without an href (str(None) would become 'None')
        return
    url = str(href)
    parsed_url = urllib.parse.urlparse(url)
    query_params = urllib.parse.parse_qs(parsed_url.query)
    actual_url = query_params.get('url', [None])[0]  # Extract 'url' from query parameters

    if actual_url and self.regex.match(actual_url):
        self.results.append(actual_url)
        logging.debug('Added URL: {}'.format(actual_url))
    elif self.regex.match(url):  # Fallback in case the URL is not in query params
        self.results.append(url)
        logging.debug('Added URL: {}'.format(url))

This finds results more accurately (I tested by running the dorks manually). I've encountered other issues, such as PDFs not being downloaded because the requests get flagged as coming from a bot, but that's a separate issue. Would you prefer a pull request with the changes?
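For reference, here is a minimal standalone sketch of just the extraction step, so it can be tried outside of search.py. The helper name extract_target_url and the sample hrefs are hypothetical and only for illustration; it assumes the result link carries the real target in a 'url' query parameter, as in the change above.

```python
import urllib.parse


def extract_target_url(href):
    """Return the decoded 'url' query parameter of href, or None if absent.

    Hypothetical helper: mirrors the extraction step of the proposed
    results_handler change, without the regex check or result list.
    """
    parsed = urllib.parse.urlparse(href)
    params = urllib.parse.parse_qs(parsed.query)  # values are percent-decoded lists
    return params.get('url', [None])[0]


# Hypothetical wrapped link, not captured from a real results page
wrapped = '/url?sa=t&url=https%3A%2F%2Fexample.com%2Freport.pdf&usg=abc'
print(extract_target_url(wrapped))

# A direct link has no 'url' parameter, so the caller should fall back to the href itself
print(extract_target_url('https://example.com/report.pdf'))
```

When the helper returns None, the fallback branch in results_handler applies the regex to the original href instead.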
