File not found on crawl method #248

Closed
eliasdabbas opened this issue Nov 24, 2022 · 4 comments

@eliasdabbas
Owner

Hi all,

I'm following the documentation with this line of code:

adv.crawl('https://example.com', 'my_output_file.jl', follow_links=True)

But it returns this error:

FileNotFoundError: [WinError 2] The system cannot find the file specified

Even though my directory looks like this:

- SEO.py
- my_output_file.jl

Here is the complete trace:

Traceback (most recent call last):
  File "c:/Users/Henrique/Desktop/SEO/SEO.py", line 6, in <module>
    adv.crawl('https://example.com', 'my_output_file.jl', follow_links=True)
  File "C:\Users\Henrique\AppData\Roaming\Python\Python38\site-packages\advertools\spider.py", line 971, in crawl
    subprocess.run(command)
  File "C:\Python38\lib\subprocess.py", line 489, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Python38\lib\subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Python38\lib\subprocess.py", line 1307, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

As you can see, it doesn't specify which file was not found, but I assume it is the output file.

Any help is greatly appreciated!

Originally posted by @henriquearaujo-98 in #247

@eliasdabbas changed the title from "Hi all," to "File not found on crawl method" on Nov 24, 2022
@eliasdabbas
Owner Author

Thanks for reporting @henriquearaujo-98 (I moved it here as this is an issue, and discussions are more for new features, best practices, etc.)

It's not clear why you are getting this issue.

Can you please share the full code in the file c:/Users/Henrique/Desktop/SEO/SEO.py? Maybe something else is causing this issue.

Also, can you please share the versions of packages you are using (advertools, pandas, and scrapy) and Python version?
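
One way to collect those versions in a single step is sketched below. It uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_versions` is hypothetical and not part of advertools.

```python
import platform
from importlib import metadata


def installed_versions(packages=("advertools", "pandas", "scrapy")):
    """Return a dict of package name -> installed version (or 'not installed')."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not visible to this interpreter.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

Running it with the same interpreter that runs SEO.py matters, since a different interpreter may see a different set of packages.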

@henriquearaujo-98

henriquearaujo-98 commented Nov 24, 2022

Hi, thank you for your quick reply.

My code looks like this:

import advertools as adv
import pandas as pd

adv.crawl('https://example.com', 'my_output_file.jl', follow_links=True)

I'm running it on Python 3.8.5 with advertools version 0.13.2, installed through the pip3 package manager.

@eliasdabbas
Owner Author

I can't see anything wrong with the code. I tested it on Linux and Mac, and it works fine. The error might be referring to the spider file that cannot be found.
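
The trace ends in `subprocess.run(command)` and then `_winapi.CreateProcess`, which suggests the missing file may be the program being launched rather than the output file. Here is a rough sketch for checking that. It assumes, without confirmation in this thread, that the command starts with the `scrapy` executable. `find_executable` and `try_launch` are hypothetical helper names, not advertools functions.

```python
import shutil
import subprocess


def find_executable(name):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def try_launch(command):
    """Run `command` (a list of strings); report a missing executable instead of raising."""
    try:
        subprocess.run(command, check=False, capture_output=True)
    except FileNotFoundError:
        # Raised when the first item of the command cannot be found,
        # which on Windows surfaces as [WinError 2].
        return f"executable not found: {command[0]}"
    return "launched"


if __name__ == "__main__":
    # Assumption: 'scrapy' is the program being started by the crawl.
    print("scrapy on PATH:", find_executable("scrapy"))
```

If `find_executable("scrapy")` prints `None`, one possible explanation is that the Scripts directory of the Python installation is not on PATH.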

Can you please try to create a virtual environment, install advertools within the environment and see if it works?

Something like this, but please check the docs if needed:

python -m venv my-env
my-env\Scripts\activate.bat
pip install advertools
python
import advertools as adv
adv.crawl('https://example.com', 'my_output_file.jl', follow_links=True)
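
To confirm that the interpreter really is the one from the new environment before calling `adv.crawl`, a small check like the following could help. It is only a sketch, and `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv (sys.prefix differs from sys.base_prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

If the second line prints `False`, the environment was probably not activated in that terminal session.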

@eliasdabbas
Owner Author

@henriquearaujo-98 Just curious: did you try this, and did it work?

Please let me know.
