Download from routeviews via HTTP additional to FTP #64

Open
ghost opened this issue May 14, 2020 · 13 comments

Comments

@ghost

ghost commented May 14, 2020

Currently, pyasn_util_download.py downloads from routeviews.org via FTP. However, some environments require all traffic to go through an HTTP proxy, so an HTTP download is preferable there, or is the only option.
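
For illustration, here is a minimal sketch of an HTTP download that can be routed through a proxy using only the standard library. The names `make_opener` and `download` and the `proxy_url` parameter are hypothetical and are not taken from pyasn_util_download.py. When no proxy is passed explicitly, `urllib` picks up the usual `http_proxy` / `https_proxy` environment variables on its own.

```python
import shutil
import urllib.request


def make_opener(proxy_url=None):
    """Return a urllib opener (hypothetical helper, not part of pyasn)."""
    if proxy_url is None:
        # The default opener already honours http_proxy / https_proxy
        # from the environment.
        return urllib.request.build_opener()
    # Route both HTTP and HTTPS requests through the given proxy.
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


def download(url, dest_path, proxy_url=None, timeout=60):
    """Stream url to dest_path without holding the whole file in memory."""
    opener = make_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

A call would look like `download(rib_url, "rib.bz2", proxy_url="http://proxy.example:3128")`, where `rib_url` and the proxy address are placeholders.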

@hadiasghari
Owner

@wagner-certat if you have any PRs on this I'd be happy to merge :)

@ghost
Author

ghost commented Sep 4, 2020

In IntelMQ we have now integrated the pyasn database update functionality ourselves, so there is no longer an overlap of interest or potential for synergy. I therefore doubt that I'll provide a PR for this in the near future :/

@hadiasghari
Owner

No problem. I'll close this for now, as I think some might also simply use wget. If there is interest in the future we can revisit this issue. Thanks!

@mansweet
Contributor

I think I have an interest in this feature, since the place I'm trying to download from is firewalled off from FTP. I'll poke around and try to submit a PR, @hadiasghari.

@mansweet
Contributor

mansweet commented Aug 26, 2021

@hadiasghari I think I've found a way to do this.

  • I'm trying to keep to what I take to be the spirit of the project: not pulling in any non-builtin Python libs to download the file.
  • I'll keep it simple and just add functionality to download the latest rib file, if that's alright.
  • I'll add some CLI args to select this option.

Do you have any other requests as to how this feature is implemented?

Lastly, since I'm currently working on this issue, would you kindly mind re-opening the issue? If it's not fixed within a month, I'd say go ahead and close it.

For record keeping, I'm working on this in my forked repo in a feature branch https://github.com/mansweet/pyasn/tree/add-http-download-method
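
As a rough sketch of the approach in the bullets above (standard library only, latest rib file only): fetch the HTML directory listing of a month's RIBS directory and pick the newest rib file from its links. This is not the code in the feature branch. The helper names are hypothetical, and the `rib.YYYYMMDD.HHMM.bz2` file naming is an assumption about the routeviews archive layout.

```python
import re
from html.parser import HTMLParser

# Assumed file naming on the archive: rib.YYYYMMDD.HHMM.bz2
RIB_NAME = re.compile(r"^rib\.\d{8}\.\d{4}\.bz2$")


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a listing page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def latest_rib(listing_html):
    """Return the newest rib file name in the listing, or None if empty."""
    parser = LinkCollector()
    parser.feed(listing_html)
    ribs = [href for href in parser.hrefs if RIB_NAME.match(href)]
    # Fixed-width timestamps sort chronologically as plain strings.
    return max(ribs, default=None)
```

Returning `None` for a listing without rib files lets the caller decide what to do with an empty directory.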

@mansweet
Contributor

mansweet commented Aug 26, 2021

@hadiasghari I've got my changes ready over at: https://github.com/hadiasghari/pyasn/compare/master...mansweet:add-http-download-method?expand=1

I need to do some git cleanup, since changes from a previous PR I submitted to your repo appear to be included in this one. Those might resolve automatically if you approve and merge PR #69. Let me know what you want to do with my previous PR; depending on that, I'll clean up my branch if needed and then submit a PR so we can discuss it publicly there.

Thank you for your time, consideration and maintenance of this project!

@hadiasghari
Owner

Hi @mansweet, thank you for the PR, I'll need a few days to get to this.

@hadiasghari
Owner

@mansweet thank you for the commits. I approved and merged PR #69. Please send a new PR so I can run tests/check and then merge/approve.

Note: regarding the feature list you mention, I agree with everything except this one: "I'll keep it simple and just add functionality to download the latest rib file, if that's alright". I feel it would be unnecessarily complicated if the FTP option can do different dates but the HTTP option only the latest. Would it be too difficult to allow different dates? (The date-parsing logic is already implemented.)

Additionally, will it also support HTTPS downloads?

Thanks :)

@mansweet
Contributor

mansweet commented Aug 31, 2021

@hadiasghari thanks for merging #69 and re-opening this issue.

I agree with your request and think the different options should offer uniform functionality. However, correct me if I'm wrong, but it seems that the FTP path can only download the latest file, while the HTTP path (as of now) can download specific dates (with the --dates-from-file CLI arg). My feature here just adds the ability to download the latest file from the HTTP source, without requiring the user to supply a file of dates in order to use the HTTP path.

It might make sense, then, as a separate piece of work, to simplify the interface overall so that a user can:

  • choose HTTP vs FTP (or perhaps even HTTPS)
  • choose whether to fetch just the latest file or some specific dates, for whichever download method
  • choose the IP version(s)
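
To make that idea concrete, here is a hypothetical `argparse` sketch of such an interface. Only `--dates-from-file` is mentioned in this thread; `--protocol`, `--latest` and `--ip-version` are names invented for illustration and do not describe the script's current options.

```python
import argparse


def build_parser():
    """Build a hypothetical unified CLI for the download script."""
    parser = argparse.ArgumentParser(
        description="Download RIB files from routeviews (illustrative sketch)."
    )
    # Download method, independent of what gets downloaded.
    parser.add_argument(
        "--protocol", choices=["ftp", "http", "https"], default="ftp"
    )
    # Exactly one way of selecting which files to fetch.
    what = parser.add_mutually_exclusive_group(required=True)
    what.add_argument("--latest", action="store_true",
                      help="fetch only the most recent RIB file")
    what.add_argument("--dates-from-file", metavar="FILE",
                      help="fetch the dates listed in FILE")
    # IP version(s), independent of the two choices above.
    parser.add_argument("--ip-version", choices=["4", "6", "46"], default="4")
    return parser
```

With this layout every combination of protocol, selection mode and IP version is expressible, so no download method ends up with fewer features than another.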

Now, as for the https downloads you've requested, I can look into that. Can you provide me with what the source is that I should try to fetch the ribs from? https://archive.routeviews.org does not resolve.

Another inconsistency I see is that the HTTP method only allows IPv4 downloads (while FTP can do any of them). Could you kindly point me to the IPv6 source I should fetch via HTTP?

The PR for the feature as it stands is at #72.

Also, just an interesting find. Today is the 31st of August. It looks like routeviews creates the directory for the next month a little early, without populating it. Just pointing this out in case you find some strange corner cases in the future!
[Attached screenshot: "Screen Shot 2021-08-31 at 3 24 21 PM"]

@mansweet
Contributor

mansweet commented Sep 9, 2021

Hi @hadiasghari, any update on merging this PR?

@ghost
Author

ghost commented Sep 9, 2021

> Also, just an interesting find. Today is the 31st of August. It looks like routeviews creates the directory for the next month a little early, without populating it. Just pointing this out in case you find some strange corner cases in the future!

That hit us in certtools/intelmq#2088 as well.

@hadiasghari
Owner

Hi @mansweet, thanks for the PR. I posted an update to the PR conversation, please see that.

@mansweet @wagner-certat regarding the last-day-of-month edge case (which is probably a change on the Routeviews side), we could go back one day if nothing is found for the current day.
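
A minimal sketch of that fallback, assuming a hypothetical callable `list_ribs(day)` that returns the rib file names available for a given date (an empty list if there are none). Neither the function name nor the `max_days_back` parameter exists in pyasn.

```python
from datetime import date, timedelta


def find_latest_with_fallback(list_ribs, start, max_days_back=1):
    """Return (day, newest_file) for the first day with data, else None.

    list_ribs is a hypothetical callable: date -> list of file names.
    """
    for offset in range(max_days_back + 1):
        day = start - timedelta(days=offset)
        ribs = list_ribs(day)
        if ribs:
            # File names carry fixed-width timestamps, so max() is the newest.
            return day, max(ribs)
    return None
```

Because `timedelta` arithmetic crosses month boundaries, going back one day from the 1st lands in the previous month's directory, which is the case described above.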

@ghost
Author

ghost commented Sep 17, 2021

The monthly dirs are sorted here:

    months = sorted(ftp.nlst(archive_root), reverse=True)  # e.g. 'route-views6/bgpdata/2016.12'

We could also always use the current month, instead of just the newest directory.
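
A small sketch of that alternative: derive the current month's directory name from the date and ignore any listed directory that lies in the future. The helper names are hypothetical, and the sketch assumes the listing contains only month directories in the `<archive_root>/YYYY.MM` form shown in the comment above.

```python
from datetime import date


def current_month_dir(archive_root, today=None):
    """Return e.g. 'route-views6/bgpdata/2016.12' for the given date."""
    today = today or date.today()
    return "%s/%d.%02d" % (archive_root.rstrip("/"), today.year, today.month)


def pick_month_dir(listed_dirs, archive_root, today=None):
    """Pick the newest listed directory that is not later than this month."""
    wanted = current_month_dir(archive_root, today)
    # 'YYYY.MM' is fixed-width, so string comparison matches date order.
    candidates = sorted((d for d in listed_dirs if d <= wanted), reverse=True)
    return candidates[0] if candidates else None
```

On 31 August this skips a prematurely created `2021.09` directory and selects `2021.08`, while still falling back to an older month if the current one is missing from the listing.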
