Downloading from Areena's Audio pages returns Document empty #261
Comments
Thanks for the report and the backtrace! Based on the error message, it looks like yle-dl got an empty HTTP response from the Areena server instead of the web page content it expected. Can you try downloading again just to make sure it wasn't some kind of temporary problem at Areena? I'm asking this because downloading your example program works for me on Linux. Are you able to download TV episodes or do you get a similar error also on videos?
I tried over a period of days and before and after re-installation, same response every time. Video downloads, both individual and series, work just fine. wget produces a complete page on download, only lxml looks like it's playing up. Info that may or may not be helpful is that I'm running yle-dl on Mac OS X 10.13.
Another strange thing is that the backtrace you posted shows a call to I'm really struggling to come up with an explanation why this would happen. The only reason I can think of is that you have an invisible control character somewhere in the URL. For example:
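One way to rule out the invisible-character theory is to scan the URL string before passing it to yle-dl. The following is only a minimal sketch, assuming Python 3 and the standard library; `find_invisible_chars` is a hypothetical helper written for illustration and is not part of yle-dl.

```python
import unicodedata


def find_invisible_chars(url):
    """Return (index, code point, name) for each suspicious character in url.

    Hypothetical helper: flags control characters (category Cc), format
    characters such as zero-width spaces (category Cf), and any whitespace.
    """
    suspicious = []
    for index, char in enumerate(url):
        category = unicodedata.category(char)
        if category in ("Cc", "Cf") or char.isspace():
            # Control characters have no Unicode name, so supply a default.
            name = unicodedata.name(char, "<unnamed>")
            suspicious.append((index, f"U+{ord(char):04X}", name))
    return suspicious


clean = "https://areena.yle.fi/audio/1-50674174"
dirty = clean + "\u200b"  # simulate a zero-width space picked up by copy-paste

print(find_invisible_chars(clean))
print(find_invisible_chars(dirty))
```

An empty list means the URL contains none of the characters this sketch looks for; it does not prove the URL is otherwise valid.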
No extra characters are present there. I've tried both copy-pasting the URL and writing it by hand, since I expected there would be something there. Nothing has worked so far. This is actually odd: now that I look at different requests, the traceback is missing the and has this:
This happens with the new-style URL
Can you try with the latest version from the GitHub master branch? I fixed one issue that could potentially cause this problem.
Thanks, that solves the issue. It still gives me One small thing here still stands: opening the playlist page
Thanks for testing. It seems that I still didn't manage to fix the error properly, since it's still showing the warning and not downloading the full playlist. I'll try to figure out a more correct fix, but it's challenging because I can't test it myself. If any Mac user with Python debugging skills wants to dive into this, help would be appreciated. :)
Just bumped into this too: yle-dl 20211213 says "WARNING: HTML parsing error: Document is empty" but downloads one episode anyway. I can try to take a look :) EDIT: curiously, visiting
This is now fixed thanks to @akx!
I was trying to download https://areena.yle.fi/audio/1-50674174.
Using OS X this returned
I've updated everything, checked that everything's in place, and tested with both Homebrew and pip.
Downloading video works fine, so I would assume this is an issue with lxml handling Areena's new audio pages.
Using single audio episode addresses didn't work either; it returned the same error.