Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix EOFError when calling the Remote WebDriver download_file method #14031

Merged
merged 1 commit into from
Jun 4, 2024

Conversation

millin
Copy link
Contributor

@millin millin commented May 24, 2024

User description

Fix #13957, #13956

Description

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

PR Type

Bug fix


Description

  • Fixed an EOFError in the download_file method of Remote WebDriver by using a temporary directory to handle zip files.
  • Added tempfile import to manage temporary directories.
  • Modified the file writing and extraction logic to ensure proper handling of zip files.

Changes walkthrough 📝

Relevant files
Bug fix
webdriver.py
Fix EOFError in `download_file` method by using temporary directory

py/selenium/webdriver/remote/webdriver.py

  • Added tempfile import.
  • Modified download_file method to use a temporary directory for
    handling zip files.
  • Changed file writing and extraction logic to handle zip files
    correctly.
  • +7/-5     

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    @CLAassistant
    Copy link

    CLAassistant commented May 24, 2024

    CLA assistant check
    All committers have signed the CLA.

    Copy link
    Contributor

    PR Review 🔍

    ⏱️ Estimated effort to review [1-5]

    2, because the changes are localized to a specific method and involve standard Python libraries. The logic is straightforward and the modifications are minimal, making it relatively easy to review.

    🧪 Relevant tests

    No

    ⚡ Possible issues

    Possible Bug: The PR does not handle exceptions that may occur during file operations or zip file extraction. It would be beneficial to add error handling to manage scenarios where file writing or extraction fails.

    🔒 Security concerns

    No

    Copy link
    Contributor

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Possible bug
    Add the missing import for the base64 module

    Add an import statement for the base64 module, which is necessary for decoding the file
    contents.

    py/selenium/webdriver/remote/webdriver.py [23]

     import tempfile
    +import base64
     
    Suggestion importance[1-10]: 10

    Why: The base64 module is used in the new code for decoding file contents but is not imported, which would cause a runtime error.

    10
    Add the missing import for the zipfile module

    Add an import statement for the zipfile module, which is necessary for extracting the zip
    file.

    py/selenium/webdriver/remote/webdriver.py [23]

     import tempfile
    +import zipfile
     
    Suggestion importance[1-10]: 10

    Why: The zipfile module is used in the new code for extracting files but is not imported, which would cause a runtime error.

    10
    Best practice
    Add error handling to the file download and extraction process

    Add error handling for the file download and extraction process to ensure that any issues
    are properly caught and logged.

    py/selenium/webdriver/remote/webdriver.py [1151-1157]

    -with tempfile.TemporaryDirectory() as tmp_dir:
    -    zip_file = os.path.join(tmp_dir, file_name + ".zip")
    -    with open(zip_file, "wb") as file:
    -        file.write(base64.b64decode(contents))
    +try:
    +    with tempfile.TemporaryDirectory() as tmp_dir:
    +        zip_file = os.path.join(tmp_dir, file_name + ".zip")
    +        with open(zip_file, "wb") as file:
    +            file.write(base64.b64decode(contents))
     
    -    with zipfile.ZipFile(zip_file, "r") as zip_ref:
    -        zip_ref.extractall(target_directory)
    +        with zipfile.ZipFile(zip_file, "r") as zip_ref:
    +            zip_ref.extractall(target_directory)
    +except Exception as e:
    +    logging.error(f"Failed to download and extract file {file_name}: {e}")
     
    Suggestion importance[1-10]: 8

    Why: Adding error handling would significantly improve the robustness of the file handling operations, making the code safer and more reliable.

    8
    Possible issue
    Ensure the target directory exists before extracting files

    Ensure that the target_directory exists before attempting to extract files into it.

    py/selenium/webdriver/remote/webdriver.py [1156-1157]

    +os.makedirs(target_directory, exist_ok=True)
     with zipfile.ZipFile(zip_file, "r") as zip_ref:
         zip_ref.extractall(target_directory)
     
    Suggestion importance[1-10]: 7

    Why: Ensuring the target directory exists before extraction is a good practice to avoid runtime errors, improving the code's reliability.

    7

    @pujagani pujagani added the python Pull requests that update Python code label May 28, 2024
    @titusfortner titusfortner added C-py and removed python Pull requests that update Python code labels May 28, 2024
    @titusfortner
    Copy link
    Member

    Can you write a test for this that fails with current code and passes with the new code?

    Copy link

    codecov bot commented Jun 2, 2024

    Codecov Report

    Attention: Patch coverage is 85.71429% with 1 lines in your changes are missing coverage. Please review.

    Project coverage is 58.73%. Comparing base (98c6eb0) to head (6cb017c).
    Report is 11 commits behind head on trunk.

    Files Patch % Lines
    py/selenium/webdriver/remote/webdriver.py 85.71% 1 Missing ⚠️
    Additional details and impacted files
    @@            Coverage Diff             @@
    ##            trunk   #14031      +/-   ##
    ==========================================
    + Coverage   58.72%   58.73%   +0.01%     
    ==========================================
      Files          86       86              
      Lines        5298     5300       +2     
      Branches      227      227              
    ==========================================
    + Hits         3111     3113       +2     
      Misses       1960     1960              
      Partials      227      227              

    ☔ View full report in Codecov by Sentry.
    📢 Have feedback on the report? Share it here.

    @titusfortner
    Copy link
    Member

    Not sure why it is failing this test, rerunning...

    @millin
    Copy link
    Contributor Author

    millin commented Jun 4, 2024

    @titusfortner
    This error occurs only when trying to download a file larger than the shutil.COPY_BUFSIZE.
    The cause is that the name of the zip-file is the same as the name of the destination file.

    I could write a test, but that would require adding a larger file.

    @titusfortner titusfortner merged commit 635f7e2 into SeleniumHQ:trunk Jun 4, 2024
    16 checks passed
    @millin millin deleted the patch-1 branch June 4, 2024 12:40
    sandeepsuryaprasad pushed a commit to sandeepsuryaprasad/selenium that referenced this pull request Oct 29, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    [🐛 Bug]: Example on GitHub Error about Remote WebDriver
    4 participants