Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SpooledTemporaryFile exceptions on file upload #1344

Closed
abathur opened this issue Aug 18, 2018 · 13 comments · Fixed by #1409
Closed

SpooledTemporaryFile exceptions on file upload #1344

abathur opened this issue Aug 18, 2018 · 13 comments · Fixed by #1409
Milestone

Comments

@abathur
Copy link

abathur commented Aug 18, 2018

We ran into an issue earlier this year with SpooledTemporaryFile throwing exceptions on file uploads after jumping up from 0.12.x to 0.14.x. Didn't have time to take a deep look, couldn't find much online, and didn't want to sweat you all if it was our fault or came down to some incompatible dependencies. Just pinned to 0.12.2, wrote a little test that could detect the issue, and moved on.

I saw an SO question for flask-admin in my inbox tonight that sounded similar to what we saw, so I figured it's at least worth bringing up. The SO question isn't very well-phrased, so it may not be of the most use (https://stackoverflow.com/questions/51858248/attributeerror-spooledtemporaryfile-object-has-no-attribute-translate).

In case the test helps:

def test_werkzeug_spooled_temp_file(self):
    """
    When we jumped from Werkzeug 0.12.x to 0.14.x, we started running into an error on any attempt to upload a CSV:

    File "/home/vagrant/site/videohub/admin.py", line 728, in import_content
    for line in csv.DictReader(io.TextIOWrapper(new_file), restval=None):
    File "/usr/local/lib/python3.6/dist-packages/werkzeug/datastructures.py", line 2745, in __getattr__
    return getattr(self.stream, name)
    AttributeError: 'SpooledTemporaryFile' object has no attribute 'readable'

    Until/unless we debug this, we need to stay on Werkzeug versions where it doesn't break.
    """

    with self.client:
        from videohub import admin

        admin.auth_admin = lambda: True

        try:
            what = self.client.post(
                "/admin/importcontent/",
                data={"csv": (io.BytesIO(b"my file contents"), "test.csv")},
                content_type="multipart/form-data",
            )
        except AttributeError as e:
            if (
                str(e)
                == "'SpooledTemporaryFile' object has no attribute 'readable'"
            ):
                self.fail("Werkzeug/SpooledTemporaryFile still exists.")

In the SO thread I inked a set of github issues/PRs, Python issues, and another SO thread that may all be related; re-linking here in case that disappears:

@jdalegonzalez
Copy link

For what it's worth, it doesn't seem to be as simple as the version of Werkzeug. We're using 0.14.1 with python 3.6.6 on Alpine3.6 and everything is working. When we use Werkzeug 0.14.1 on python 3.7.0 with Alpine 3.8.1 then we see the problem.

@abathur
Copy link
Author

abathur commented Nov 2, 2018

It looks like python/cpython#3249 may eventually yield a fix on the Python side.

@uncojohnco
Copy link

uncojohnco commented Nov 12, 2018

@uncojohnco
Copy link

HI @abathur - I tried to create a repo, as per below, and I couldn't seem to reproduce the error =\

app.py

import os
from flask import Flask, flash, request, redirect, url_for
from werkzeug.utils import secure_filename


UPLOAD_FOLDER = './uploads'
ALLOWED_EXTENSIONS = set(['txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'])

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


def allowed_file(filename):
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route('/', methods=['GET', 'POST'])
def upload_file():

    if request.method == 'POST':
        # check if the post request has the file part
        if 'file' not in request.files:
            flash('No file part')
            return redirect(request.url)

        file = request.files['file']

        # if user does not select file, browser also
        # submit an empty part without filename
        if file.filename == '':
            flash('No selected file')
            return redirect(request.url)

        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            return redirect(url_for('uploaded_file', filename=filename))


    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form method=post enctype=multipart/form-data>
      <input type=file name=file>
      <input type=submit value=Upload>
    </form>
    '''


from flask import send_from_directory

@app.route('/uploads/<filename>')
def uploaded_file(filename):
    return send_from_directory(app.config['UPLOAD_FOLDER'], filename)


if __name__ == '__main__':
    app.run()

Snippet to generate a large file

dd if=/dev/zero of=filename bs=1024 count=2M
Ubuntu - 18.10
Python -  3.6.7rc1

> pip freeze
Click==7.0
Flask==1.0.2
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
Werkzeug==0.14.1

/cc @davidism

@abathur
Copy link
Author

abathur commented Nov 13, 2018

Don't have much time to look at the moment, but I had trouble reproducing it just now as well. I'm on the road at the moment and preparing for a flight in the morning, but I'll try to remember to see if I can reproduce the initial conditions later this week.

@jdalegonzalez Do you have some notes on your run-in with this issue that might help fill in the picture in the interim?

@abathur
Copy link
Author

abathur commented Nov 14, 2018

Apologies for the difficult reproduction case. I've confirmed that we're still seeing the issue and spent a little time this afternoon whittling it down to a reasonably terse reproduction.

In the process I did manage to spot a potential workaround. I haven't tested the workaround extensively, so I don't know if it'll cause problems of its own (especially given the report by @jdalegonzalez that platform may be a factor here).

Dependencies:

pip install flask==1.0.2 pytest pytest-flask

Test:

import pytest, io, csv
from flask import Flask, request


def fields(ob):
    return csv.DictReader(ob).fieldnames[0]


@pytest.fixture(scope="session")
def app():
    x = Flask(__name__)
    x.testing = True

    @x.route("/textio", methods=["POST"])
    def upload_file_textio():
    	"""succeeds with 0.12.2; fails with werkzeug==0.14.1;"""
        return fields(io.TextIOWrapper(request.files["csv"]))

    @x.route("/stringio", methods=["POST"])
    def upload_file_stringio():
    	"""potential workaround; succeeds for both versions"""
        return fields(io.StringIO(request.files["csv"].read().decode()))

    with x.app_context():
        yield x


@pytest.mark.parametrize("uri", ["/textio", "/stringio"])
def test_werkzeug_spooled_temp_file(client, uri):
    what = client.post(
        uri,
        data={"csv": (io.BytesIO(b"my file contents"), "test.csv")},
        content_type="multipart/form-data",
    )
    assert what.data == b"my file contents"

@uncojohnco
Copy link

HI @abathur,

I had a quick look at executing your pytest repo and at a glance the below output looks to be reproducible of the issue. I'll try having a look further in the next few days, but I'm a newbie in regards to flask, pytests and fixtures 😃

Testing started at 20:35 ...
/Users/jayCee/PycharmProjects/test_werkzeug_spooled_temp_file/venv/bin/python "/Users/jayCee/Library/Application Support/JetBrains/Toolbox/apps/PyCharm-P/ch-1/182.4505.26/PyCharm.app/Contents/helpers/pycharm/_jb_pytest_runner.py" --target test_werkzeug_spooled_temp.py::test_werkzeug_spooled_temp_file
Launching pytest with arguments test_werkzeug_spooled_temp.py::test_werkzeug_spooled_temp_file in /Users/jayCee/PycharmProjects/test_werkzeug_spooled_temp_file/test

============================= test session starts ==============================
platform darwin -- Python 3.6.6, pytest-4.0.0, py-1.7.0, pluggy-0.8.0
rootdir: /Users/jayCee/PycharmProjects/test_werkzeug_spooled_temp_file/test, inifile:
plugins: flask-0.14.0collected 2 items

test_werkzeug_spooled_temp.py F
test_werkzeug_spooled_temp.py:27 (test_werkzeug_spooled_temp_file[/textio])
client = <FlaskClient <Flask 'test_werkzeug_spooled_temp'>>, uri = '/textio'

    @pytest.mark.parametrize("uri", ["/textio", "/stringio"])
    def test_werkzeug_spooled_temp_file(client, uri):
    
        what = client.post(
            uri,
            data={"csv": (io.BytesIO(b"my file contents"), "test.csv")},
>           content_type="multipart/form-data",
        )

test_werkzeug_spooled_temp.py:34: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../venv/lib/python3.6/site-packages/werkzeug/test.py:840: in post
    return self.open(*args, **kw)
../venv/lib/python3.6/site-packages/flask/testing.py:200: in open
    follow_redirects=follow_redirects
../venv/lib/python3.6/site-packages/werkzeug/test.py:803: in open
    response = self.run_wsgi_app(environ, buffered=buffered)
../venv/lib/python3.6/site-packages/werkzeug/test.py:716: in run_wsgi_app
    rv = run_wsgi_app(self.application, environ, buffered=buffered)
../venv/lib/python3.6/site-packages/werkzeug/test.py:923: in run_wsgi_app
    app_rv = app(environ, start_response)
../venv/lib/python3.6/site-packages/flask/app.py:2309: in __call__
    return self.wsgi_app(environ, start_response)
../venv/lib/python3.6/site-packages/flask/app.py:2295: in wsgi_app
    response = self.handle_exception(e)
../venv/lib/python3.6/site-packages/flask/app.py:1741: in handle_exception
    reraise(exc_type, exc_value, tb)
../venv/lib/python3.6/site-packages/flask/_compat.py:35: in reraise
    raise value
../venv/lib/python3.6/site-packages/flask/app.py:2292: in wsgi_app
    response = self.full_dispatch_request()
../venv/lib/python3.6/site-packages/flask/app.py:1815: in full_dispatch_request
    rv = self.handle_user_exception(e)
../venv/lib/python3.6/site-packages/flask/app.py:1718: in handle_user_exception
    reraise(exc_type, exc_value, tb)
../venv/lib/python3.6/site-packages/flask/_compat.py:35: in reraise
    raise value
../venv/lib/python3.6/site-packages/flask/app.py:1813: in full_dispatch_request
    rv = self.dispatch_request()
../venv/lib/python3.6/site-packages/flask/app.py:1799: in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
test_werkzeug_spooled_temp.py:17: in upload_file_textio
    return fields(io.TextIOWrapper(request.files["csv"]))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <FileStorage: 'test.csv' ('text/csv')>, name = 'readable'

    def __getattr__(self, name):
>       return getattr(self.stream, name)
E       AttributeError: 'SpooledTemporaryFile' object has no attribute 'readable'

../venv/lib/python3.6/site-packages/werkzeug/datastructures.py:2745: AttributeError
.                                         [100%]
=================================== FAILURES ===================================
___________________ test_werkzeug_spooled_temp_file[/textio] ___________________
...
# truncated duplicate above output ...

Environ

macOS Mojave 10.14
Python 3.6.6

pip freeze
atomicwrites==1.2.1
attrs==18.2.0
Click==7.0
Flask==1.0.2
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
more-itertools==4.3.0
pluggy==0.8.0
py==1.7.0
pytest==4.0.0
pytest-flask==0.14.0
six==1.11.0
Werkzeug==0.14.1

@uncojohnco
Copy link

uncojohnco commented Nov 21, 2018

Hi,

It looks like the introduction of SpooledTemporaryFile from the below PRs is the source of this issue...

PR: #1198: Support chunked transfer encoding

PR: #1222 - Added support for platforms lacking spooled temp files


Commenting out the implementation of SpooledTemporaryFile - seems to pass the test-code #1344 (comment)

def default_stream_factory(total_content_length, filename, content_type,
                           content_length=None):
    """The stream factory that is used per default."""
    max_size = 1024 * 500
#    if SpooledTemporaryFile is not None:
#        return SpooledTemporaryFile(max_size=max_size, mode='wb+')
    if total_content_length is None or total_content_length > max_size:
        return TemporaryFile('wb+')
return BytesIO()

It looks like from @abathur links - there is an issue with the SpooledTemporaryFile class as it missing the implementation of "assumed" abstract attributes of IOBase. Also, this would impact cases of files that are less that 512 kilobytes(?)

Applying the below monkey patch to FormDataParser.default_stream_factory looks to pass the test-code #1344 (comment):

def default_stream_factory(total_content_length, filename, content_type,
                           content_length=None):
    """The stream factory that is used per default."""
    max_size = 1024 * 500
    if SpooledTemporaryFile is not None:
        monkeypatch_SpooledTemporaryFile = SpooledTemporaryFile(max_size=max_size, mode='wb+')
        monkeypatch_SpooledTemporaryFile.readable = monkeypatch_SpooledTemporaryFile._file.readable
        monkeypatch_SpooledTemporaryFile.writable = monkeypatch_SpooledTemporaryFile._file.writable
        monkeypatch_SpooledTemporaryFile.seekable = monkeypatch_SpooledTemporaryFile._file.seekable
        return monkeypatch_SpooledTemporaryFile
    if total_content_length is None or total_content_length > max_size:
        return TemporaryFile('wb+')
    return BytesIO()

or maybe...

class SpooledTemporaryFile_Patched(SpooledTemporaryFile): 
    """Patch for `SpooledTemporaryFile exceptions on file upload #1344`
      - SpooledTemporaryFile does not fully satisfy the abstract for IOBase - https://bugs.python.org/issue26175 
      - bpo-26175: Fix SpooledTemporaryFile IOBase abstract - https://github.com/python/cpython/pull/3249

     TODO: Remove patch once `bpo-26175: Fix SpooledTemporaryFile IOBase abstract` is resolved..."""

    def readable(self):
        return self._file.readable

    def writable(self):
        return self._file.writable

    def seekable(self):
        return self._file.seekable

...

def default_stream_factory(total_content_length, filename, content_type, content_length=None):
    """The stream factory that is used per default.

    Patch: `SpooledTemporaryFile exceptions on file upload #1344`, Remove once `bpo-26175: Fix SpooledTemporaryFile IOBase abstract` is resolved...
      - SpooledTemporaryFile does not fully satisfy the abstract for IOBase - https://bugs.python.org/issue26175 
      - bpo-26175: Fix SpooledTemporaryFile IOBase abstract - https://github.com/python/cpython/pull/3249"""
    max_size = 1024 * 500
    if SpooledTemporaryFile is not None:
        # TODO: Remove patch once `bpo-26175: Fix SpooledTemporaryFile IOBase abstract` is resolved...
        SpooledTemporaryFile = SpooledTemporaryFile_Patched
        return SpooledTemporaryFile_Patched(max_size=max_size, mode='wb+')
    if total_content_length is None or total_content_length > max_size:
        return TemporaryFile('wb+')
    return BytesIO()

What would be the best course of action?

  • Implement a monkey patch for SpooledTemporaryFile? (There is probably a better way to do this...)
  • Avoid the implementation of SpooledTemporaryFile, until a fix is released in a later ver of python and add a redirection depending on the python ver?
  • I'm still a newbie to this domain, I'm guessing there are other options? :)

@pallets pallets deleted a comment from uncojohnco Nov 27, 2018
@davidism
Copy link
Member

davidism commented Nov 27, 2018

We used to not use SpooledTemporaryFile, but it seemed like a built-in way to do what the original code was doing. It's caused a lot of problems since. We should probably just go back to not using SpooledTemporaryFile.

@mjpieters
Copy link

😞 I note that bpo-26175 has a pull request in progress now, at least, see python/cpython#3249

@davidism
Copy link
Member

davidism commented Nov 29, 2018

Chat with @mjpieters brought up a different idea. We could change FileStorage.__getattr__ to try stream._file if the attr doesn't exist on stream but it has a _file attribute.

class FileStorage:
    def __getattr__(self, name):
        try:
            return getattr(self.stream, name)
        except AttributeError:
            if hasattr(self.stream, "_file"):
                return getattr(self.stream._file, name)
            raise

@bl4ckb1rd
Copy link

Hi, today upgrade flask to 1.0.2 with need werkzeug >= 0.14

I've this same error with the last version of werkzeug (0.14.1) with python 3.6.7 (that come with ubuntu 18.04):

AttributeError: 'SpooledTemporaryFile' object has no attribute 'readable'

Do you plan to port the fix to version 0.14?

Thank you.

@davidism
Copy link
Member

The next version will be 0.15. Please follow this repo, our blog, or https://twitter.com/PalletsTeam for release announcements.

kgaughan added a commit to kgaughan/docserver that referenced this issue Mar 4, 2019
There's an issue with file uploads that makes them incompatible with
ZipFile. 0.12.2 is unaffected, and 0.15 will have the fix.

See: pallets/werkzeug#1344
abrasive added a commit to abrasive/hostthedocs that referenced this issue Aug 21, 2019
In newer versions of Werkzeug the uploaded file's stream is a
SpooledTemporaryFile. That class doesn't implement the .seekable method,
which is called by the Python zipfile library, and everything falls
over.

This patch avoids the issue by using the FileStorage class directly; as
of Werkzeug 0.15 it proxies this method correctly to whatever
object is backing the SpooledTemporaryFile. (At least according to the
relevant GH issue: pallets/werkzeug#1344 ).
abrasive added a commit to abrasive/hostthedocs that referenced this issue Aug 21, 2019
In newer versions of Werkzeug the uploaded file's stream is a
SpooledTemporaryFile. That class doesn't implement the .seekable method,
which is called by the Python zipfile library, and everything falls
over.

This patch avoids the issue by using the FileStorage class directly; as
of Werkzeug 0.15 it proxies this method correctly to whatever
object is backing the SpooledTemporaryFile. (At least according to the
relevant GH issue: pallets/werkzeug#1344 ).
abrasive added a commit to abrasive/hostthedocs that referenced this issue Aug 22, 2019
In newer versions of Werkzeug the uploaded file's stream is a
SpooledTemporaryFile. That class doesn't implement the .seekable method,
which is called by the Python zipfile library, and everything falls
over.

This patch avoids the issue by using the FileStorage class directly; as
of Werkzeug 0.15 it proxies this method correctly to whatever
object is backing the SpooledTemporaryFile. (At least according to the
relevant GH issue: pallets/werkzeug#1344 ).
J535D165 added a commit to asreview/asreview that referenced this issue Oct 23, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
6 participants