Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #4949 UnicodeEncodeError in upload.models.update_from_session #4950

Closed
wants to merge 1 commit into from

Conversation

capooti
Copy link
Member

@capooti capooti commented Sep 24, 2019

Fixes #4949

  • Confirm you have read the contribution guidelines
  • You have sent a Contribution Licence Agreement (CLA) as necessary (not required for small changes, e.g., fixing typos in documentation)
  • Make sure the first PR targets the master branch, eventual backports will be managed later. This can be ignored if the PR is fixing an issue that only happens in a specific branch, but not in newer ones.

The following are required only for core and extension modules (they are welcomed, but not required, for contrib modules):

  • There is a ticket in https://github.com/GeoNode/geonode/issues describing the issue/improvement/feature (a notable exemptions is, changes not visible to end users)
  • [] The issue connected to the PR must have Labels and Milestone assigned
  • PR for bug fixes and small new features are presented as a single commit
  • Commit message must be in the form "[Fixes #<issue_number>] Title of the Issue"
  • New unit tests have been added covering the changes, unless there are explanation on why the tests are not necessary/implemented
  • This PR passes all existing unit tests (test results will be reported by travis-ci after opening this PR)
  • This PR passes the QA checks: flake8 geonode
  • Commits changing the settings, UI, existing user workflows, or adding new functionality, need to include documentation updates

Submitting the PR does not require you to check all items, but by the time it gets merged, they should be either satisfied or inapplicable.

@cla-bot cla-bot bot added the cla-signed CLA Bot: community license agreement signed label Sep 24, 2019
@codecov
Copy link

codecov bot commented Sep 24, 2019

Codecov Report

Merging #4950 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #4950   +/-   ##
=======================================
  Coverage   59.87%   59.87%           
=======================================
  Files         231      231           
  Lines       12431    12431           
  Branches     1799     1799           
=======================================
  Hits         7443     7443           
  Misses       4374     4374           
  Partials      614      614

Copy link
Member

@afabiani afabiani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, I'd prefer to ignore unreadable characters instead of getting rid completely of user infos.
Moreover, it is useful to add tests for such cases. We don't have much regarding encoding.
Give me some time to update a bit this PR and maybe, with the occasion, trying to understand if there are other points of failure due to special characters present on user name. This is not the first time we face a similar issue.

@capooti
Copy link
Member Author

capooti commented Sep 24, 2019

@afabiani the error is caused by the check being done. Removing that two lines won't remove the information from the upload_session and Python handles it without problems even with not ascii characters in my test.
I agree to have a test for this: it would be enough to have a profile fixture having not ascii characters in first_name/last_name for this purpose

@t-book
Copy link
Contributor

t-book commented Sep 24, 2019

... or in case it is good for a thing we currently do not see how about keep it but fail silently?

try:
    upload_session.user.first_name = u'{}'.format(upload_session.user.first_name).decode("utf-8", "replace")
    upload_session.user.last_name = u'{}'.format(upload_session.user.last_name).decode("utf-8", "replace")
except UnicodeEncodeError:
    pass

@afabiani
Copy link
Member

Well, interpreting the code it seems that the original intention was to pickle the user info into the database. The existing check, tries to remove "unpickable" utf-8 characters. Removing them or doing a silent check, means avoid saving completely that info into the DB.

Now, it is possible that it is not used anywhere else later on, but it is worth to double check IMHO.

@afabiani
Copy link
Member

@capooti --> #4958

@capooti
Copy link
Member Author

capooti commented Sep 27, 2019

thanks @afabiani I just merged it
In my case there wasn't the need to sanitize the strings as you did, that is why I was considering just to remove those two lines.
If you have time, could you run CI after removing the two lines as a test?

@afabiani afabiani closed this Sep 30, 2019
@t-book
Copy link
Contributor

t-book commented Sep 30, 2019

Ref: #4935
(Test if this can help here as well)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed CLA Bot: community license agreement signed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UnicodeEncodeError in upload.models.update_from_session method when using Importer
3 participants