Skip to content

Fix invalid characters when truncating PR description for Azure#7140

Closed
mburumaxwell wants to merge 1 commit intodependabot:mainfrom
mburumaxwell:azure-pr-desc-truncate-fix
Closed

Fix invalid characters when truncating PR description for Azure#7140
mburumaxwell wants to merge 1 commit intodependabot:mainfrom
mburumaxwell:azure-pr-desc-truncate-fix

Conversation

@mburumaxwell
Copy link
Copy Markdown
Contributor

When truncating the PR description for Azure, the string is converted from UTF8 to UTF-16, operated on then changed back to UTF-8. The last step fails resulting in a string with an invalid string which cannot be converted to JSON for the request body and the PR cannot be created.

In this PR, I change from force_encoding(...) to encode(...) which seems to solve the problem.
The spec I have added fails when using force_encoding(..) but works with encode(...) as guided by some Stack Overflow post. I may be wrong in this approach but at least someone else can help fix the issue given the test string I added in the spec.

@mburumaxwell mburumaxwell requested a review from a team as a code owner April 21, 2023 16:24
@jeffwidman
Copy link
Copy Markdown
Member

As always @mburumaxwell grateful for you finding/fixing these bugs. ❤️

When truncating the PR description for Azure, the string is converted from UTF8 to UTF-16, operated on then changed back to UTF-8.

Any idea why this conversion has to happen? Is there any reason we couldn't let it stay as UTF-8 the entire way?

I can also dig, but thought you might have already dug into this.

@mburumaxwell
Copy link
Copy Markdown
Contributor Author

Based on the commit history for the file, UTF-16 is used because of emoji and other special characters. I couldn't find any official documentation to back up the reasoning but I left it as is because the commit history shows that at some point using UTF-8 only was not sufficient.

@mburumaxwell
Copy link
Copy Markdown
Contributor Author

bump @jeffwidman

@jeffwidman
Copy link
Copy Markdown
Member

Sorry for the delay here. Thanks for the nice test case. Do you mind rebasing this now that #7487 was merged to cleanup the length handling code?

The problem this PR is tackling is still applicable, just the code location moved slightly...

@mburumaxwell
Copy link
Copy Markdown
Contributor Author

@jeffwidman it would appear that with those changes, we no longer need this PR because force_encoding now works with the description I was testing against. I will close/abandon.

@mburumaxwell mburumaxwell deleted the azure-pr-desc-truncate-fix branch August 1, 2023 14:47
@jeffwidman
Copy link
Copy Markdown
Member

Weird, looking at the diff I wouldn't have expected it to be fixed as all that changed was the location of the code, not the actual underlying calls.

But if it works, I'm not complaining. If you see this again I guess we can always reopen.

@mburumaxwell
Copy link
Copy Markdown
Contributor Author

My guess is that this may have been fixed in the framework or some dependencies since I created the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants