@rkk-ableton rkk-ableton commented Aug 24, 2021

Summary

This PR brings Ableton's cucumber-cpp up to date with an upstream change that allows non-Basic Latin characters in step definitions.

Details

cucumber/cucumber-cpp#224 was merged upstream, allowing step definitions to use non-Basic Latin characters. This lets us write acceptance tests that guard against regressions in our support for languages whose characters are encoded as multi-byte sequences in UTF-8.

Motivation and Context

We want the ability to test scenarios which involve non-Basic Latin characters.

How Has This Been Tested?

The code change was merged upstream following the cucumber-cpp project's testing process.

Types of changes

  • Bug fix (non-breaking change which fixes an issue).
  • New feature (non-breaking change which adds functionality).
  • Breaking change (fix or feature that would cause existing functionality to not work as expected).

Checklist:

  • It is my own work, its copyright is implicitly assigned to the project and no substantial part of it has been copied from other sources (including Stack Overflow). On rare occasions this is acceptable, like in CMake modules where the original copyright information should be kept.
  • I'm using the same code standards as the existing code (indentation, spacing, variable naming, ...).
  • I've added tests for my code.
  • I have verified whether my change requires changes to the documentation.
  • My change either requires no documentation change or I've updated the documentation accordingly.
  • My branch has been rebased to master, keeping only relevant commits.

K. Mlynarczyk and others added 10 commits June 21, 2019 11:08
json_spirit's escaping of multibyte characters creates bugs in the WireProtocol
which prevent usage of valid UTF-8 encoded characters in step definitions
relying on RegEx.

The new tests:
/features/specific/wire_encoding.feature
/tests/integration/WireProtocolTest.cpp
/tests/unit/RegexTest.cpp

This change updates json_spirit to the latest public version:
https://www.codeproject.com/KB/recipes/JSON_Spirit/json_spirit_v4.08.zip

Version 4.08 adds support for a raw_utf8 option when writing a JSON string.
Previously, multibyte characters were escaped when sent from
cucumber-cpp to cucumber-ruby. Because cucumber-ruby's wire decoder
does not properly decode escaped character sequences, this would crash
cucumber-ruby.

This modifies the WireResponseEncoder to always use the raw_utf8
option provided by the new version of json_spirit.

According to IETF RFC 8259:
"JSON text exchanged between systems that are not part of a closed
ecosystem MUST be encoded using UTF-8 [RFC3629]."
https://tools.ietf.org/html/rfc8259

`cucumber-ruby` expects position values which are based on the index of
the code point instead of the index of the code unit. This change modifies
the value returned to `cucumber-ruby` accordingly.

Prior to this change, the RegexSubMatch's position, which was correct in
terms of a code unit array, would cause an `index out of string` error and
crash cucumber-ruby when pretty-printing the results of a test.

This commit also amends the added tests to demonstrate the corrected
behavior.
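
To illustrate the difference between a code unit index and a code point index, here is a minimal sketch. It is not the code from cucumber-cpp: `codePointIndex` is a hypothetical helper, and it assumes the input is valid UTF-8 and that the byte offset falls on a character boundary. It counts every byte that is not a UTF-8 continuation byte (`10xxxxxx`), since each such byte starts a new code point.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of cucumber-cpp): converts a byte offset
// (code unit index) into a UTF-8 string to the matching code point index.
// Assumes `utf8` is valid UTF-8 and `byteOffset` lies on a character boundary.
std::size_t codePointIndex(const std::string& utf8, std::size_t byteOffset) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < byteOffset && i < utf8.size(); ++i) {
        // Continuation bytes have the bit pattern 10xxxxxx; every other
        // byte begins a new code point.
        if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the step text `"\xC3\xA9t\xC3\xA9 42"` ("été 42"), a match on `42` begins at byte offset 6 but at code point index 4, because each `é` occupies two bytes. A consumer that indexes by code point would read past the end of the string if handed the byte offset of a match near the end of a longer non-ASCII step.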
…-upstream

Support step definitions with multi-byte characters
@ala-ableton ala-ableton left a comment

LGTM 👍🏻
When you want to merge, please do not press the "Merge pull request" button (and do not use @ablbot merge). Instead, push edf5a29 directly to ableton-master.

@rkk-ableton rkk-ableton merged commit edf5a29 into ableton-master Aug 24, 2021
@ala-ableton ala-ableton deleted the sync-upstream branch August 24, 2021 13:16