[BEAM-10785] Change RowAsDictJsonCoder to not ensure ASCII while encoding (#22312)

* Change RowAsDictJsonCoder to not ensure ASCII while encoding

Signed-off-by: Seunghwan Hong <[email protected]>

* Format code, Refactor test for readability

Signed-off-by: Seunghwan Hong <[email protected]>

Signed-off-by: Seunghwan Hong <[email protected]>
Co-authored-by: Pablo <[email protected]>
harrydrippin and pabloem authored Sep 30, 2022
1 parent cc623db commit 31dab81
Showing 3 changed files with 12 additions and 1 deletion.
1 change: 1 addition & 0 deletions CHANGES.md
@@ -61,6 +61,7 @@
 * Support for X source added (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).
 * Decreased TextSource CPU utilization by 2.3x (Java) ([#23193](https://github.com/apache/beam/issues/23193)).
 * Fixed bug when using SpannerIO with RuntimeValueProvider options (Java) ([#22146](https://github.com/apache/beam/issues/22146)).
+* Fixed issue for unicode rendering on WriteToBigQuery ([#10785](https://github.com/apache/beam/issues/10785))

 ## New Features / Improvements

5 changes: 4 additions & 1 deletion sdks/python/apache_beam/io/gcp/bigquery_tools.py
@@ -1538,7 +1538,10 @@ def encode(self, table_row):
     # to the programmer that they have used NAN/INF values.
     try:
       return json.dumps(
-          table_row, allow_nan=False, default=default_encoder).encode('utf-8')
+          table_row,
+          allow_nan=False,
+          ensure_ascii=False,
+          default=default_encoder).encode('utf-8')
     except ValueError as e:
       raise ValueError(
           '%s. %s. Row: %r' % (e, JSON_COMPLIANCE_ERROR, table_row))
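
For readers unfamiliar with the flag, the effect of `ensure_ascii=False` can be reproduced with the standard-library `json` module alone. The following is a minimal sketch, not the Beam coder itself: it omits the `default=default_encoder` argument and the `ValueError` wrapping, and the variable names `row`, `escaped`, and `raw` are illustrative only.

```python
import json

row = {'s': '🎉'}

# Default behaviour (ensure_ascii=True): non-ASCII characters are escaped
# as \uXXXX sequences, here a UTF-16 surrogate pair.
escaped = json.dumps(row, allow_nan=False).encode('utf-8')

# With ensure_ascii=False the character is kept as-is in the JSON text and
# then encoded to its four UTF-8 bytes.
raw = json.dumps(row, allow_nan=False, ensure_ascii=False).encode('utf-8')

print(escaped)  # b'{"s": "\\ud83c\\udf89"}'
print(raw)      # b'{"s": "\xf0\x9f\x8e\x89"}'

# Both forms are valid JSON and decode to the same row.
assert json.loads(escaped) == json.loads(raw) == row
```

Both outputs round-trip to the same dictionary. The difference lies only in the bytes written, which is what the new `test_ensure_ascii` case asserts on.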
7 changes: 7 additions & 0 deletions sdks/python/apache_beam/io/gcp/bigquery_tools_test.py
@@ -1052,6 +1052,13 @@ def test_invalid_json_inf(self):
   def test_invalid_json_neg_inf(self):
     self.json_compliance_exception(float('-inf'))

+  def test_ensure_ascii(self):
+    coder = RowAsDictJsonCoder()
+    test_value = {'s': '🎉'}
+    output_value = b'{"s": "\xf0\x9f\x8e\x89"}'
+
+    self.assertEqual(output_value, coder.encode(test_value))
+

 @unittest.skipIf(HttpError is None, 'GCP dependencies are not installed')
 class TestJsonRowWriter(unittest.TestCase):