mturk-emoji

Python + JavaScript workaround for Mechanical Turk's rejection of CSV files that contain Emoji.

Use Case

You want to upload a CSV HIT file that contains Emoji characters to Mechanical Turk, and have Mechanical Turk render the Emoji in an HTML HIT template. But as of early 2018, Amazon's Mechanical Turk service rejects CSV HIT files that contain 4-byte UTF-8 characters, such as Emoji.

When it rejects such a file, Mechanical Turk reports errors in the following format:

Errors
Line 55: Unsupported character found: 😀
Line 226: Unsupported character found: 😱
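
To find the offending characters before uploading, a short Python 3 check along the following lines may help. This is a minimal sketch, not part of this repository: the function name find_four_byte_chars is hypothetical, and it assumes that the characters Mechanical Turk rejects are exactly the code points above U+FFFF (the ones UTF-8 encodes as 4 bytes) and that the reported line numbers are 1-based physical lines.

```python
def find_four_byte_chars(lines):
    """Return (line_number, character) pairs for characters needing 4 bytes in UTF-8.

    `lines` is any iterable of text lines, for example an open text file.
    Hypothetical helper; not part of encode_emoji.py.
    """
    hits = []
    for line_number, line in enumerate(lines, start=1):
        for char in line:
            # Code points above U+FFFF are the ones UTF-8 encodes as 4 bytes.
            if ord(char) > 0xFFFF:
                hits.append((line_number, char))
    return hits
```

It can be called on an open file, for example find_four_byte_chars(open(path, encoding='utf-8')), where path is the CSV HIT file you intend to upload.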

Implemented Solution

This repository contains Python code in encode_emoji.py that converts each 4-byte UTF-8 character into an HTML span, with the character's 4 bytes stored as a JSON array in the span's data-emoji-bytes attribute. Sample usage:

>>> import codecs
>>> from encode_emoji import replace_emoji_characters
>>> grin_emoji = codecs.open('grin_emoji.txt', encoding='utf-8').read().strip()
>>> replace_emoji_characters(grin_emoji)
u"<span class='emoji-bytes' data-emoji-bytes='[240, 159, 152, 128]'></span>"
>>>
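
For readers who want to see the idea behind the conversion, the following Python 3 sketch produces the same span format as the sample output above. It is an illustration under stated assumptions, not the code in encode_emoji.py: the function name encode_four_byte_chars is hypothetical, and the regular-expression approach is one possible way to do it.

```python
import json
import re

# Matches any code point outside the Basic Multilingual Plane; these are
# the characters that UTF-8 encodes as 4 bytes.
_FOUR_BYTE_CHAR = re.compile("[\U00010000-\U0010FFFF]")


def encode_four_byte_chars(text):
    """Replace each 4-byte UTF-8 character in `text` with an emoji-bytes span.

    Hypothetical re-implementation for illustration only.
    """
    def to_span(match):
        # The UTF-8 bytes of the matched character, as a list of integers.
        byte_values = list(match.group(0).encode("utf-8"))
        return "<span class='emoji-bytes' data-emoji-bytes='%s'></span>" % (
            json.dumps(byte_values)
        )

    return _FOUR_BYTE_CHAR.sub(to_span, text)
```

Characters that need 3 bytes or fewer in UTF-8 pass through unchanged, so ordinary CSV text is not affected.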

The repository also contains JavaScript code in decode_emoji.js that finds all HTML spans created by encode_emoji.py and inserts the Emoji character into the span, so that this tag:

<span class='emoji-bytes' data-emoji-bytes='[240, 159, 152, 128]'></span>

is replaced with:

<span class='emoji-bytes' data-emoji-bytes='[240, 159, 152, 128]'>😀</span>

which the browser renders as the original Emoji:

😀
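
The byte list in data-emoji-bytes is the UTF-8 encoding of the character, so the mapping can be checked outside the browser. The Python 3 sketch below mirrors that decoding step for verification purposes only. It is not the logic in decode_emoji.js, and the function name decode_emoji_bytes is hypothetical.

```python
import json


def decode_emoji_bytes(attribute_value):
    """Turn a data-emoji-bytes attribute value back into the character it encodes.

    Hypothetical helper for checking encoded spans; not part of this repository.
    """
    # The attribute holds a JSON array of integers, e.g. "[240, 159, 152, 128]".
    byte_values = json.loads(attribute_value)
    # Interpret those integers as UTF-8 bytes and decode them to text.
    return bytes(byte_values).decode("utf-8")
```

Passing the attribute value from the span above, '[240, 159, 152, 128]', gives back U+1F600, the grinning face Emoji.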

Please note that decode_emoji.js depends on the jQuery library. Sample usage:

<script src="https://code.jquery.com/jquery-3.3.1.js"
        integrity="sha256-2Kok7MbOyxpgUVvAk/HJ2jigOSYS2auK4Pfzbm7uH60="
        crossorigin="anonymous"></script>
<script src="decode_emoji.js"></script>

<script>
$(document).ready(function() {
  $('span.emoji-bytes').each(displayEmoji);
});
</script>

Please see the index.html file for more details.
