Emoji-Emotion dataset: emoji-centric NLP resources based on Twitter data

abushoeb/EmoTag

EmoTag1200 dataset for emoji emotion

EmoTag 👍 😄

Emoji-centric NLP resources based on Twitter data

About

EmoTag is a collection of resources for analyzing the emotion and sentiment of emojis and of tweets written in English. The name EmoTag reflects its use of emojis for emotional tagging.

EmoTag Resources

  • Baseline Emoji Emotion Scores: 1200 human-annotated emoji-emotion pairs. Each of the 150 most popular Twitter emojis has a score between 0 and 1 for each of 8 emotion classes (i.e. anger, anticipation, disgust, fear, joy, sadness, surprise, and trust), giving 150 × 8 = 1200 pairs. [Download Scores] [Download Details]

  • Interpretable Word Vectors: 620-dimensional vector representations of words and emojis, trained on ~20.8 million emoji-centric tweets. [Download]

  • Raw Tweets: The tweet IDs of the ~20.8 million tweets used in our experiments. Please contact us if you need additional samples. [Download All Tweet IDs]

  • Word-Emoji Co-occurrence Frequencies: This lexicon provides word-emoji co-occurrence frequencies observed in our dataset. [Download]

  • Emoji-Emoji Co-occurrence Frequencies: The subset of the previous lexicon (i.e. word-emoji co-occurrences) that contains only the emoji-emoji co-occurrence counts observed in our dataset. It is useful for quickly finding emojis that co-occur. [Download]
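
As an illustration of how the baseline emotion scores might be used, here is a minimal Python sketch that reads per-emoji scores and looks up the strongest emotion for an emoji. The column layout (`emoji` followed by the eight emotion names) and the sample values are assumptions made for this example, not the actual format or contents of the downloadable file; `load_scores`, `dominant_emotion`, and `emojis_above` are hypothetical helper names.

```python
import csv
import io

# The eight emotion classes listed in the resource description.
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]


def load_scores(fileobj):
    """Read rows of `emoji` plus eight emotion scores into a nested dict.

    Assumes a CSV header named `emoji` followed by the emotion names;
    the real file may be laid out differently.
    """
    scores = {}
    for row in csv.DictReader(fileobj):
        scores[row["emoji"]] = {e: float(row[e]) for e in EMOTIONS}
    return scores


def dominant_emotion(scores, emoji):
    """Return the emotion with the highest score for the given emoji."""
    per_emotion = scores[emoji]
    return max(per_emotion, key=per_emotion.get)


def emojis_above(scores, emotion, threshold):
    """List emojis whose score for `emotion` is at least `threshold`."""
    return [e for e, s in scores.items() if s[emotion] >= threshold]


# Made-up rows for illustration only; these are NOT values from EmoTag.
SAMPLE_ROWS = [
    "emoji,anger,anticipation,disgust,fear,joy,sadness,surprise,trust",
    "😄,0.0,0.4,0.0,0.0,0.9,0.0,0.3,0.5",
    "😡,0.9,0.2,0.6,0.1,0.0,0.3,0.1,0.0",
]

scores = load_scores(io.StringIO("\n".join(SAMPLE_ROWS)))
print(dominant_emotion(scores, "😄"))    # joy (with the made-up values above)
print(emojis_above(scores, "trust", 0.4))
```

To use the sketch with the downloaded scores, open the file with `encoding="utf-8"` and pass the file object to `load_scores`, adjusting the column names to match the actual header.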

Relevant Papers and Citation

Please cite the following paper if using any of our resources in an academic publication:

Contact
