Multi View Sentiment Corpus (MSVC)

The Multi-View Sentiment Corpus (MVSC) consists of a set of 3000 tweets that has been labelled by 3 annotators considering different views related to the same post:

subjectivity/objectivity
sentiment polarity
implicitness/explicitness
presence of irony
emotion

All the labels are given for each annotator!

Moreover, each emoji of a tweet (if present) has been labelled as positive, negative, neutral or topic-related. This corpus has been published at EACL 2017, click here to read the paper.

The MVSC corpus includes two tab-separated files:

MVSC Tweet Corpus.csv
MVSC Emoji Corpus.csv

The annotation has been provided for each A* annotator (A1, A2, A3).

The MVSC Tweet Corpus file is formatted as follow: tweet_id, tweet_text, user_id, keyword, Emotion_A*, Irony_A*, Implicit-Explicit_A*, Subjectivity_A*, Tweet_Sentiment_A*

Each entry contains:

tweet_id is the numerical ID of the tweet;
tweet_text is the UTF-8 text of the tweet;
user_id is the integer representation of the unique identifier for the user who emitted the post;
keyword is the word used to retrieve the tweet, related to two popular movies: deadpool and suicidesquad;
Emotion_A* is the emotion label: anger, anticipation,joy, trust, fear, surprise, sadness, disgust and None;
Irony_A* is the irony label: Ironic and Not Ironic;
Implicit-Explicit_A* is the label regarding the implicitness/explicitness: Explicit, Implicit and None;
Subjective_A* is the label regarding the subjectivity/objectivity: Subjective and Objective;
Tweet_Sentiment_A* is the sentiment polarity label: Positive, Negative and Neutral.

The MVSC Emoji Corpus file is formatted as follow: tweet_id, emoji_position, Tweet_Sent_Emoji_A*

Each entry contains:

tweet_id is the numerical ID of the tweet;
emoji_position is the position of the emoji ordered by appereance (e.g. "2" is related to the emoji that appears as second);
Tweet_Sentiment_A* is the label regarding the sentiment polarity and the topic-relatedness: Topic, Positive, Negative and Neutral.

Licensing

All of the data from Twitter (tweet_id, tweet_text, user_id) is covered by Twitter's Terms of Service: https://dev.twitter.com/terms/api-terms

Authors:

Debora Nozza, University of Milano-Bicocca ([email protected])
Elisabetta Fersini, University of Milano-Bicocca ([email protected])
Enza Messina, University of Milano-Bicocca ([email protected])

If you decide to use this dataset, please cite:

::

inproceedings{nozza-etal-2017-multi,
title = "A Multi-View Sentiment Corpus",
author = "Nozza, Debora  and
  Fersini, Elisabetta  and
  Messina, Enza",
booktitle = "Proceedings of the 15th Conference of the {E}uropean Chapter of the Association for Computational Linguistics: Volume 1, Long Papers",
month = apr,
year = "2017",
address = "Valencia, Spain",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/E17-1026",
pages = "273--280",

}

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
MVSC Emoji Corpus.txt		MVSC Emoji Corpus.txt
MVSC Tweet Corpus.txt		MVSC Tweet Corpus.txt
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi View Sentiment Corpus (MSVC)

About

Releases

Packages

MIND-Lab/MVSC

Folders and files

Latest commit

History

Repository files navigation

Multi View Sentiment Corpus (MSVC)

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages