- Lecture slides
- Our videos: intro, lecture, seminar
- Lecture videos from Stanford CS224N: intro, embeddings (English)
The practice for this week takes place in notebooks. Just open them and follow the instructions there.
- Seminar:
./seminar.ipynb
- Homework:
./homework.ipynb
Unless explicitly stated otherwise, all subsequent weeks follow the same pattern (a notebook with instructions).
If you have any difficulties with the notebooks, just open them in .
- On hierarchical & sampled softmax estimation for word2vec page
- GloVe project page
- FastText project repo
- Semantic change over time, observed through word embeddings - arxiv
- Another cool link that you could have shared, but hesitated to. Or did you?
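To make the papers below more concrete: skip-gram with negative sampling (the word2vec variant derived in the Mikolov et al. and Goldberg & Levy papers) fits in a few lines of numpy. This is a minimal sketch on a toy corpus — the sentence, dimensions, window size, number of negatives and learning rate are all illustrative, and real training would use subsampling and a unigram noise distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus; real training would use a large tokenized text.
corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
word2id = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 16       # vocabulary size, embedding dimension
window, k, lr = 2, 3, 0.05  # context window, negatives per pair, learning rate

W_in = rng.normal(0.0, 0.1, (V, D))   # center ("input") vectors
W_out = rng.normal(0.0, 0.1, (V, D))  # context ("output") vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# All (center, context) index pairs within the window.
pairs = [
    (word2id[corpus[i]], word2id[corpus[j]])
    for i in range(len(corpus))
    for j in range(max(0, i - window), min(len(corpus), i + window + 1))
    if i != j
]

for _ in range(300):
    for c, o in pairs:
        v = W_in[c].copy()
        # Positive pair: push sigma(u_o . v_c) toward 1.
        g = sigmoid(W_out[o] @ v) - 1.0
        grad_v = g * W_out[o]
        W_out[o] -= lr * g * v
        # k random negatives: push sigma(u_n . v_c) toward 0.
        for n in rng.integers(0, V, size=k):
            gn = sigmoid(W_out[n] @ v)
            grad_v += gn * W_out[n]
            W_out[n] -= lr * gn * v
        W_in[c] -= lr * grad_v

# Observed (center, context) pairs should now score higher than unobserved ones.
scores = sigmoid(W_in @ W_out.T)  # scores[c, o] = sigma(v_c . u_o)
seen = np.zeros((V, V), dtype=bool)
for c, o in pairs:
    seen[c, o] = True
pos_mean, neg_mean = scores[seen].mean(), scores[~seen].mean()
print(pos_mean, neg_mean)
```

The two embedding matrices correspond to the "input" and "output" vectors discussed in the papers; the seminar and homework notebooks use library implementations instead of a hand-rolled loop like this.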
- Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al., 2013 [arxiv]
- Efficient Estimation of Word Representations in Vector Space, Mikolov et al., 2013 [arxiv]
- Distributed Representations of Sentences and Documents, Quoc Le et al., 2014 [arxiv]
- GloVe: Global Vectors for Word Representation, Pennington et al., 2014 [article]
- Enriching word vectors with subword information, Bojanowski et al., 2016 [arxiv] (FastText)
- Word2vec Explained: Deriving Mikolov et al.’s Negative-Sampling Word-Embedding Method, Yoav Goldberg and Omer Levy, 2014 [arxiv]
- Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors, Marco Baroni, Georgiana Dinu and Germán Kruszewski, ACL 2014 [paper]
- The strange geometry of skip-gram with negative sampling, David Mimno and Laure Thompson, EMNLP 2017 [paper]
- Characterizing Departures from Linearity in Word Translation, Ndapa Nakashole and Raphael Flauger, ACL 2018 [paper]
- On the Dimensionality of Word Embedding, Zi Yin and Yuanyuan Shen, NIPS 2018 [arxiv]
- Analogies Explained: Towards Understanding Word Embeddings, Carl Allen and Timothy Hospedales, ICML 2019 [arxiv] [official blog post]
- Exploiting similarities between languages for machine translation, Mikolov et al., 2013 [arxiv]
- Improving vector space word representations using multilingual correlation, Faruqui and Dyer, EACL 2014 [pdf]
- Learning principled bilingual mappings of word embeddings while preserving monolingual invariance, Artetxe et al., EMNLP 2016 [pdf]
- Offline bilingual word vectors, orthogonal transformations and the inverted softmax, Smith et al., ICLR 2017 [arxiv]
- Word Translation Without Parallel Data, Conneau et al., 2018 [arxiv]
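The offline bilingual-mapping papers above (e.g. Artetxe et al., Smith et al.) learn a linear map from source-language to target-language embeddings, often constrained to be orthogonal, which has a closed-form solution via SVD (orthogonal Procrustes). Here is a minimal numpy sketch on synthetic data — the random matrices stand in for dictionaries of aligned word vectors, not for any paper's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for aligned embedding matrices: row i of X is the source-
# language vector of a translation pair, row i of Y its target-language vector.
d = 5
X = rng.normal(size=(100, d))
Q_true, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden "true" rotation
Y = X @ Q_true

# Orthogonal Procrustes: W = argmin ||XW - Y||_F  s.t.  W^T W = I,
# solved in closed form from the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# On this noise-free toy data the hidden rotation is recovered exactly.
print(np.allclose(X @ W, Y))
```

With real embeddings the fit is only approximate, and Smith et al. additionally correct the retrieval step with an inverted softmax; the orthogonality constraint is what preserves monolingual distances, as argued by Artetxe et al.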