- Lecture slides
- Our videos: intro, lecture, seminar
- Lecture videos from Stanford CS224N: intro, embeddings (English)
The practice for this week takes place in notebooks. Just open them and follow the instructions there.
- Seminar:
./seminar.ipynb
- Homework:
./homework.ipynb
Unless explicitly stated otherwise, all subsequent weeks follow the same pattern (a notebook with instructions).
If you have any difficulties with the notebooks, just open them in .
- On hierarchical & sampled softmax estimation for word2vec page
- GloVe project page
- FastText project repo
- Semantic change over time, observed through word embeddings - arxiv
- Another cool link that you could have shared, but hesitated to. Or did you?
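To make the papers below more concrete: skip-gram with negative sampling (the word2vec variant derived in the Mikolov et al. and Goldberg & Levy papers) fits in a few lines of numpy. This is a minimal sketch on a toy corpus — the sentence, dimensions, window size, number of negatives and learning rate are all illustrative, and real training would use subsampling and a unigram noise distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus; real training would use a large tokenized text.
corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
word2id = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 16       # vocabulary size, embedding dimension
window, k, lr = 2, 3, 0.05  # context window, negatives per pair, learning rate

W_in = rng.normal(0.0, 0.1, (V, D))   # center ("input") vectors
W_out = rng.normal(0.0, 0.1, (V, D))  # context ("output") vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# All (center, context) index pairs within the window.
pairs = [
    (word2id[corpus[i]], word2id[corpus[j]])
    for i in range(len(corpus))
    for j in range(max(0, i - window), min(len(corpus), i + window + 1))
    if i != j
]

for _ in range(300):
    for c, o in pairs:
        v = W_in[c].copy()
        # Positive pair: push sigma(u_o . v_c) toward 1.
        g = sigmoid(W_out[o] @ v) - 1.0
        grad_v = g * W_out[o]
        W_out[o] -= lr * g * v
        # k random negatives: push sigma(u_n . v_c) toward 0.
        for n in rng.integers(0, V, size=k):
            gn = sigmoid(W_out[n] @ v)
            grad_v += gn * W_out[n]
            W_out[n] -= lr * gn * v
        W_in[c] -= lr * grad_v

# Observed (center, context) pairs should now score higher than unobserved ones.
scores = sigmoid(W_in @ W_out.T)  # scores[c, o] = sigma(v_c . u_o)
seen = np.zeros((V, V), dtype=bool)
for c, o in pairs:
    seen[c, o] = True
pos_mean, neg_mean = scores[seen].mean(), scores[~seen].mean()
print(pos_mean, neg_mean)
```

The two embedding matrices correspond to the "input" and "output" vectors discussed in the papers; the seminar and homework notebooks use library implementations instead of a hand-rolled loop like this.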
- Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al., 2013 [arxiv]
- Efficient Estimation of Word Representations in Vector Space, Mikolov et al., 2013 [arxiv]
- Distributed Representations of Sentences and Documents, Quoc Le et al., 2014 [arxiv]
- GloVe: Global Vectors for Word Representation, Pennington et al., 2014 [article]
- Enriching word vectors with subword information, Bojanowski et al., 2016 [arxiv] (FastText)
- Word2vec Explained: Deriving Mikolov et al.’s Negative-Sampling Word-Embedding Method, Yoav Goldberg and Omer Levy, 2014 [arxiv]
- Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors, Marco Baroni, Georgiana Dinu and Germán Kruszewski, ACL 2014 [paper]
- The strange geometry of skip-gram with negative sampling, David Mimno and Laure Thompson, EMNLP 2017 [paper]
- Characterizing Departures from Linearity in Word Translation, Ndapa Nakashole and Raphael Flauger, ACL 2018 [paper]
- On the Dimensionality of Word Embedding, Zi Yin and Yuanyuan Shen, NIPS 2018 [arxiv]
- Analogies Explained: Towards Understanding Word Embeddings, Carl Allen and Timothy Hospedales, ICML 2019 [arxiv] [official blog post]
- Exploiting similarities between languages for machine translation, Mikolov et al., 2013 [arxiv]
- Improving vector space word representations using multilingual correlation, Faruqui and Dyer, EACL 2014 [pdf]
- Learning principled bilingual mappings of word embeddings while preserving monolingual invariance, Artetxe et al., EMNLP 2016 [pdf]
- Offline bilingual word vectors, orthogonal transformations and the inverted softmax, Smith et al., ICLR 2017 [arxiv]
- Word Translation Without Parallel Data, Conneau et al., 2018 [arxiv]
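The offline bilingual-mapping papers above (e.g. Artetxe et al., Smith et al.) learn a linear map from source-language to target-language embeddings, often constrained to be orthogonal, which has a closed-form solution via SVD (orthogonal Procrustes). Here is a minimal numpy sketch on synthetic data — the random matrices stand in for dictionaries of aligned word vectors, not for any paper's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for aligned embedding matrices: row i of X is the source-
# language vector of a translation pair, row i of Y its target-language vector.
d = 5
X = rng.normal(size=(100, d))
Q_true, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden "true" rotation
Y = X @ Q_true

# Orthogonal Procrustes: W = argmin ||XW - Y||_F  s.t.  W^T W = I,
# solved in closed form from the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# On this noise-free toy data the hidden rotation is recovered exactly.
print(np.allclose(X @ W, Y))
```

With real embeddings the fit is only approximate, and Smith et al. additionally correct the retrieval step with an inverted softmax; the orthogonality constraint is what preserves monolingual distances, as argued by Artetxe et al.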