Massively Multilingual NMT
Link to paper: https://arxiv.org/abs/1903.00089
Neural Machine Translation (NMT) is the task of using neural networks to translate text from one language to another. Google Translate is famously good at this, and it is where the latter two authors of this paper work. NMT is difficult largely due to the plethora of ways humans choose to communicate with one another.
Dialects, jargon, slang: all of these feed into the challenge of performing NMT, even before one considers the sheer number of languages that are spoken.
Improving tf-idf weighted document vector embedding
Link to paper: https://arxiv.org/abs/1902.09875
In this work, Schmidt provides a simple and coherent derivation of what he calls an "optimal embedding" for a document. Optimality is defined such that the embedding maximizes the similarity of related documents in downstream tasks. For example, as an employee at TripAdvisor, he would like to group reviews about the same location.
Schmidt approaches the calculation of an optimal embedding through a weighted sum of word2vec skip-gram embeddings.
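The core idea can be sketched in a few lines. The snippet below is a minimal illustration of a tf-idf weighted average of word vectors, not Schmidt's exact derivation: the corpus, the embedding table, and the smoothed idf formula are all toy assumptions standing in for real skip-gram vectors and a real tokenizer.

```python
import math
from collections import Counter

# Toy review corpus and toy 2-d "word2vec-style" vectors
# (hypothetical values, standing in for trained skip-gram embeddings).
corpus = [
    ["great", "hotel", "great", "view"],
    ["terrible", "hotel", "noisy", "room"],
    ["great", "view", "quiet", "room"],
]
embeddings = {
    "great":    [0.9, 0.1],
    "hotel":    [0.2, 0.8],
    "view":     [0.7, 0.3],
    "terrible": [-0.8, 0.1],
    "noisy":    [-0.5, 0.6],
    "quiet":    [0.4, 0.2],
    "room":     [0.1, 0.7],
}

def idf(word, docs):
    # Smoothed inverse document frequency.
    df = sum(1 for d in docs if word in d)
    return math.log(len(docs) / (1 + df)) + 1.0

def doc_vector(doc, docs):
    # tf-idf weighted sum of word vectors, normalized by total weight,
    # so rare discriminative words contribute more than common ones.
    counts = Counter(doc)
    total = [0.0, 0.0]
    weight_sum = 0.0
    for word, tf in counts.items():
        w = tf * idf(word, docs)
        total = [t + w * v for t, v in zip(total, embeddings[word])]
        weight_sum += w
    return [t / weight_sum for t in total]
```

With these toy vectors, the positive review (doc 0) and the negative review (doc 1) end up far apart along the first dimension, which is the kind of separation one would want before clustering reviews by location or sentiment.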