Formality Style Transfer
Link to paper: https://arxiv.org/abs/1903.06353
The two sentences:
“Gotta see both sides of the story”
“You have to consider both sides of the story”
convey the same content, but one is more formal than the other. Performing the “translation” from one to the other can be framed as a style transfer task. Style transfer is usually associated with the artistic domain, which makes this NLP paper particularly interesting.
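Framed as translation, the task can be attacked with the same sequence-to-sequence machinery used for machine translation. Here is a minimal PyTorch sketch; the parallel corpus, vocabulary, and model sizes are invented for illustration and are not taken from the paper:

```python
import torch
from torch import nn

# Toy parallel corpus of (informal, formal) sentences as token ids over a
# shared, invented vocabulary of size 100. Real work would train on a
# corpus of human-written formality pairs.
pairs = [
    (torch.tensor([[5, 9, 2, 7]]), torch.tensor([[5, 11, 3, 2, 7]])),
]

embed = nn.Embedding(100, 32)
seq2seq = nn.Transformer(d_model=32, nhead=4, num_encoder_layers=2,
                         num_decoder_layers=2, batch_first=True)
project = nn.Linear(32, 100)
params = [*embed.parameters(), *seq2seq.parameters(), *project.parameters()]
optimizer = torch.optim.Adam(params, lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for informal, formal in pairs:
    # Teacher forcing: the decoder reads the formal sentence shifted right
    # and is trained to predict the next formal token at each position.
    decoder_in, target = formal[:, :-1], formal[:, 1:]
    hidden = seq2seq(embed(informal), embed(decoder_in))
    loss = loss_fn(project(hidden).flatten(0, 1), target.flatten())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```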
To tune or not to tune?
Link to paper: https://arxiv.org/abs/1903.05987
When building deep learning NLP models, a common way to speed up training is to get someone else to do most of the work for you. That is, transfer learn from a pre-trained model. Good pre-trained models are those that have been exposed to a wide variety of text, so their representation of semantic/syntactic space is useful for many tasks. One can adapt these models in two ways: feature extraction, where the pre-trained weights are frozen and used only to compute representations for a new task-specific model, or fine-tuning, where the pre-trained weights themselves are updated on the target task.
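To make the two options concrete, here is a minimal PyTorch sketch; the encoder is a hypothetical stand-in for a real pre-trained model, and the sizes and learning rates are illustrative only:

```python
import torch
from torch import nn

# Hypothetical stand-in for a real pre-trained encoder (ELMo/BERT-style);
# imagine its weights were loaded from a checkpoint.
encoder = nn.Sequential(nn.Embedding(10000, 256), nn.Linear(256, 256))
head = nn.Linear(256, 2)  # small task-specific classifier head

# Option 1: feature extraction. Freeze the pre-trained weights and train
# only the new head on top of the fixed representations.
for p in encoder.parameters():
    p.requires_grad = False
feature_extraction_opt = torch.optim.Adam(head.parameters(), lr=1e-3)

# Option 2: fine-tuning. Update everything, typically with a smaller
# learning rate on the pre-trained weights than on the new head.
for p in encoder.parameters():
    p.requires_grad = True
fine_tuning_opt = torch.optim.Adam([
    {"params": encoder.parameters(), "lr": 1e-5},
    {"params": head.parameters(), "lr": 1e-3},
])
```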
Seeing language through character-level taggers
Link to paper: https://arxiv.org/abs/1903.05041
This is a short-and-sweet paper that answers a well-defined question: do a language’s morphology and orthography (the way its words are formed and written down) change the way in which recurrent models encode it?
To perform part-of-speech (POS) tagging (identifying nouns, adjectives, etc.), modellers typically employ recurrent neural nets such as LSTMs to read through the text and estimate, for each word, the probability that it belongs to a given class.
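As a concrete (if toy-sized) illustration of that setup, here is a minimal PyTorch tagger sketch. The vocabulary, dimensions, and tag count are invented; the ids could index words or, as in the paper, characters:

```python
import torch
from torch import nn

class RecurrentTagger(nn.Module):
    """Minimal BiLSTM tagger with invented sizes."""
    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=128, n_tags=17):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, n_tags)

    def forward(self, token_ids):
        states, _ = self.lstm(self.embed(token_ids))  # read through the text
        return self.out(states)  # (batch, seq_len, n_tags) tag scores

tagger = RecurrentTagger()
sentence = torch.randint(0, 5000, (1, 6))      # one six-token sentence
tag_probs = tagger(sentence).softmax(dim=-1)   # per-token distribution over tags
print(tag_probs.argmax(dim=-1))                # most likely tag for each token
```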
Reinforcement Learned Curricula for NMT
Link to paper: https://arxiv.org/abs/1903.00041
An interesting aspect of training deep NLP models is that the order in which you present training data to them matters. This can be very useful for transfer learning: train the model on a lot of general data from a large corpus, then fine-tune it on less noisy, more in-domain data that you really want it to understand. This way, modellers can take advantage of both a wide understanding of semantics and syntax and a narrower, in-depth understanding of the task at hand.
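Here is a minimal sketch of that two-stage ordering, with a toy model and synthetic data standing in for real corpora (nothing below is from the paper itself, which goes further and learns the curriculum with reinforcement learning):

```python
import torch
from torch import nn

# Tiny placeholder model and synthetic "corpora": the point is the
# ordering of the data, not the architecture.
model = nn.Linear(8, 8)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

general_corpus = [(torch.randn(4, 8), torch.randn(4, 8)) for _ in range(100)]
in_domain_corpus = [(torch.randn(4, 8), torch.randn(4, 8)) for _ in range(20)]

def train_on(batches):
    for src, tgt in batches:
        optimizer.zero_grad()
        loss_fn(model(src), tgt).backward()
        optimizer.step()

# Stage 1: lots of general data for broad coverage of semantics and syntax.
for _ in range(3):
    train_on(general_corpus)
# Stage 2: fine-tune on the cleaner, in-domain data you care about.
for _ in range(10):
    train_on(in_domain_corpus)
```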
Making speech recognition datasets at scale with YouTube
Link to paper: https://arxiv.org/abs/1903.00216
Speech recognition is a messy problem. Audio data can suffer from all sorts of complications: multiple speakers to differentiate between, background noise, and compression artifacts all come to mind. On top of these, training a machine learning model to perform the “translation” between audio and text typically requires word- or phoneme-level alignments between the two. Producing such alignments at high quality and at scale is hard, and it’s exactly what this paper is about.
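For a flavour of the problem, here is one simple, illustrative trick (an assumption on my part, not the paper’s pipeline): run an existing recognizer over the audio and trust only the spans where its output agrees with the human-written caption.

```python
import difflib

# Human caption vs. the (imperfect) output of an existing recognizer.
caption = "you have to consider both sides of the story".split()
asr_out = "you have to um consider both sides the story".split()

# Keep only the word spans where the two transcripts agree; these
# serve as trusted anchors for aligning text to audio.
matcher = difflib.SequenceMatcher(a=caption, b=asr_out)
trusted = [caption[m.a : m.a + m.size]
           for m in matcher.get_matching_blocks() if m.size]
print(trusted)
# [['you', 'have', 'to'], ['consider', 'both', 'sides'], ['the', 'story']]
```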