Q1 – What are recursive autoencoders? What makes them more general than the RNNs which they generalize?

Q2 – The review paper refers to something called “noise contrastive estimation”. What is it?

Course material for graduate class ift6266h13

Tags: Pierre Luc Carrier

Q1: To illustrate how recursive AEs work, let’s look at how they have been used for natural language processing. An AE can learn a representation of a sequence of 2 words, and the same AE, with the same parameters, can then be applied to any sequence of 2 words. Because the representation it produces has the same dimensionality as each of its inputs, the same AE can also combine a word with the representation of 2 words. Applying it recursively in this way builds a tree that yields representations for arbitrarily long sequences of words. These representations can then be used for many NLP tasks. The recurrent neural network is a special case of this recursive architecture, in which the tree is a simple chain: each step combines the representation of the words seen so far with the next word. Recursive AEs are more general because the tree can have any shape.
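The tree construction described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation from any particular paper: the embedding size `d`, the random (untrained) parameters `W_enc`, `b_enc`, `W_dec`, `b_dec`, the toy vocabulary and the helper names are all assumptions made for the example. It shows one shared encoder being reused at every node of a tree, and checks that a left-branching chain gives the same result as an RNN-style loop.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # size of every word vector and every node representation (assumed)

# One set of parameters, shared by every node of every tree (untrained here).
W_enc = rng.normal(scale=0.1, size=(d, 2 * d))
b_enc = np.zeros(d)
W_dec = rng.normal(scale=0.1, size=(2 * d, d))
b_dec = np.zeros(2 * d)


def encode(left, right):
    """Map two d-dimensional children to one d-dimensional parent."""
    return np.tanh(W_enc @ np.concatenate([left, right]) + b_enc)


def reconstruction_error(left, right):
    """Squared error of decoding the parent back into its two children."""
    children = np.concatenate([left, right])
    reconstruction = W_dec @ encode(left, right) + b_dec
    return float(np.sum((reconstruction - children) ** 2))


def encode_tree(tree, embeddings):
    """A tree is either a word (str) or a pair (left_subtree, right_subtree)."""
    if isinstance(tree, str):
        return embeddings[tree]
    left, right = tree
    return encode(encode_tree(left, embeddings), encode_tree(right, embeddings))


def rnn_like(sequence, embeddings):
    """Chain-structured case: fold the words in one at a time, like an RNN."""
    h = embeddings[sequence[0]]
    for word in sequence[1:]:
        h = encode(h, embeddings[word])
    return h


words = ["the", "cat", "sat", "down"]
embeddings = {w: rng.normal(size=d) for w in words}

balanced = (("the", "cat"), ("sat", "down"))   # a general tree
chain = ((("the", "cat"), "sat"), "down")      # a simple chain

rep_balanced = encode_tree(balanced, embeddings)
rep_chain = encode_tree(chain, embeddings)
rep_rnn = rnn_like(words, embeddings)
pair_error = reconstruction_error(embeddings["the"], embeddings["cat"])
```

Training would minimize the reconstruction error summed over the nodes of the tree; that step is left out here to keep the sketch short.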

Q2: Noise contrastive estimation is a method for estimating the parameters of an energy-based model without having to compute the gradient of the log-likelihood, which involves the partition function. It draws negative examples from a noise distribution that is much flatter than the true distribution, and trains a classifier to discriminate between positive examples (the data) and negative examples (the noise samples). The parameters of the model are learned as a by-product of this classification task.
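As a toy illustration of this idea, the sketch below fits an unnormalized 1-D Gaussian model with noise contrastive estimation. Everything specific here is an assumption chosen for the example: the data distribution (mean 1, unit variance), the wide Gaussian noise distribution, the model form `log_model` with a learned log-normalizer `c`, the equal number of data and noise samples, and the plain gradient ascent settings. The classifier's logit is the difference between the model's log-density and the noise log-density.

```python
import numpy as np

rng = np.random.default_rng(0)

# Positive examples: samples from the (normally unknown) data distribution.
x_data = rng.normal(loc=1.0, scale=1.0, size=5000)

# Negative examples: samples from a much flatter, known noise distribution.
noise_scale = 3.0
x_noise = rng.normal(loc=0.0, scale=noise_scale, size=5000)


def log_noise(x):
    """Exact log-density of the noise distribution N(0, noise_scale^2)."""
    return -0.5 * (x / noise_scale) ** 2 - np.log(noise_scale * np.sqrt(2 * np.pi))


def log_model(x, mu, c):
    """Unnormalized model: negative energy plus a learned log-normalizer c."""
    return -0.5 * (x - mu) ** 2 + c


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


mu, c = 0.0, 0.0
learning_rate = 0.1
for _ in range(2000):
    # Probability that each point is "data" according to the classifier.
    p_data = sigmoid(log_model(x_data, mu, c) - log_noise(x_data))
    p_noise = sigmoid(log_model(x_noise, mu, c) - log_noise(x_noise))

    # Gradient of the classification log-likelihood w.r.t. mu and c.
    grad_mu = np.mean((1 - p_data) * (x_data - mu)) - np.mean(p_noise * (x_noise - mu))
    grad_c = np.mean(1 - p_data) - np.mean(p_noise)

    mu += learning_rate * grad_mu
    c += learning_rate * grad_c

# For this model the true normalizer is -0.5 * log(2 * pi).
true_c = -0.5 * np.log(2 * np.pi)
```

With enough samples, `mu` should approach the data mean and `c` should approach `true_c`, even though the partition function is never computed explicitly: the normalizer is treated as just another parameter of the classifier.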