# Infinite Sigmoid belief nets and AEs vs PCA

What is a complementary prior, and how does it cancel out the explaining-away effect?
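My understanding, paraphrasing Hinton, Osindero and Teh's DBN paper: in a directed layer, the likelihood term couples the hidden units given the data (that coupling *is* explaining away), so the posterior is non-factorial even when the prior is factorial. A complementary prior is a prior whose own correlations exactly cancel that coupling:

```latex
% Posterior in a directed model: the likelihood p(v | h) couples the h_i;
% explaining away = the anti-correlations it induces between hidden causes.
p(h \mid v) \;\propto\; p(v \mid h)\, p(h)

% A complementary prior is a p(h) chosen so that the product factorizes:
p(h \mid v) \;=\; \prod_i p(h_i \mid v)
```

An infinite sigmoid belief net with tied weights implements such a prior through its higher layers, which is why its posterior can be sampled with a single bottom-up pass, just like an RBM's.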

In the video, Geoffrey Hinton says that CD-1 is probably better than the maximum-likelihood gradient. Why?
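For reference, a minimal sketch of the CD-1 update for a binary RBM (biases and the toy data are my own simplifications, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v0, lr=0.1):
    """One CD-1 step for a binary RBM (biases omitted for brevity)."""
    # Positive phase: hidden activations driven by the data.
    ph0 = sigmoid(v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # One Gibbs step down and back up: the "reconstruction".
    pv1 = sigmoid(h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    # CD-1 estimate of the gradient: <v h>_data - <v h>_recon.
    grad = (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]
    return W + lr * grad

# Toy binary data with one strongly correlated pair of bits.
v = (rng.random((100, 6)) < 0.5).astype(float)
v[:, 1] = v[:, 0]
W = 0.01 * rng.standard_normal((6, 4))
for _ in range(200):
    W = cd1_update(W, v)
```

The only difference from the maximum-likelihood gradient is the negative phase: ML needs samples from the model's equilibrium distribution, while CD-1 stops the Markov chain after a single step, trading bias for far lower variance and cost.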

Why don't people run, for example, CD-1.5 to get the reconstruction, and compute the reconstruction error in order to measure how well the RBM learns the input data?
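The extra half-step is cheap to compute; a sketch of the deterministic (mean-field, no sampling, biases omitted) version, which is my own illustration and, as Hinton cautions, a monitoring heuristic rather than the quantity CD actually optimises:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruction_error(W, v):
    """Mean squared one-step reconstruction error for a binary RBM."""
    h = sigmoid(v @ W)        # half-step up: hidden probabilities
    v_rec = sigmoid(h @ W.T)  # half-step down: visible probabilities
    return float(np.mean((v - v_rec) ** 2))

# An untrained RBM reconstructs binary data at roughly chance level.
rng = np.random.default_rng(1)
v = (rng.random((50, 8)) < 0.5).astype(float)
W = 0.01 * rng.standard_normal((8, 3))
err = reconstruction_error(W, v)
```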

Can we say PCA corresponds to a linear autoencoder with tied weights? Assume an autoencoder with encoder weights W and inputs X, with no biases, where W, trained with SGD, learns the space spanned by the principal components and the encoded representation is h = WX. With tied weights the decoder is W' = Wᵀ, so perfect reconstruction, W'WX = X, requires W'W to act as the identity on the data — exactly the orthonormality constraint PCA places on its components.
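This can be checked numerically. A sketch (toy data and all parameter choices are my own): full-batch gradient descent on a tied-weight linear autoencoder, then a comparison of the learned subspace with PCA via principal angles.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data lying mostly in a 2-D subspace of R^5, centred as PCA assumes.
Z = rng.standard_normal((500, 2))
A = rng.standard_normal((2, 5))
X = Z @ A + 0.01 * rng.standard_normal((500, 5))
X -= X.mean(axis=0)
X /= X.std()

# Tied-weight linear autoencoder: code h = X W^T, reconstruction h W.
W = 0.1 * rng.standard_normal((2, 5))
lr = 0.01
for _ in range(3000):
    H = X @ W.T            # encode
    E = H @ W - X          # decode with the transposed weights
    # Gradient of ||X W^T W - X||^2 w.r.t. W (up to a constant factor):
    # decoder path H^T E plus encoder path W E^T X.
    grad = (H.T @ E + W @ E.T @ X) / X.shape[0]
    W -= lr * grad

# Compare the row space of W with the top-2 principal subspace of X.
V2 = np.linalg.svd(X, full_matrices=False)[2][:2]  # top-2 principal directions
Q = np.linalg.qr(W.T)[0]                           # orthonormal basis for W's rows
overlap = np.linalg.svd(V2 @ Q)[1]                 # cosines of principal angles
```

If the argument above is right, both cosines should be close to 1: the autoencoder recovers the principal *subspace*, though W itself need not equal the principal components — any rotation of them within that subspace reconstructs equally well, which is the usual caveat to "autoencoder = PCA".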

