Infinite Sigmoid belief nets and AEs vs PCA

What is a complementary prior, and how does it cancel out the explaining-away effect?

In the video, Geoffrey Hinton says that CD-1 is probably better than the maximum-likelihood gradient. Why?
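For context, here is a minimal sketch of a single CD-1 weight update for a Bernoulli RBM (all sizes and values are illustrative, not from the lecture). The point of contrast: the true maximum-likelihood gradient would need samples from the model's equilibrium distribution (a long Markov chain), while CD-1 substitutes a one-step reconstruction.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Illustrative Bernoulli RBM: 6 visible units, 4 hidden units, small random weights
W = rng.normal(scale=0.1, size=(6, 4))
v0 = rng.integers(0, 2, size=6).astype(float)   # one binary data vector

# Positive phase: hidden probabilities driven by the data
p_h0 = sigmoid(v0 @ W)
h0 = (rng.random(4) < p_h0).astype(float)       # sampled hidden states

# Negative phase, CD-1: a single reconstruction step instead of
# running the chain to equilibrium (which exact ML would require)
p_v1 = sigmoid(h0 @ W.T)
p_h1 = sigmoid(p_v1 @ W)

lr = 0.1
dW = np.outer(v0, p_h0) - np.outer(p_v1, p_h1)  # <v h>_data - <v h>_recon
W += lr * dW
```

Because the negative statistics come from one step near the data rather than from the model's own distribution, CD-1 is a biased estimate of the ML gradient, which is part of what the question above is asking about.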

Why don’t people do, for example, CD-1.5 to get the reconstruction and compute the reconstruction error, in order to measure how well the RBM learns the input data?
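As a reference point for the question, computing a reconstruction error from one CD-style pass is straightforward; the sketch below (toy Bernoulli RBM with made-up random weights, not anyone's trained model) does a data → hidden → reconstruction round trip and measures the squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy Bernoulli RBM parameters (random, purely illustrative)
n_visible, n_hidden = 6, 4
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases

def cd1_reconstruction_error(v0):
    """One pass: v0 -> sampled h0 -> mean-field reconstruction v1."""
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(n_hidden) < p_h0).astype(float)  # stochastic hidden states
    v1 = sigmoid(h0 @ W.T + b_v)                      # reconstruction probabilities
    return np.mean((v0 - v1) ** 2)                    # mean squared reconstruction error

v = rng.integers(0, 2, size=n_visible).astype(float)
err = cd1_reconstruction_error(v)
```

The catch, and likely the answer to the question, is that low reconstruction error is not the RBM's training objective: the model maximizes likelihood, and a chain started at the data can reconstruct well even when the model assigns high probability to very different configurations elsewhere, so reconstruction error is at best a rough progress monitor.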

Can we say PCA corresponds to a linear autoencoder with tied weights? Assume an autoencoder with encoder weights W and inputs X (no biases), where W tries to learn, via SGD, the space spanned by the principal components, and the encoded representation is h = WX. If we use tied weights, the decoder is W' = W^T, subject to the constraint W'W = I, so with perfect reconstruction W'WX equals X.
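The correspondence can be checked numerically. The sketch below (toy data, my own construction) takes the top-k principal directions from an SVD as the tied encoder W and shows that the W^T W X reconstruction recovers the data almost exactly when the variance lies in a k-dimensional subspace; a linear tied-weight autoencoder trained by SGD would converge to the same subspace, up to a rotation of the components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy centred data: 100 samples in 5 dimensions, variance concentrated
# in a 2-dimensional subspace plus a little noise
X = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(100, 5))
X -= X.mean(axis=0)

# PCA via SVD: rows of Vt are the principal directions
U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
W = Vt[:k]            # encoder weights, shape (k, 5)
H = X @ W.T           # encoded representation h = W x
X_rec = H @ W         # tied-weight decode: x_hat = W^T h

rel_err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
```

One caveat on the constraint: with k < input dimension it is W W^T = I_k that holds (orthonormal rows), while W^T W is a rank-k projection, so W'WX = X exactly only for data lying in the span of the components.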

