### Denoising-CD and other

Q1 – Based on the similarity between CD1 and the autoencoder training objective and the relative performance between the vanilla AE and the DAE, would it make sense to think of a “denoising-CD” where you start your Markov Chains for the gradient’s negative phase from noisy versions of training data?

Q2 – On April 11th, you briefly mentioned a sampling procedure loosely inspired by FPCD. Can you explain in more detail how it works?

#### 1 Response to “Denoising-CD and other”

1. Xavier Bouthillier, May 5, 2013 at 23:45

Q1 – DAEs are, in effect, trained to do burn-in: they receive noisy inputs and output a denoised version, so the burn-in happens in a single step. Starting the CD chains from noisy inputs could reduce the burn-in needed, but further experiments would be required to confirm this.
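To make the "denoising-CD" idea concrete, here is a minimal sketch, assuming a binary RBM and bit-flip noise. The function name `cd1_gradient`, the parameter `flip_prob`, and the choice of corruption are all assumptions made for illustration; nothing here has been validated experimentally. The only change from plain CD-1 is that the negative chain starts from a corrupted copy of the data, while the positive phase still uses the clean data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_gradient(v_data, W, b, c, rng, flip_prob=0.0):
    """CD-1 estimate of the log-likelihood gradient for a binary RBM.

    v_data: (n, n_visible) binary array; W: (n_visible, n_hidden);
    b: visible biases; c: hidden biases.
    flip_prob is a hypothetical knob: each visible bit of the chain's
    starting point is flipped with this probability (0.0 gives plain CD-1).
    """
    # Positive phase: hidden activations given the clean data.
    h_pos = sigmoid(v_data @ W + c)

    # Negative phase starts from a noisy copy of the data.
    flips = rng.random(v_data.shape) < flip_prob
    v_start = np.where(flips, 1.0 - v_data, v_data)

    # One full Gibbs step: sample h given v_start, then v given h.
    h_sample = (rng.random(h_pos.shape) < sigmoid(v_start @ W + c)).astype(float)
    v_neg = (rng.random(v_data.shape) < sigmoid(h_sample @ W.T + b)).astype(float)
    h_neg = sigmoid(v_neg @ W + c)

    # Difference of positive and negative statistics, averaged over the batch.
    n = v_data.shape[0]
    dW = (v_data.T @ h_pos - v_neg.T @ h_neg) / n
    db = (v_data - v_neg).mean(axis=0)
    dc = (h_pos - h_neg).mean(axis=0)
    return dW, db, dc
```

Comparing runs with `flip_prob=0.0` and a small positive value would be one way to run the experiment suggested above.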

Q2 – In FPCD, the slow weights $w_0$ and the fast weights $w_F$ are updated as follows:

$w_0 \leftarrow w_0 - \epsilon_0 \hat{g}$
$w_F \leftarrow \alpha w_F - \epsilon_F \hat{g}$
$w = w_0 + w_F$
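The three updates above can be sketched in a few lines of NumPy. This is only an illustration: the function name `fpcd_step` and the default values for `eps0`, `epsF` and `alpha` are assumptions, and `g_hat` is taken to be a gradient estimate with the same sign convention as in the equations.

```python
import numpy as np

def fpcd_step(w0, wF, g_hat, eps0=0.01, epsF=0.05, alpha=0.95):
    """One FPCD parameter update, following the equations above.

    w0: slow weights, wF: fast weights, g_hat: gradient estimate.
    eps0, epsF, alpha are illustrative values, not recommended settings.
    """
    w0 = w0 - eps0 * g_hat          # slow weights: ordinary gradient step
    wF = alpha * wF - epsF * g_hat  # fast weights: decayed, then stepped
    w = w0 + wF                     # effective weights w = w0 + wF
    return w0, wF, w
```

Because $\alpha < 1$, the fast weights decay back toward zero when the gradient vanishes, so $w$ returns to $w_0$ over time.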

In the newer version by Breuleux et al. (http://www.iro.umontreal.ca/~lisa/pointeurs/breuleux+bengio_nc2011.pdf), $w_0$ is dropped and $w$ is updated using only $w_F$. This helps the chain move far away from recently seen examples.