**Q1:**

After training a Deep Belief Network using the approximate posteriors, you can use the weights of this network to initialize a multi-layer neural network. Does this work regardless of the type of neurons that you use in your MLN?

**Q2:**

What are the pros and cons of using mean-field computation rather than stochastic sampling when training a DBN?

Q1: No. If you are using the DBN to initialize the weights of an MLP, you must use the same non-linearities in both models; there is no reason to expect the MLP to behave well with different non-linearities. As a side note, if for some reason you are using sigmoid units in the DBN and $\tanh$ units in the MLP, you can use the identity $\tanh(x) = 2\,\mathrm{sigmoid}(2x) - 1$ to convert the weights. However, MLPs are not invariant to such re-parameterizations, so the results you get in the end will differ from those you would get with sigmoid units. One likely reason is that the biases of the $\tanh$ units become larger and take longer to learn, so the optimization dynamics, and hence the final model, will be different.
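To make the conversion concrete, here is a small NumPy sketch (with hypothetical layer sizes) showing how a one-hidden-layer sigmoid network can be re-parameterized into a $\tanh$ network that computes the same function, using $\mathrm{sigmoid}(z) = (\tanh(z/2) + 1)/2$: halve the incoming weights and biases of the hidden layer, then fold the residual affine map $(h + 1)/2$ into the next layer's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A random one-hidden-layer sigmoid network (hypothetical sizes).
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def sigmoid_net(x):
    h = sigmoid(W1 @ x + b1)
    return W2 @ h + b2

# Re-parameterize via sigmoid(z) = (tanh(z/2) + 1) / 2:
# halve the first layer's weights/biases, then absorb the
# affine map (h + 1)/2 into the second layer's parameters.
W1t, b1t = W1 / 2, b1 / 2
W2t, b2t = W2 / 2, b2 + 0.5 * W2.sum(axis=1)

def tanh_net(x):
    h = np.tanh(W1t @ x + b1t)
    return W2t @ h + b2t

x = rng.normal(size=3)
assert np.allclose(sigmoid_net(x), tanh_net(x))
```

The two networks are functionally identical at initialization, but as the answer notes, gradient descent is not invariant to this re-parameterization, so the two training runs will diverge.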

Q2: The first thing to note is that there is no known training algorithm for Deep Belief Nets that provably maximizes the likelihood; in particular, wake-sleep cannot be shown to do so. What we do know empirically is that pre-training an MLP with a DBN yields better performance. In any case, training a DBN amounts to training a stack of Restricted Boltzmann Machines, and that training is stochastic. Using mean-field computation, i.e. propagating the hidden units' activation probabilities instead of sampled binary states, makes training deterministic, as in an MLP.
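The difference can be sketched in a one-step contrastive-divergence (CD-1) update for a tiny RBM, assuming hypothetical layer sizes: the only change between the two variants is whether the up pass samples binary hidden states or keeps the real-valued posterior probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny RBM (hypothetical sizes).
n_vis, n_hid = 6, 4
W = 0.01 * rng.normal(size=(n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

def cd1_update(v0, mean_field, lr=0.1):
    """One CD-1 gradient estimate for the weights W."""
    # Up pass: posterior probabilities of the hidden units.
    p_h0 = sigmoid(v0 @ W + b_h)
    # Mean-field keeps the probabilities; otherwise sample binary states.
    h0 = p_h0 if mean_field else (rng.random(p_h0.shape) < p_h0).astype(float)
    # Down pass (reconstruction) and second up pass.
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Positive phase minus negative phase.
    return lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))

v = (rng.random(n_vis) < 0.5).astype(float)
dW_mf = cd1_update(v, mean_field=True)   # deterministic given v
dW_st = cd1_update(v, mean_field=False)  # varies with the sampled h0
```

Calling the mean-field variant twice on the same input gives identical updates, which is exactly the "deterministic, like an MLP" behaviour described above; the stochastic variant gives a different update on each call.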