  1. 1 Sina Honari May 5, 2013 at 14:20

    Considering question 4 in finalH11.pdf,

    1) What difference does it make whether the input is binary or real-valued?

    2) Does using tanh (instead of sigm) for encoding avoid learning the identity function for binary input (since one saturated output is -1 instead of 0)?
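
    To make the saturation values mentioned in 2) concrete, here is a minimal sketch that evaluates both nonlinearities at large-magnitude pre-activations. The helper name `sigm` and the sample values of `a` are assumptions chosen for illustration. It only shows where each function saturates and does not by itself settle the question.

    ```python
    import math

    def sigm(a):
        """Logistic sigmoid, 1 / (1 + exp(-a))."""
        return 1.0 / (1.0 + math.exp(-a))

    # Large-magnitude pre-activations push both nonlinearities into saturation:
    # sigm approaches 0 and 1, while tanh approaches -1 and +1.
    for a in (-20.0, 0.0, 20.0):
        print(f"a={a:+.0f}  sigm={sigm(a):.4f}  tanh={math.tanh(a):+.4f}")
    ```

    With inputs in {0, 1}, a saturated sigm unit can match both input values, whereas the lower saturation value of tanh (-1) lies outside that range.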

  2. 3 Sina Honari May 4, 2013 at 23:20

    I would like to know the answer to question 4 of finalH12en.pdf, especially for cases (b) and (c). Here are my answers, using the same format (1-with, 1-w/o, 2-with, 2-w/o):

    a) 1-w/o: decrease, 2-w/o: u-shaped curve, 1-with: decrease, 2-with: u-shaped (I expect the test error to start increasing much later than in the previous case, only after a very large number of iterations)

    b) 1-w/o: increase, 2-w/o: decrease, 1-with: increase, 2-with: decrease

    c) 1-w/o: no change (I expect the auto-encoder to reconstruct the same corrupted input without applying the denoising criterion), 2-w/o: increase, 1-with: increase, 2-with: u-shaped (I expect the test error to decrease as the corruption level increases and then to increase once it passes some threshold, namely when the corruption level is so high that a corrupted input may land near other training examples)

    d) 1-w/o: decrease, 2-w/o: decrease (I expect the hidden units to learn the identity matrix, which gives low error on both training and test data)
    1-with: decrease, 2-with: u-shaped (I expect this case to avoid learning the identity matrix, yet to overfit once the number of hidden units grows past some threshold)

    • 4 Yoshua Bengio May 4, 2013 at 23:37

      a) correct except 2-w/o: decrease
      b) correct except 2-w/o: increase, 2-with: U-shaped
      c) I believe they will all increase
      d) correct (all the same as (a))
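
      For readers following case (c), here is a minimal sketch of what a "corruption level" can mean. It assumes masking noise, where each input component is zeroed independently with probability `level`. The exam question may use a different noise process, and the names `corrupt`, `level` and `x_tilde` are hypothetical. Under the denoising criterion the auto-encoder receives the corrupted input and is trained to reconstruct the clean one.

      ```python
      import random

      def corrupt(x, level, rng):
          """Masking noise (an assumed noise type): zero each component
          independently with probability `level`."""
          return [0.0 if rng.random() < level else v for v in x]

      rng = random.Random(0)
      x = [1.0] * 10000  # clean input; this is the reconstruction target

      for level in (0.0, 0.25, 0.5):
          x_tilde = corrupt(x, level, rng)  # what the encoder would see
          zeroed = sum(1 for v in x_tilde if v == 0.0) / len(x)
          print(f"level={level:.2f}  fraction zeroed={zeroed:.3f}")
      ```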

  3. 5 Yoshua Bengio May 3, 2013 at 11:50

    To clarify: the exam is open-book, but no laptop.

  4. 6 Yoshua Bengio April 23, 2013 at 17:37

