Alex’s CNN structure


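In Alex Krizhevsky’s CNN, the first and third layers are fully connected convolutional layers, meaning their kernels see all of the input feature maps rather than only those held on one GPU. Why the first and third? What if we chose other layers instead, for example the second and fourth?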
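
To make the question concrete, here is a minimal sketch that counts the weights of a convolutional layer under the two connectivity choices. It assumes that "fully connected" means each kernel sees the feature maps on both GPUs, and that the alternative is each kernel seeing only the maps on its own GPU. The function name `conv_weight_count` and the `groups` parameter are illustrative names, not taken from the original code; the layer sizes are those given in the published description of the network.

```python
def conv_weight_count(kernel_size, in_maps, out_maps, groups=1):
    """Number of weights (biases excluded) in a convolutional layer.

    groups=1: every kernel sees all input maps (maps from both GPUs).
    groups=2: each kernel sees only the half of the input maps on its own GPU.
    """
    if in_maps % groups or out_maps % groups:
        raise ValueError("in_maps and out_maps must be divisible by groups")
    return kernel_size * kernel_size * (in_maps // groups) * out_maps


# (kernel size, input maps, output maps) for the third and fourth
# convolutional layers, as given in the published description of the network.
LAYERS = {
    "conv3": (3, 256, 384),
    "conv4": (3, 384, 384),
}

for name, (k, n_in, n_out) in LAYERS.items():
    full = conv_weight_count(k, n_in, n_out, groups=1)
    split = conv_weight_count(k, n_in, n_out, groups=2)
    print(f"{name}: cross-GPU {full:,} weights, same-GPU only {split:,} weights")
```

Letting a layer see both GPUs doubles its weight count (884,736 instead of 442,368 for the third layer) and also forces the GPUs to exchange feature maps at that layer, so each cross-GPU layer costs both memory and communication.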


Why use a network with more capacity than the training set can constrain, given that it overfits easily? What happens if we use a slightly smaller network?




1 Response to “Alex’s CNN structure”

  1. 1 Sina Honari March 11, 2013 at 15:28

    Q1: Most probably this configuration gave him the best results. He was also trying to fit the biggest model he could on his machine while keeping the connections between the two GPUs to a minimum.

    Q2: Using a bigger network can give the model better generalization power, at the cost of added capacity, which makes overfitting easier. However, the side effects can be curbed with appropriate regularization techniques such as dropout, early stopping, and weight decay.
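
The three regularization techniques named in the reply can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code used for the network: the function names are hypothetical, the dropout shown is the "inverted" variant (survivors are rescaled at training time), and early stopping is reduced to a patience rule on a list of validation losses.

```python
import random


def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each unit with probability drop_prob and
    rescale the survivors so the expected activation is unchanged."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def sgd_step(weights, grads, lr, weight_decay):
    """One SGD step with L2 weight decay: the decay term pulls every
    weight toward zero in proportion to its current value."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


def early_stopping(val_losses, patience):
    """Return (best_epoch, stop_epoch) for a non-empty list of validation
    losses: stop once `patience` epochs pass without a new best loss."""
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


rng = random.Random(0)  # fixed seed so the dropout mask is repeatable
hidden = dropout([1.0] * 8, drop_prob=0.5, rng=rng)
weights = sgd_step([1.0, -2.0], [0.0, 0.0], lr=0.1, weight_decay=0.5)
best, stop = early_stopping([1.0, 0.8, 0.7, 0.75, 0.72, 0.9, 0.6], patience=2)
```

In the example, validation loss stops improving after epoch 2, so with a patience of 2 training halts at epoch 4 and the weights from epoch 2 would be kept. The later drop to 0.6 is never seen, which is the usual trade-off when choosing a patience value.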
