CNN and ‘argmax pooling’

In his videos, Hinton argues that convolutional neural networks are doomed because pooling loses the precise spatial relationships between high-level parts (such as the nose and mouth of a face).

Could you lessen this issue by using a sort of “argmax-and-max pooling”, where you propagate both the max response from the pooling region and the location of that max response relative to the boundaries of the pooling region?
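To make the idea concrete, here is a minimal sketch of what I have in mind, in plain Python. The function name `argmax_max_pool`, the non-overlapping square regions, and the choice to measure the offset from the region's top-left corner are all my own assumptions for illustration, not an established layer from any library.

```python
def argmax_max_pool(feature_map, size=2):
    """Non-overlapping max pooling that also returns the argmax offset.

    feature_map: list of equal-length rows (a 2-D grid of numbers).
    Returns (values, offsets): values[i][j] is the max of region (i, j),
    and offsets[i][j] is (dr, dc), the position of that max measured
    from the region's top-left corner (an assumed convention).
    """
    rows, cols = len(feature_map), len(feature_map[0])
    values, offsets = [], []
    # Step over the grid one pooling region at a time; leftover rows or
    # columns that do not fill a whole region are ignored.
    for r0 in range(0, rows - size + 1, size):
        value_row, offset_row = [], []
        for c0 in range(0, cols - size + 1, size):
            best, best_pos = feature_map[r0][c0], (0, 0)
            for dr in range(size):
                for dc in range(size):
                    v = feature_map[r0 + dr][c0 + dc]
                    # Strict comparison: on ties, the first position in
                    # row-major order is kept.
                    if v > best:
                        best, best_pos = v, (dr, dc)
            value_row.append(best)
            offset_row.append(best_pos)
        values.append(value_row)
        offsets.append(offset_row)
    return values, offsets


fmap = [
    [0.1, 0.9, 0.2, 0.0],
    [0.3, 0.4, 0.0, 0.7],
    [0.5, 0.0, 0.6, 0.1],
    [0.2, 0.1, 0.0, 0.3],
]
values, offsets = argmax_max_pool(fmap)
print(values)   # [[0.9, 0.7], [0.5, 0.6]]
print(offsets)  # [[(0, 1), (1, 1)], [(0, 0), (0, 0)]]
```

The next layer would then receive the offsets alongside the usual pooled values, so in principle it could still tell where within each region the strongest response occurred.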

1 Response to “CNN and ‘argmax pooling’”


  1. 1 Xavier Bouthillier April 25, 2013 at 15:36

    Ranzato et al. (http://www.cs.nyu.edu/~ylan/files/publi/ranzato-cvpr-07.pdf) have tried something similar. They build a convolutional auto-encoder that tries to reconstruct the image from the pooled feature maps, given the positions of the maxima selected by max pooling. See Figure 2 of the paper.

