Weights and Capacity

Some techniques for preventing overfitting assume that models with smaller weights have less capacity. Hinton argues for this by showing that small weights keep sigmoid units in the range where they behave similarly to linear units.

Q1: Is this true for all non-linearities?

Q2: How else can we show that small weights lower capacity?


1 Response to “Weights and Capacity”

  1. Geoffroy MOURET, March 12, 2013 at 14:25

    Q1: It works for units that have a linear region around zero. In other words, functions whose higher-order derivatives near zero are small compared to the first derivative, so that the Taylor series of the function can be truncated after its linear term. Examples of such units are sigmoid and tanh. Examples of units that are not linear around zero: rectified linear units (ReLU) and softplus.
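
As a rough numerical check of this point, the sketch below measures how far each activation strays from its tangent line at zero when the pre-activation stays within [-scale, scale]. The helper name `linearization_error` is made up for illustration, and the slope of 0.5 used for ReLU at zero is an assumption (ReLU has no derivative there, so any value between 0 and 1 could be argued for).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softplus(z):
    return np.log1p(np.exp(z))

def relu(z):
    return np.maximum(z, 0.0)

def linearization_error(f, slope_at_zero, scale, n=1001):
    """Largest gap between f and its tangent line at 0 over [-scale, scale]."""
    z = np.linspace(-scale, scale, n)
    tangent = f(0.0) + slope_at_zero * z
    return float(np.max(np.abs(f(z) - tangent)))

# (activation, slope at zero); the ReLU slope is an assumed subgradient.
units = {
    "sigmoid": (sigmoid, 0.25),
    "tanh": (np.tanh, 1.0),
    "softplus": (softplus, 0.5),
    "relu": (relu, 0.5),
}

for name, (f, slope) in units.items():
    errors = [linearization_error(f, slope, s) for s in (1.0, 0.1, 0.01)]
    print(name, ["%.2e" % e for e in errors])
```

For sigmoid and tanh the gap shrinks roughly with the cube of the scale, because their second derivative vanishes at zero. Softplus is smooth but has a nonzero second derivative at zero, so its gap shrinks only quadratically. ReLU has a kink at zero, so its gap shrinks only in proportion to the scale: shrinking the weights rescales the unit without making it any more linear.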

    Q2: Smaller weights imply that the volume of the space containing accessible solutions is smaller. With small weights, you need larger variations in your inputs to reach some output values that would have been easier to reach with bigger weights.
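
One way to picture this shrinking set of reachable functions is to draw random one-hidden-layer tanh networks at different weight scales and measure how much of each network's output a straight line fails to explain. This is only a sketch under assumptions that are not in the post: the function name `nonlinear_fraction`, the Gaussian weights, the network size, and the 1-D input grid on [-1, 1] are all illustrative choices.

```python
import numpy as np

def nonlinear_fraction(weight_scale, n_hidden=20, n_points=200, seed=0):
    """Fraction of the output variance that a best-fit line cannot explain."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_points).reshape(-1, 1)

    # Random one-hidden-layer tanh network; every weight is scaled the same way.
    w1 = weight_scale * rng.standard_normal((1, n_hidden))
    b1 = weight_scale * rng.standard_normal(n_hidden)
    w2 = weight_scale * rng.standard_normal((n_hidden, 1))
    y = np.tanh(x @ w1 + b1) @ w2

    # Least-squares fit of y with a line a*x + c.
    design = np.hstack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return float(np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2))

for scale in (0.01, 0.1, 1.0, 3.0):
    print(scale, nonlinear_fraction(scale))
```

With small weights the network is nearly indistinguishable from a linear model on this input range, so the functions it can represent sit close to the much smaller family of linear functions. Larger weights push the hidden units into their curved and saturated regions and let the network leave that family.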
