What are the differences between weight penalties and weight constraints? What are the advantages and drawbacks?

Course material for graduate class ift6266h13

Tags: weight constraints, Weight penalties, Xavier Bouthillier

When using a weight penalty, you include a new term representing the penalty in the cost function you optimize. The cost to optimize thus becomes: $Cost + \lambda ||\Theta||^2$.
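To make the penalized cost concrete, here is a minimal NumPy sketch. The function names `penalized_cost` and `penalized_gradient` are mine, not from the course material, and the sketch assumes the gradient of the original cost is already available as an array.

```python
import numpy as np

def penalized_cost(cost, theta, lam):
    """Return Cost + lam * ||theta||^2 for a scalar cost value."""
    return cost + lam * np.sum(theta ** 2)

def penalized_gradient(grad_cost, theta, lam):
    """Gradient of the penalized cost: the penalty term adds 2 * lam * theta."""
    return grad_cost + 2.0 * lam * theta

# Hypothetical values, chosen only for illustration.
theta = np.array([1.0, -2.0])
print(penalized_cost(0.5, theta, lam=0.1))
print(penalized_gradient(np.zeros(2), theta, lam=0.1))
```

The extra gradient term $2 \lambda \Theta$ is nonzero whenever the weights are nonzero, so the penalty acts on every update.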

When using a weight constraint, you optimize the same cost as before, unchanged, subject to the constraint $||\Theta||^2 < C$.
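The text above does not say how the constraint is enforced. One common way, assumed here, is to take a normal gradient step and then rescale the weights whenever they leave the allowed region. The names `project_to_ball` and `constrained_step` are hypothetical, and the rescaling lands on the boundary $||\Theta||^2 = C$ rather than strictly inside it.

```python
import numpy as np

def project_to_ball(theta, C):
    """Rescale theta so that ||theta||^2 <= C; leave it unchanged if already inside."""
    sq_norm = np.sum(theta ** 2)
    if sq_norm <= C:
        return theta
    return theta * np.sqrt(C / sq_norm)

def constrained_step(theta, grad_cost, lr, C):
    """One gradient step on the original cost, followed by the rescaling."""
    return project_to_ball(theta - lr * grad_cost, C)

# Hypothetical values: the first vector is outside the ball, the second is inside.
print(project_to_ball(np.array([3.0, 4.0]), C=1.0))
print(project_to_ball(np.array([0.1, 0.2]), C=1.0))
```

Note that the gradient itself is never modified: the cost stays the same, and only the result of the update is corrected.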

Weight penalties and weight constraints are somewhat equivalent: for every value of $\lambda$ used with the penalty, you can find a value of $C$ for the constraint for which the optimal solution is the same. They differ, however, in how they act during training. A weight penalty starts pushing down on the weight norms as soon as training starts, while a weight constraint has no effect on the weights as long as $||\Theta||^2$ stays below $C$. This gives the weights a sort of 'grace' period during which they can evolve more freely.
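Both points can be checked on a toy problem. The sketch below assumes a simple convex cost, $\frac{1}{2}||\Theta - t||^2$ with a made-up target $t = (3, 4)$, and enforces the constraint by rescaling after each step (an assumption, as is every name and value in the code). For this particular cost, $\lambda = 0.5$ and $C = 6.25$ happen to give the same optimum.

```python
import numpy as np

target = np.array([3.0, 4.0])   # minimizer of the unregularized toy cost
lam, C, lr = 0.5, 6.25, 0.1     # C picked by hand to match lam for this toy cost

def grad_cost(theta):
    # Gradient of the toy cost 0.5 * ||theta - target||^2.
    return theta - target

def penalty_step(theta):
    # Gradient step on Cost + lam * ||theta||^2.
    return theta - lr * (grad_cost(theta) + 2.0 * lam * theta)

def constraint_step(theta):
    # Plain gradient step, then rescale only if ||theta||^2 exceeds C.
    theta = theta - lr * grad_cost(theta)
    sq_norm = np.sum(theta ** 2)
    return theta if sq_norm <= C else theta * np.sqrt(C / sq_norm)

theta0 = np.array([0.1, 0.1])

# First update: compare each method with an unregularized step.
plain_first = theta0 - lr * grad_cost(theta0)
penalty_first = penalty_step(theta0)
constraint_first = constraint_step(theta0)

# Run both methods long enough to converge on this toy cost.
theta_pen, theta_con = theta0.copy(), theta0.copy()
for _ in range(500):
    theta_pen = penalty_step(theta_pen)
    theta_con = constraint_step(theta_con)

print("first step, plain:     ", plain_first)
print("first step, penalty:   ", penalty_first)
print("first step, constraint:", constraint_first)
print("final, penalty:        ", theta_pen)
print("final, constraint:     ", theta_con)
```

On the first step the constrained update matches the unregularized one, because the weights are still well inside the allowed region, while the penalized update is already slightly smaller. Both runs nevertheless end at the same point, half of the target. The exact match relies on the toy cost being convex, so it should not be read as a guarantee for a real network.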

Sorry about that. It seems I did not format my latex equations correctly.

Weight penalty: $Cost + \lambda ||\Theta||^2$.

Weight constraint: $||\Theta||^2 < C$.