Analogy of K-means to Sparsity Constraint

The usual output of K-means is a one-hot encoding because of the winner-take-all assignment. Keep that hard assignment for the learning rule, but let the output be a softmax over the negative distances from the input to each centroid (or something similar). The output will probably still be very sparse, and probably even more so in high-dimensional space because of the curse of dimensionality (right?).
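To make this concrete, here is a minimal sketch (my own illustration, with a hypothetical `soft_assignments` helper and NumPy assumed) of replacing the winner-take-all output with a softmax over negative distances:

```python
import numpy as np

def soft_assignments(x, centroids, temperature=1.0):
    """Soft assignment of input x over centroids.

    x:           (d,) input vector
    centroids:   (k, d) matrix of centroids
    temperature: lower values push the output closer to one-hot
    """
    # Squared Euclidean distance from x to every centroid.
    dists = np.sum((centroids - x) ** 2, axis=1)
    # Softmax of the negative distances (numerically stabilized).
    logits = -dists / temperature
    logits -= logits.max()
    weights = np.exp(logits)
    return weights / weights.sum()

# Toy example: an input close to one centroid yields an almost one-hot output.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(10, 100))
x = centroids[3] + 0.1 * rng.normal(size=100)
print(soft_assignments(x, centroids).round(3))
```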

Now consider a sparsity constraint on a network's hidden representations: those representations will also end up sparse, and both cases rest on the same kind of underlying linear/affine transformation in the feedforward pass. Isn't the sparsity constraint imposing the same kind of clustering of inputs onto centroids as K-means, making it a local generalization in which different regions of the input space are essentially associated with their own private set of parameters?
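For comparison, here is a hedged sketch of the other side of the analogy: a single affine hidden layer trained with a sparsity penalty (the L1 penalty, ReLU, and function names are my own illustrative choices, not from the post). The rows of the encoder weight matrix W play a role loosely analogous to centroids.

```python
import numpy as np

def hidden_code(x, W, b):
    """Affine transform followed by a ReLU (the nonlinearity is an illustrative choice)."""
    return np.maximum(0.0, W @ x + b)

def sparse_autoencoder_loss(x, W, b, W_dec, b_dec, sparsity_weight=0.1):
    """Reconstruction error plus an L1 penalty that pushes the hidden code to be sparse."""
    h = hidden_code(x, W, b)        # sparse, assignment-like code
    x_hat = W_dec @ h + b_dec       # reconstruction from the code
    reconstruction = np.sum((x - x_hat) ** 2)
    sparsity = sparsity_weight * np.sum(np.abs(h))
    return reconstruction + sparsity

# Tiny usage example: each row of W is the "centroid-like" parameter vector a
# hidden unit compares the input against (via a dot product rather than a
# distance, which is one place the analogy is loose).
rng = np.random.default_rng(0)
d, k = 20, 5
W, b = rng.normal(size=(k, d)), np.zeros(k)
W_dec, b_dec = rng.normal(size=(d, k)), np.zeros(d)
x = rng.normal(size=d)
print(sparse_autoencoder_loss(x, W, b, W_dec, b_dec))
```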

How does the backpropagated error affect these centroid-like parameters in a network trained with a sparsity constraint?