A-priori knowledge

In the first part of his course, Geoffrey Hinton presents the family tree problem using this representation:

For the 24 people involved, the local encoding uses a sparse 24-dimensional vector in which all components are zero except one. E.g. Colin ≡ (1,0,0,0,0,…,0), Charlotte ≡ (0,0,1,0,0,…,0), Victoria ≡ (0,0,0,0,1,…,0), and so on.

This is used rather than 5-dimensional vectors that encode each person's index as a binary number:

Colin ≡ (0,0,0,0,1), Charlotte ≡ (0,0,0,1,1), Victoria ≡ (0,0,1,0,1), etc.
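The two encodings can be sketched in a few lines of NumPy. The mapping from names to indices is arbitrary here (the post does not specify it for the one-hot case); the binary indices 1, 3, and 5 are read off the example vectors above.

```python
import numpy as np

def one_hot(index, n=24):
    """Local (one-hot) encoding: a 24-d vector with a single 1."""
    v = np.zeros(n)
    v[index] = 1.0
    return v

def binary(index, bits=5):
    """Binary encoding: the person's index written in 5 bits
    (2**5 = 32 >= 24, so 5 bits suffice), most significant bit first."""
    return np.array([(index >> i) & 1 for i in reversed(range(bits))],
                    dtype=float)

# One-hot: a single 1, everything else 0.
colin_one_hot = one_hot(0)

# Binary: e.g. index 3 becomes (0, 0, 0, 1, 1), matching Charlotte above.
charlotte_binary = binary(3)
```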

He then says that:

Considering the way this encoding is used, the 24-d encoding asserts no a-priori knowledge about the persons while the 5-d one does.

Why is that?


1 Response to “A-priori knowledge”

  1. Gabriel Bernier-Colborne, February 7, 2013 at 16:19

    The binary encoding above suggests some similarity between Colin, Charlotte, and Victoria, as they all have the same value for the last bit. A one-hot representation asserts no such similarity, and all possible values represent different corners of a hypercube. As a side-note, increasing the dimensionality makes the optimization problem easier, but does not necessarily guarantee better generalization.
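This point can be checked numerically: the binary codes of the three example people overlap in the last bit, so their dot products are nonzero, whereas any two distinct one-hot vectors are orthogonal. A small sketch (the indices 1, 3, 5 are inferred from the example vectors in the post, not stated in the course):

```python
import numpy as np

def binary(index, bits=5):
    return np.array([(index >> i) & 1 for i in reversed(range(bits))],
                    dtype=float)

colin, charlotte, victoria = binary(1), binary(3), binary(5)

# All three share a 1 in the last bit, so the binary encoding
# builds in a similarity the data never asserted:
print(colin @ charlotte)   # nonzero overlap

# One-hot vectors are distinct corners of a hypercube and are
# mutually orthogonal, so no pair looks more alike than any other:
a, b = np.eye(24)[0], np.eye(24)[2]
print(a @ b)               # zero overlap
```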
