Nov 1, 2006
-----------
- To handle multiple layers of neurons, we will consider a different kind of
  perceptron:
  - a differentiable perceptron, unlike the threshold perceptron
  - the sigmoid perceptron (a small code sketch follows this list)
- Learning in multi-layer networks: backpropagation
  - how do we assign blame to hidden nodes?
  - use the chain rule of differential calculus (one reason why you need
    differentiable units)
- Adapting the perceptron equations for multi-layer networks
  - the notion of "blame" computed at the output layer gets apportioned into
    blame for the hidden layers
- Derivation of the rules for weight updates (worked equations and a full
  backprop step appear after this list)
- How many hidden layers (of nodes) are needed anyway?
  - a famous theorem says no more than two are required
  - you can sometimes get away with just one!
- How many nodes should there be in the hidden layer?
  - the hidden layer is useful for forming hidden representations
  - weather example: hidden-layer nodes might come to correspond to seasons
- Be careful when reading the neural network literature:
  - some people count "layers of weights", others count "layers of nodes"
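
A minimal Python sketch of the sigmoid unit contrasted with the threshold
unit above; the function names are illustrative, not from the lecture. The
key point is that the sigmoid has a nonzero derivative everywhere, which is
what lets the chain rule propagate blame:

    import math

    def sigmoid(z):
        """Smooth, differentiable squashing function: 1 / (1 + e^-z)."""
        return 1.0 / (1.0 + math.exp(-z))

    def sigmoid_derivative(output):
        """Derivative written in terms of the unit's own output: o * (1 - o).
        This form is what makes blame assignment by the chain rule cheap."""
        return output * (1.0 - output)

    def threshold(z):
        """The threshold perceptron: not differentiable at 0, and its
        derivative is 0 everywhere else, so no gradient can flow through it."""
        return 1.0 if z >= 0.0 else 0.0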
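
The weight-update rules the derivation arrives at, written out in the common
textbook notation (e.g., Mitchell's), assuming squared error and sigmoid
units; these are the standard backprop equations, not copied from the lecture.
Here t_k is the target at output unit k, o is a unit's output, w_{kh} is the
weight from hidden unit h to output unit k, and \eta is the learning rate:

    \delta_k = o_k (1 - o_k)\,(t_k - o_k)
        (blame at output unit k)
    \delta_h = o_h (1 - o_h) \sum_{k \in \mathrm{outputs}} w_{kh}\,\delta_k
        (output-layer blame apportioned to hidden unit h via the chain rule)
    \Delta w_{ji} = \eta\, \delta_j\, x_{ji}
        (update for the weight from node i into node j)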
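
To see the whole pipeline in one place, here is a sketch of a single
backpropagation step on a tiny 2-2-1 sigmoid network, implementing the
equations above. The network shape, learning rate, and the XOR training loop
at the end are all illustrative assumptions, not from the lecture:

    import math, random

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def backprop_step(x, target, w_hidden, w_out, eta=0.5):
        """One gradient step for a 2-input, 2-hidden, 1-output sigmoid net.
        w_hidden holds two weight vectors [w_1, w_2, bias]; w_out likewise."""
        # Forward pass: hidden outputs, then the output unit.
        h = [sigmoid(w[0]*x[0] + w[1]*x[1] + w[2]) for w in w_hidden]
        o = sigmoid(w_out[0]*h[0] + w_out[1]*h[1] + w_out[2])

        # Blame at the output unit: delta_o = o(1-o)(t-o).
        delta_o = o * (1.0 - o) * (target - o)

        # Apportion blame to each hidden unit via the chain rule:
        # delta_h = h(1-h) * w_out * delta_o.
        delta_h = [h[j] * (1.0 - h[j]) * w_out[j] * delta_o for j in range(2)]

        # Weight updates: delta_w = eta * delta * input (bias input is 1).
        for j in range(2):
            w_out[j] += eta * delta_o * h[j]
        w_out[2] += eta * delta_o
        for j in range(2):
            for i in range(2):
                w_hidden[j][i] += eta * delta_h[j] * x[i]
            w_hidden[j][2] += eta * delta_h[j]
        return o

    # Illustration: train toward XOR, which a single perceptron cannot learn.
    # With enough passes this typically drives the outputs toward the targets.
    random.seed(0)
    w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    w_out = [random.uniform(-1, 1) for _ in range(3)]
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    for _ in range(20000):
        for x, t in data:
            backprop_step(x, t, w_hidden, w_out)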