Nov 1, 2006
-----------
- To handle multiple layers of neurons, we will consider a different kind of
  perceptron:
  - a differentiable perceptron, unlike the threshold perceptron
  - the sigmoid perceptron (a small code sketch follows this list)
- Learning in multi-layer networks: backpropagation
  - how do we assign blame to hidden nodes?
  - use the chain rule of differential calculus (one reason why you need
    differentiable units)
- Adapting the perceptron equations for multi-layer networks
  - the notion of "blame" computed at the output layer gets apportioned into
    blame for the hidden layers
- Derivation of the rules for weight updates (worked equations and a full
  backprop step appear after this list)
- How many hidden layers (of nodes) are needed anyway?
  - a famous theorem says no more than two are required
  - you can sometimes get away with just one!
- How many nodes should there be in the hidden layer?
  - the hidden layer is useful for forming hidden representations
  - weather example: hidden-layer nodes might come to correspond to seasons
- Be careful when reading the neural network literature:
  - some people count "layers of weights", others count "layers of nodes"
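
A minimal Python sketch of the sigmoid unit contrasted with the threshold
unit above; the function names are illustrative, not from the lecture. The
key point is that the sigmoid has a nonzero derivative everywhere, which is
what lets the chain rule propagate blame:

    import math

    def sigmoid(z):
        """Smooth, differentiable squashing function: 1 / (1 + e^-z)."""
        return 1.0 / (1.0 + math.exp(-z))

    def sigmoid_derivative(output):
        """Derivative written in terms of the unit's own output: o * (1 - o).
        This form is what makes blame assignment by the chain rule cheap."""
        return output * (1.0 - output)

    def threshold(z):
        """The threshold perceptron: not differentiable at 0, and its
        derivative is 0 everywhere else, so no gradient can flow through it."""
        return 1.0 if z >= 0.0 else 0.0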
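
The weight-update rules the derivation arrives at, written out in the common
textbook notation (e.g., Mitchell's), assuming squared error and sigmoid
units; these are the standard backprop equations, not copied from the lecture.
Here t_k is the target at output unit k, o is a unit's output, w_{kh} is the
weight from hidden unit h to output unit k, and \eta is the learning rate:

    \delta_k = o_k (1 - o_k)\,(t_k - o_k)
        (blame at output unit k)
    \delta_h = o_h (1 - o_h) \sum_{k \in \mathrm{outputs}} w_{kh}\,\delta_k
        (output-layer blame apportioned to hidden unit h via the chain rule)
    \Delta w_{ji} = \eta\, \delta_j\, x_{ji}
        (update for the weight from node i into node j)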
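
To see the whole pipeline in one place, here is a sketch of a single
backpropagation step on a tiny 2-2-1 sigmoid network, implementing the
equations above. The network shape, learning rate, and the XOR training loop
at the end are all illustrative assumptions, not from the lecture:

    import math, random

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def backprop_step(x, target, w_hidden, w_out, eta=0.5):
        """One gradient step for a 2-input, 2-hidden, 1-output sigmoid net.
        w_hidden holds two weight vectors [w_1, w_2, bias]; w_out likewise."""
        # Forward pass: hidden outputs, then the output unit.
        h = [sigmoid(w[0]*x[0] + w[1]*x[1] + w[2]) for w in w_hidden]
        o = sigmoid(w_out[0]*h[0] + w_out[1]*h[1] + w_out[2])

        # Blame at the output unit: delta_o = o(1-o)(t-o).
        delta_o = o * (1.0 - o) * (target - o)

        # Apportion blame to each hidden unit via the chain rule:
        # delta_h = h(1-h) * w_out * delta_o.
        delta_h = [h[j] * (1.0 - h[j]) * w_out[j] * delta_o for j in range(2)]

        # Weight updates: delta_w = eta * delta * input (bias input is 1).
        for j in range(2):
            w_out[j] += eta * delta_o * h[j]
        w_out[2] += eta * delta_o
        for j in range(2):
            for i in range(2):
                w_hidden[j][i] += eta * delta_h[j] * x[i]
            w_hidden[j][2] += eta * delta_h[j]
        return o

    # Illustration: train toward XOR, which a single perceptron cannot learn.
    # With enough passes this typically drives the outputs toward the targets.
    random.seed(0)
    w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    w_out = [random.uniform(-1, 1) for _ in range(3)]
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    for _ in range(20000):
        for x, t in data:
            backprop_step(x, t, w_hidden, w_out)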