Oct 30, 2006
-------------
- Back to neural networks
  - solution to the XOR puzzle
  - any boolean function can be realized in at most 2 layers
    - use CNF or DNF (a hand-wired XOR network is sketched after this outline)
- Learning algorithms for a perceptron
  - used to find the weights
- Example of a learning algorithm for the threshold perceptron (see the training-loop sketch below)
  - use sign(x) for thresholding
  - propagate the errors back and assign blame to the weights
  - importance of the learning rate, eta
    - typically small, e.g. 0.1 or 0.01
- The algorithm converges when the points are linearly separable
- What happens when they are not linearly separable?
  - the weights go back and forth!
- Another perceptron: remove the thresholding
  - easy to work with mathematically
  - the weight update rule can be derived by calculus
- Deriving learning update rules (the derivation is sketched below)
  - formulate a sum-of-squared-error criterion
  - differentiate it with respect to the weights!
- Why do we need a learning rate?
  - to take small steps toward the goal ("nudges") rather than big leaps
- Comparisons between the unthresholded and thresholded perceptrons
  - the first will still fit linearly non-separable problems by finding some in-between solution; the second will oscillate
  - the weight rules look the same, but the meaning of the output "o" differs between the two
  - the unthresholded derivation gave a batch rule: accumulate the weight updates across all examples, apply them, then revisit all the examples
  - the thresholded rule is incremental: the weights are adjusted after each example (both loops are contrasted in the last sketch below)
- Next class: how to work with multiple layers of neurons
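
A minimal sketch (in Python, not from the lecture) of the 2-layer claim: XOR, which no single perceptron can compute, follows from the DNF-style decomposition x1 XOR x2 = (x1 OR x2) AND NOT (x1 AND x2). The weights and thresholds below are hand-picked, not learned.

    def step(net):
        # Threshold unit: fires (1) when its net input exceeds 0.
        return 1 if net > 0 else 0

    def xor_net(x1, x2):
        # Hidden layer: one unit computes OR, the other AND.
        h_or  = step(1.0 * x1 + 1.0 * x2 - 0.5)   # x1 OR x2
        h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)   # x1 AND x2
        # Output layer: OR AND (NOT AND) = XOR.
        return step(1.0 * h_or - 1.0 * h_and - 0.5)

    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, "->", xor_net(x1, x2))   # 0, 1, 1, 0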
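
A sketch of the threshold-perceptron training rule described above; the AND dataset, the +1/-1 target encoding, eta = 0.1, and the epoch count are illustrative choices, not the lecture's.

    def sign(net):
        return 1 if net >= 0 else -1

    # Linearly separable data: logical AND, with a constant 1 as bias input.
    data = [((1, 0, 0), -1), ((1, 0, 1), -1), ((1, 1, 0), -1), ((1, 1, 1), 1)]

    w = [0.0, 0.0, 0.0]   # weights; w[0] multiplies the bias input
    eta = 0.1             # learning rate: small, e.g. 0.1 or 0.01

    for epoch in range(100):
        for x, t in data:
            o = sign(sum(wi * xi for wi, xi in zip(w, x)))  # thresholded output
            for i in range(len(w)):
                w[i] += eta * (t - o) * x[i]   # blame each weight by its input

    print(w)   # a separating hyperplane for AND

Because AND is linearly separable, the loop settles on weights that make every update zero; swapping in XOR targets makes the same weights go back and forth forever.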
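
The calculus behind the unthresholded rule, sketched in typical notation (D is the training set, t_d the target, o_d the linear output, x_{id} the i-th component of example d; the symbols are my choice, not the lecture's):

    E(\vec{w}) = \tfrac{1}{2} \sum_{d \in D} (t_d - o_d)^2 ,
                 \qquad o_d = \vec{w} \cdot \vec{x}_d

    \frac{\partial E}{\partial w_i}
        = \sum_{d \in D} (t_d - o_d)\,
          \frac{\partial (t_d - \vec{w} \cdot \vec{x}_d)}{\partial w_i}
        = -\sum_{d \in D} (t_d - o_d)\, x_{id}

    \Delta w_i = -\eta\, \frac{\partial E}{\partial w_i}
               = \eta \sum_{d \in D} (t_d - o_d)\, x_{id}

Stepping against the gradient, scaled by a small eta, is exactly the "nudges rather than big leaps" point: a large eta can overshoot the minimum of E.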
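
A sketch contrasting the two update schedules named in the comparison above (illustrative code, not the lecture's): the batch rule accumulates the updates over the whole training set before touching the weights, while the incremental rule applies each example's update immediately.

    def batch_epoch(w, data, eta):
        # Accumulate the updates across ALL examples, then apply them once.
        delta = [0.0] * len(w)
        for x, t in data:
            o = sum(wi * xi for wi, xi in zip(w, x))   # unthresholded output
            for i in range(len(w)):
                delta[i] += eta * (t - o) * x[i]
        return [wi + di for wi, di in zip(w, delta)]

    def incremental_epoch(w, data, eta):
        # Adjust the weights right after each example.
        w = list(w)
        for x, t in data:
            o = sum(wi * xi for wi, xi in zip(w, x))
            for i in range(len(w)):
                w[i] += eta * (t - o) * x[i]
        return w

Both loops apply the same-looking rule; only when, and how often, the weights change differs.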