Oct 30, 2006
-------------
- Back to neural networks
  - solution to the XOR puzzle
  - any boolean function can be realized in at most 2 layers
    - use CNF or DNF (a hand-wired XOR network is sketched after this outline)
- Learning algorithms for a perceptron
  - used to find the weights
- Example of a learning algorithm for the threshold perceptron (see the training-loop sketch below)
  - use sign(x) for thresholding
  - propagate the errors back and assign blame to the weights
  - importance of the learning rate, eta
    - typically small, e.g. 0.1 or 0.01
- The algorithm converges when the points are linearly separable
- What happens when they are not linearly separable?
  - the weights go back and forth!
- Another perceptron: remove the thresholding
  - easy to work with mathematically
  - the weight update rule can be derived by calculus
- Deriving learning update rules (the derivation is sketched below)
  - formulate a sum-of-squared-error criterion
  - differentiate it with respect to the weights!
- Why do we need a learning rate?
  - to take small steps toward the goal ("nudges") rather than big leaps
- Comparisons between the unthresholded and thresholded perceptrons
  - the first will still fit linearly non-separable problems by finding some in-between solution; the second will oscillate
  - the weight rules look the same, but the meaning of the output "o" differs between the two
  - the unthresholded derivation gave a batch rule: accumulate the weight updates across all examples, apply them, then revisit all the examples
  - the thresholded rule is incremental: the weights are adjusted after each example (both loops are contrasted in the last sketch below)
- Next class: how to work with multiple layers of neurons
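
A minimal sketch (in Python, not from the lecture) of the 2-layer claim: XOR, which no single perceptron can compute, follows from the DNF-style decomposition x1 XOR x2 = (x1 OR x2) AND NOT (x1 AND x2). The weights and thresholds below are hand-picked, not learned.

    def step(net):
        # Threshold unit: fires (1) when its net input exceeds 0.
        return 1 if net > 0 else 0

    def xor_net(x1, x2):
        # Hidden layer: one unit computes OR, the other AND.
        h_or  = step(1.0 * x1 + 1.0 * x2 - 0.5)   # x1 OR x2
        h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)   # x1 AND x2
        # Output layer: OR AND (NOT AND) = XOR.
        return step(1.0 * h_or - 1.0 * h_and - 0.5)

    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, "->", xor_net(x1, x2))   # 0, 1, 1, 0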
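
A sketch of the threshold-perceptron training rule described above; the AND dataset, the +1/-1 target encoding, eta = 0.1, and the epoch count are illustrative choices, not the lecture's.

    def sign(net):
        return 1 if net >= 0 else -1

    # Linearly separable data: logical AND, with a constant 1 as bias input.
    data = [((1, 0, 0), -1), ((1, 0, 1), -1), ((1, 1, 0), -1), ((1, 1, 1), 1)]

    w = [0.0, 0.0, 0.0]   # weights; w[0] multiplies the bias input
    eta = 0.1             # learning rate: small, e.g. 0.1 or 0.01

    for epoch in range(100):
        for x, t in data:
            o = sign(sum(wi * xi for wi, xi in zip(w, x)))  # thresholded output
            for i in range(len(w)):
                w[i] += eta * (t - o) * x[i]   # blame each weight by its input

    print(w)   # a separating hyperplane for AND

Because AND is linearly separable, the loop settles on weights that make every update zero; swapping in XOR targets makes the same weights go back and forth forever.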
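
The calculus behind the unthresholded rule, sketched in typical notation (D is the training set, t_d the target, o_d the linear output, x_{id} the i-th component of example d; the symbols are my choice, not the lecture's):

    E(\vec{w}) = \tfrac{1}{2} \sum_{d \in D} (t_d - o_d)^2 ,
                 \qquad o_d = \vec{w} \cdot \vec{x}_d

    \frac{\partial E}{\partial w_i}
        = \sum_{d \in D} (t_d - o_d)\,
          \frac{\partial (t_d - \vec{w} \cdot \vec{x}_d)}{\partial w_i}
        = -\sum_{d \in D} (t_d - o_d)\, x_{id}

    \Delta w_i = -\eta\, \frac{\partial E}{\partial w_i}
               = \eta \sum_{d \in D} (t_d - o_d)\, x_{id}

Stepping against the gradient, scaled by a small eta, is exactly the "nudges rather than big leaps" point: a large eta can overshoot the minimum of E.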
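
A sketch contrasting the two update schedules named in the comparison above (illustrative code, not the lecture's): the batch rule accumulates the updates over the whole training set before touching the weights, while the incremental rule applies each example's update immediately.

    def batch_epoch(w, data, eta):
        # Accumulate the updates across ALL examples, then apply them once.
        delta = [0.0] * len(w)
        for x, t in data:
            o = sum(wi * xi for wi, xi in zip(w, x))   # unthresholded output
            for i in range(len(w)):
                delta[i] += eta * (t - o) * x[i]
        return [wi + di for wi, di in zip(w, delta)]

    def incremental_epoch(w, data, eta):
        # Adjust the weights right after each example.
        w = list(w)
        for x, t in data:
            o = sum(wi * xi for wi, xi in zip(w, x))
            for i in range(len(w)):
                w[i] += eta * (t - o) * x[i]
        return w

Both loops apply the same-looking rule; only when, and how often, the weights change differs.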