What They Are and How They Work

author: Mark Ibrahim

Building Blocks


Perceptrons were developed in the 1950s and '60s, inspiring the sigmoid neurons used in most modern neural networks. A perceptron takes a finite number of binary inputs and produces a single binary output using a weighted sum.



The output is 1 if the weighted sum of the inputs plus b is positive, and 0 otherwise. The term b is called the bias, and helps define the threshold at which we return each output.
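As a sketch of this decision rule (the function and variable names here are my own, not from the original), a perceptron can be written in a few lines of Python:

```python
def perceptron_output(x, w, b):
    """Return 1 if the weighted sum w . x + b is positive, else 0."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if weighted_sum + b > 0 else 0

# Example: a perceptron computing logical AND of two binary inputs.
# Both weights are 1 and the bias is -1.5, so the sum exceeds 0
# only when both inputs are 1.
print(perceptron_output([1, 1], [1, 1], -1.5))  # → 1
print(perceptron_output([1, 0], [1, 1], -1.5))  # → 0
```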

Why does this work? Roughly, this models how humans make decisions by assigning various weights to inputs. Deciding whether to attend a concert, for example, involves weighing which band is playing, what the weather is like, and which friends are going. Each factor carries an importance (weight) that contributes to the outcome of whether to attend.

Perceptron Algorithm

The algorithm begins by initializing the weights and bias to 0 (or small random values).

# X = input vector
# Y = desired output
# f(X) = the perceptron's current output for X
for (X, Y) in incorrectly_classified:
    # update each weight and the bias
    for i, xi in enumerate(X):
        # eta = learning rate
        w[i] += eta * (Y - f(X)) * xi
    b += eta * (Y - f(X))

We repeat this for a predefined number of epochs.
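Putting the update rule and the epoch loop together, a minimal runnable sketch might look like the following (the helper names and the choice of the AND function as training data are my own):

```python
def train_perceptron(data, eta=0.1, epochs=20):
    """Train a perceptron on (inputs, label) pairs via the update rule above."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0

    def f(x):
        # current perceptron output: 1 if the weighted sum + bias is positive
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

    for _ in range(epochs):
        for x, y in data:
            error = y - f(x)
            if error != 0:  # only misclassified points trigger an update
                for i, xi in enumerate(x):
                    w[i] += eta * error * xi
                b += eta * error
    return w, b, f

# Learn logical AND, which is linearly separable.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b, f = train_perceptron(data)
print([f(x) for x, _ in data])  # → [0, 0, 0, 1]
```

Because AND is linearly separable, the weights stop changing after a handful of epochs, which is exactly what the convergence theorem below guarantees.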

Will this ever end? There's actually a theorem, called the Perceptron Convergence Theorem, proving that for any linearly separable data set, the perceptron algorithm will converge after a finite number of epochs.

Who cares? The remarkable aspect of the perceptron algorithm is that no human is involved in teaching the algorithm. Instead, by comparing the desired and actual outputs, the algorithm is able to learn by tweaking the weights on its own!

Sigmoid Function