Neural Networks: On Perceptrons and Sigmoid Neurons

Neural Networks had their beginnings in 1943 when Warren McCulloch, a neurophysiologist, and a young mathematician, Walter Pitts, wrote a paper on how neurons might work. Much later in 1958, Frank Rosenblatt, a neuro-biologist proposed the Perceptron. The Perceptron is a computer model or computerized machine which is devised to represent or simulate the ability of the brain to recognize and discriminate. In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers

Initially it was believed that Perceptrons were capable of many things including “the ability to walk, talk, see, write, reproduce itself and be conscious of its existence.”

However, a subsequent paper by Marvin Minky and Seymour Papert of MIT, titled “Perceptrons” proved that the Perceptron was truly limited in its functionality. Specifically they showed that the Perceptron was incapable of producing XOR functionality. The Perceptron is only capable of classification where the data points are linearly separable.

Checkout my book ‘Deep Learning from first principles: Second Edition – In vectorized Python, R and Octave’. My book starts with the implementation of a simple 2-layer Neural Network and works its way to a generic L-Layer Deep Learning Network, with all the bells and whistles. The derivations have been discussed in detail. The code has been extensively commented and included in its entirety in the Appendix sections. My book is available on Amazon as paperback ($18.99) and in kindle version($9.99/Rs449).

This post implements the simple learning algorithm of the ‘Linear Perceptron’ and the ‘Sigmoid Perceptron’. The implementation has been done in Octave. This implementation is based on “Neural networks for Machine Learning” course by Prof Geoffrey Hinton at Coursera

Perceptron learning procedure
z = ∑w_ix_i + b
where w_i is the i^thweight and x_iis the i_th feature

For every training case compute the activation output z_i

If the output classifies correctly, leave the weights alone
If the output classifies a ‘0’ as a ‘1’, then subtract the the feature from the weight
If the output classifies a ‘0’ as a ‘1’, then add the feature to the weight

This simple neural network is represented below
perceptron

Sigmoid neuron learning procedure
z_i= sigmoid(∑w_ix_i + b)
where sigmoid is
$sigmoid(z) = 1/1+e^{-z}$

Hence
$z_{i} = 1/1+e^{-(\sum w_{i}x_{i}+b)}$
For every training case compute the activation output z_i

If the output classifies correctly, leave the weights alone
If the output incorrectly classifies a ‘0’ as a ‘1’ i.e. $z_{i} >sigmoid(0)$ , then subtract the feature from the weight
If the output incorrectly classifies a ‘1’ as ‘0’ i.e., i.e $z_{i} < sigmoid(0)$ , then add the feature to the weight
Iterate till errors <= 1

This is shown below
sigmoid_neuron

I have implemented the learning algorithm of the Perceptron and Sigmoid Neuron in Octave. The code is available at Github at Perceptron.

Perceptron execution

I performed the tests on 2 different datasets

Data 1
untitled

Data 2
untitled

2. Sigmoid Perceptron execution
Data 1 & Data 2

It can be seen that the Perceptron does work for simple linearly separable data. I will be implementing other more advanced Neural Networks in the months to come.

Watch this space!