Neural Networks: On Perceptrons and Sigmoid Neurons

Neural Networks had their beginnings in 1943 when Warren McCulloch, a neurophysiologist, and a young mathematician, Walter Pitts, wrote a paper on how neurons might work.  Much later in 1958, Frank Rosenblatt, a neuro-biologist proposed the Perceptron. The Perceptron is a computer model or computerized machine which is devised to represent or simulate the ability of the brain to recognize and discriminate. In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers

Initially it was believed that  Perceptrons were capable of many things including “the ability to walk, talk, see, write, reproduce itself and be conscious of its existence.”

However, a subsequent paper by Marvin Minky and Seymour Papert of MIT, titled “Perceptrons” proved that the Perceptron was truly limited in its functionality. Specifically they showed that the Perceptron was incapable of producing XOR functionality. The Perceptron is only capable of classification where the data points are linearly separable.

Checkout my book ‘Deep Learning from first principles: Second Edition – In vectorized Python, R and Octave’. My book starts with the implementation of a simple 2-layer Neural Network and works its way to a generic L-Layer Deep Learning Network, with all the bells and whistles. The derivations have been discussed in detail. The code has been extensively commented and included in its entirety in the Appendix sections. My book is available on Amazon as paperback ($18.99) and in kindle version($9.99/Rs449).

This post implements the simple learning algorithm of the ‘Linear Perceptron’ and the ‘Sigmoid Perceptron’.  The implementation has been done in Octave. This implementation is based on “Neural networks for Machine Learning” course by Prof Geoffrey Hinton at Coursera

Perceptron learning procedure
z = ∑wixi  + b
where wi is the ith weight and xi is the ith  feature

For every training case compute the activation output zi

  • If the output classifies correctly, leave the weights alone
  • If the output classifies a ‘0’ as a ‘1’, then subtract the the feature from the weight
  • If the output classifies a ‘0’ as a ‘1’, then add the feature to the weight

This simple neural network is represented below
perceptron

Sigmoid neuron learning procedure
zi = sigmoid(∑wixi  + b)
where sigmoid is
sigmoid(z) = 1/1+e^{-z}

Hence
z_{i} = 1/1+e^{-(\sum w_{i}x_{i}+b)}
For every training case compute the activation output zi

  • If the output classifies correctly, leave the weights alone
  • If the output incorrectly classifies a ‘0’ as a ‘1’ i.e. z_{i} >sigmoid(0), then subtract the feature from the weight
  • If the output incorrectly classifies a ‘1’ as ‘0’ i.e., i.e z_{i} < sigmoid(0), then add the feature to the weight
  • Iterate till errors <= 1

This is shown below
sigmoid_neuron

I have implemented the learning algorithm of the Perceptron and Sigmoid Neuron in Octave. The code is available at Github at Perceptron.

  1. Perceptron execution

I performed the tests on 2 different datasets

Data 1
untitled

Data 2
untitled

2. Sigmoid Perceptron execution
Data 1 & Data 2

It can be seen that the Perceptron does work for simple linearly separable data. I will be implementing other more advanced Neural Networks in the months to come.

Watch this space!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s