Perceptrons and Sigmoid Neurons

A way you can think about the perceptron is that it's a device that makes decisions by weighing up evidence. Let me give an example. It's not a very realistic example, but it's easy to understand, and we'll soon get to more realistic examples. Suppose the weekend is coming up, and you've heard that there's going to be a cheese festival in your city. You like cheese, and are trying to decide whether or not to go to the festival. You might make your decision by weighing up three factors:

1. Is the weather good?

2. Does your boyfriend or girlfriend want to accompany you?

3. Is the festival near public transit? (You don't own a car).

We can represent these three factors by corresponding binary variables x1, x2, and x3. For instance, we'd have x1=1 if the weather is good, and x1=0 if the weather is bad. Similarly, x2=1 if your boyfriend or girlfriend wants to go, and x2=0 if not. And similarly again for x3 and public transit.

Now, suppose you absolutely adore cheese, so much so that you're happy to go to the festival even if your boyfriend or girlfriend is uninterested and the festival is hard to get to. But perhaps you really loathe bad weather, and there's no way you'd go to the festival if the weather is bad. You can use perceptrons to model this kind of decision-making. One way to do this is to choose a weight w1=6 for the weather, and w2=2 and w3=2 for the other conditions. The larger value of w1 indicates that the weather matters a lot to you, much more than whether your boyfriend or girlfriend joins you, or the nearness of public transit. Finally, suppose you choose a threshold of 5 for the perceptron. With these choices, the perceptron implements the desired decision-making model, outputting 1 whenever the weather is good, and 0 whenever the weather is bad. Good weather alone contributes 6, which exceeds the threshold, while the other two factors together contribute at most 2+2=4, which does not. It makes no difference to the output whether your boyfriend or girlfriend wants to go, or whether public transit is nearby.
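
The cheese-festival decision can be checked with a few lines of Python. This is a minimal sketch, assuming the usual perceptron rule (output 1 when the weighted sum of the inputs exceeds the threshold, 0 otherwise); the function name `perceptron` and the variable names are illustrative, not taken from the source.

```python
from itertools import product

def perceptron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

# Weights from the text: weather, partner, public transit.
weights = (6, 2, 2)
threshold = 5

# Evaluate every combination of the three binary factors (x1, x2, x3).
results = {x: perceptron(x, weights, threshold)
           for x in product((0, 1), repeat=3)}

for x, decision in results.items():
    print(x, "->", decision)
```

For every one of the eight input combinations the decision equals x1, the weather variable, which matches the claim that the other two factors make no difference.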

By varying the weights and the threshold, we can get different models of decision-making. For example, suppose we instead chose a threshold of 3. Then the perceptron would decide that you should go to the festival whenever the weather was good or when both the festival was near public transit and your boyfriend or girlfriend was willing to join you. In other words, it'd be a different model of decision-making. Dropping the threshold means you're more willing to go to the festival.
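
To see the effect of lowering the threshold, one can list the input combinations that lead to a "go" decision under each setting. The sketch below assumes the same rule as before (output 1 when the weighted sum exceeds the threshold); `perceptron` and `go_cases` are hypothetical helper names used only for illustration.

```python
from itertools import product

def perceptron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

def go_cases(weights, threshold):
    """List every (x1, x2, x3) combination for which the perceptron outputs 1."""
    return [x for x in product((0, 1), repeat=3)
            if perceptron(x, weights, threshold) == 1]

weights = (6, 2, 2)  # weather, partner, public transit

go_threshold_5 = go_cases(weights, 5)
go_threshold_3 = go_cases(weights, 3)

# The cases gained by dropping the threshold from 5 to 3.
newly_allowed = [x for x in go_threshold_3 if x not in go_threshold_5]
print("threshold 5:", go_threshold_5)
print("threshold 3:", go_threshold_3)
print("newly allowed:", newly_allowed)
```

The only case gained is (0, 1, 1): bad weather, but a willing companion and nearby transit, since 2+2=4 exceeds 3 but not 5.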


Sigmoid neurons are similar to perceptrons, but modified so that small changes in their weights and bias cause only a small change in their output. That's the crucial fact which will allow a network of sigmoid neurons to learn.


Just like a perceptron, the sigmoid neuron has inputs x1, x2, …. But instead of being just 0 or 1, these inputs can also take on any value between 0 and 1. So, for instance, 0.638… is a valid input for a sigmoid neuron. Also just like a perceptron, the sigmoid neuron has a weight for each input, w1, w2, …, and an overall bias, b. But the output is not 0 or 1. Instead, it's σ(w⋅x + b), where σ is called the sigmoid function. (Incidentally, σ is sometimes called the logistic function, and this new class of neurons called logistic neurons. It's useful to remember this terminology, since these terms are used by many people working with neural nets. However, we'll stick with the sigmoid terminology.)
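
A sigmoid neuron can be sketched in Python as follows. The passage names σ without defining it, so the code assumes the standard logistic function, σ(z) = 1/(1 + e^(−z)). The weights reuse the cheese-festival values, and the bias b = −5 is an assumption made for illustration (the negative of the earlier threshold); neither choice comes from this part of the source. The function names are illustrative.

```python
import math

def sigmoid(z):
    """Standard logistic function: 1 / (1 + e^(-z)). Not hardened against overflow."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(inputs, weights, bias):
    """Output sigma(w . x + b) for the given inputs, weights, and bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Assumed illustrative values: weights from the festival example, bias = -threshold.
weights = (6.0, 2.0, 2.0)
bias = -5.0
inputs = (0.638, 1.0, 0.0)  # inputs may be any values between 0 and 1

output = sigmoid_neuron(inputs, weights, bias)

# Nudge the first weight slightly and observe how little the output moves.
nudged_weights = (6.01, 2.0, 2.0)
nudged_output = sigmoid_neuron(inputs, nudged_weights, bias)

print("output:", output)
print("output after nudging w1 by 0.01:", nudged_output)
print("change:", nudged_output - output)
```

The output lies strictly between 0 and 1, and the small nudge to w1 shifts it only slightly, which is the property that lets a network of sigmoid neurons learn.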


Two tools for machine learning.

Folksonomies: artificial intelligence machine learning

Source: Nielsen, Michael (2014), Neural Networks and Deep Learning (book chapter). Retrieved on 2014-04-21.