C++ Neural Networks and Fuzzy Logic
by Valluru B. Rao. M&T Books, IDG Books Worldwide, Inc. ISBN: 1558515526. Pub Date: 06/01/95
We will be as specific as is needed to make the computations clear. First recall that the activation of a neuron in a layer other than the input layer is the sum of products of its inputs and the weights corresponding to the connections that bring in those inputs. Let us discuss the jth neuron in the hidden layer. Let us be specific and say j = 2. Suppose that the input pattern is (1.1, 2.4, 3.2, 5.1, 3.9) and the target output pattern is (0.52, 0.25, 0.75, 0.97). Let the weights be given for the second hidden layer neuron by the vector (–0.33, 0.07, –0.45, 0.13, 0.37). The activation will be the quantity:
(-0.33 * 1.1) + (0.07 * 2.4) + (-0.45 * 3.2) + (0.13 * 5.1) + (0.37 * 3.9) = 0.471
Now add to this an optional bias of, say, 0.679, to give 1.15. If we use the sigmoid function given by:
1 / (1 + exp(-x)),
with x = 1.15, we get the output of this hidden layer neuron as 0.7595.
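The forward computation for this hidden neuron can be sketched in C++ as follows. This is a minimal illustration, not the book's own code; the names `sigmoid`, `net_input`, and `example_hidden_output` are assumptions introduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid as given in the text: 1 / (1 + exp(-x)).
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Sum of products of inputs and weights, plus the (optional) bias.
double net_input(const std::vector<double>& inputs,
                 const std::vector<double>& weights,
                 double bias) {
    assert(inputs.size() == weights.size());
    double sum = bias;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        sum += inputs[i] * weights[i];
    }
    return sum;
}

// Hypothetical helper: output of the second hidden neuron
// for the example input pattern, weights, and bias used in the text.
double example_hidden_output() {
    const std::vector<double> inputs  = {1.1, 2.4, 3.2, 5.1, 3.9};
    const std::vector<double> weights = {-0.33, 0.07, -0.45, 0.13, 0.37};
    const double bias = 0.679;
    return sigmoid(net_input(inputs, weights, bias));
}
```

With the example values, `example_hidden_output()` evaluates to approximately 0.7595, in agreement with the hand calculation.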
We round values to a few decimal places for illustration only; a computer would carry these calculations to much greater precision.
We need the computed output pattern also. Let us say it turns out to be actual = (0.61, 0.41, 0.57, 0.53), while the desired pattern is desired = (0.52, 0.25, 0.75, 0.97). Obviously, there is a discrepancy between what is desired and what is computed. The component-wise differences are given in the vector desired - actual = (-0.09, -0.16, 0.18, 0.44). We use these to form another vector where each component is a product of the error component, the corresponding computed pattern component, and the complement of the latter with respect to 1. For example, for the first component, the error is –0.09, the computed pattern component is 0.61, and its complement is 0.39. Multiplying these together (0.61 * 0.39 * -0.09), we get -0.02. Calculating the other components similarly, we get the vector (–0.02, –0.04, 0.04, 0.11). In other words, each component of the error vector (desired - actual) is scaled by actual * (1 - actual), which is the first derivative of the sigmoid activation function expressed in terms of its output. The resulting vector is the error that is reflected back from the output layer toward the hidden layer. You will see the formulas for this process later in this chapter.
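The output-layer error terms can be computed with a short loop. The sketch below assumes a sigmoid output activation, as in the example; the function name `output_error_terms` is hypothetical and not taken from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For each output neuron k:
//   term_k = (desired_k - actual_k) * actual_k * (1 - actual_k)
std::vector<double> output_error_terms(const std::vector<double>& desired,
                                       const std::vector<double>& actual) {
    assert(desired.size() == actual.size());
    std::vector<double> terms(actual.size());
    for (std::size_t k = 0; k < actual.size(); ++k) {
        const double error = desired[k] - actual[k];
        terms[k] = error * actual[k] * (1.0 - actual[k]);
    }
    return terms;
}
```

Called with desired = (0.52, 0.25, 0.75, 0.97) and actual = (0.61, 0.41, 0.57, 0.53), the function returns values that round to (-0.02, -0.04, 0.04, 0.11).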
The backpropagation of errors needs to be carried further. We need now the weights on the connections between the second neuron in the hidden layer that we are concentrating on, and the different output neurons. Let us say these weights are given by the vector (0.85, 0.62, –0.10, 0.21). The error of the second neuron in the hidden layer is now calculated as below, using its output.
error = 0.7595 * (1 - 0.7595) * ((0.85 * -0.02) + (0.62 * -0.04) + (-0.10 * 0.04) + (0.21 * 0.11)) = -0.0041.
Again, we multiply the error coming back from the next layer, here the weighted sum of the output-layer error terms (e.g., -0.02), by the hidden neuron's output value (0.7595) and by its complement (1 - 0.7595). The weights on the connections between neurons are what allow us to work backwards through the network.
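The hidden-neuron error calculation can be expressed as a small function. This is a sketch under the same sigmoid assumption; `hidden_error` and its parameter names are illustrative, not the book's.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Error for one hidden neuron:
//   output * (1 - output) * sum_k(weight_to_output_k * output_term_k)
double hidden_error(double hidden_output,
                    const std::vector<double>& weights_to_outputs,
                    const std::vector<double>& output_terms) {
    assert(weights_to_outputs.size() == output_terms.size());
    double weighted_sum = 0.0;
    for (std::size_t k = 0; k < output_terms.size(); ++k) {
        weighted_sum += weights_to_outputs[k] * output_terms[k];
    }
    return hidden_output * (1.0 - hidden_output) * weighted_sum;
}
```

Using the rounded values from the text, `hidden_error(0.7595, {0.85, 0.62, -0.10, 0.21}, {-0.02, -0.04, 0.04, 0.11})` gives approximately -0.0041.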
Next, we need the learning rate parameter for this layer; let us set it as 0.2. We multiply this by the output of the second neuron in the hidden layer, to get 0.1519. Each of the components of the vector (–0.02, –0.04, 0.04, 0.11) is multiplied now by 0.1519, which our latest computation gave. The result is a vector that gives the adjustments to the weights on the connections that go from the second neuron in the hidden layer to the output neurons. These values are given in the vector (–0.003, –0.006, 0.006, 0.017). After these adjustments are added, the weights to be used in the next cycle on the connections between the second neuron in the hidden layer and the output neurons become those in the vector (0.847, 0.614, –0.094, 0.227).
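The update of the hidden-to-output weights can be sketched as below. The function name `update_outgoing_weights` is an assumption made for this illustration; it simply adds learning rate * hidden output * error term to each weight, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// New weight on the connection from one hidden neuron to output neuron k:
//   weight_k + learning_rate * hidden_output * output_term_k
std::vector<double> update_outgoing_weights(const std::vector<double>& weights,
                                            const std::vector<double>& output_terms,
                                            double learning_rate,
                                            double hidden_output) {
    assert(weights.size() == output_terms.size());
    const double step = learning_rate * hidden_output;  // 0.2 * 0.7595 in the example
    std::vector<double> updated(weights.size());
    for (std::size_t k = 0; k < weights.size(); ++k) {
        updated[k] = weights[k] + step * output_terms[k];
    }
    return updated;
}
```

With weights (0.85, 0.62, -0.10, 0.21), error terms (-0.02, -0.04, 0.04, 0.11), a learning rate of 0.2, and a hidden output of 0.7595, the result rounds to (0.847, 0.614, -0.094, 0.227).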
Let us look at how adjustments are calculated for the weights on connections going from the ith neuron in the input layer to neurons in the hidden layer. Let us take specifically i = 3, for illustration.
Much of the information we need is already obtained in the previous discussion for the second hidden layer neuron. We have the errors in the computed output as the vector (–0.09, –0.16, 0.18, 0.44), and we obtained the error for the second neuron in the hidden layer as –0.0041, which was not used above. Just as the error in the output is propagated back to assign errors for the neurons in the hidden layer, those errors can be propagated to the input layer neurons.
To determine the adjustments for the weights on connections between the input and hidden layers, we need the errors determined for the outputs of hidden layer neurons, a learning rate parameter, and the activations of the input neurons, which are just the input values for the input layer. Let us take the learning rate parameter to be 0.15. Then the weight adjustments for the connections from the third input neuron to the hidden layer neurons are obtained by multiplying the particular hidden layer neuron’s output error by the learning rate parameter and by the input component from the input neuron. The adjustment for the weight on the connection from the third input neuron to the second hidden layer neuron is 0.15 * 3.2 * –0.0041, which works out to –0.002.
If the weight on this connection is, say, –0.45, then adding the adjustment of -0.002, we get the modified weight of –0.452, to be used in the next iteration of the network operation. Similar calculations are made to modify all other weights as well.
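The same adjustment can be applied to every connection coming into the second hidden layer neuron. The sketch below is illustrative only: the name `update_incoming_weights` is hypothetical, and the text works out only the third input by hand; the other entries follow the same rule.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// New weight on the connection from input neuron i to one hidden neuron:
//   weight_i + learning_rate * input_i * hidden_error
std::vector<double> update_incoming_weights(const std::vector<double>& weights,
                                            const std::vector<double>& inputs,
                                            double hidden_error,
                                            double learning_rate) {
    assert(weights.size() == inputs.size());
    std::vector<double> updated(weights.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        updated[i] = weights[i] + learning_rate * inputs[i] * hidden_error;
    }
    return updated;
}
```

For the weights (-0.33, 0.07, -0.45, 0.13, 0.37), the input pattern (1.1, 2.4, 3.2, 5.1, 3.9), a hidden error of -0.0041, and a learning rate of 0.15, the third entry of the result is approximately -0.452, matching the value computed above.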