[CS231n Convolutional Neural Networks for Visual Recognition] Intuitive understanding of backpropagation

Given x0, x1, w0, w1, w2, y

g = 1 / (1 + math.exp(-(x0 * w0 + x1 * w1 + w2)))

Loss function: f = y - g

Use backpropagation (BP) to adjust w0, w1, w2 so that f < 0.1

x0 = -1
x1 = -2
w0 = 2
w1 = -3
w2 = -3
y = 1.73
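
To make the setup concrete, here is a quick forward pass with these values (a minimal sketch that just plugs the given numbers into g and f): dot = x0*w0 + x1*w1 + w2 = 1.0, so g = sigmoid(1.0) ≈ 0.731 and the initial loss is f = y - g ≈ 1.0.

import math

x0, x1 = -1, -2
w0, w1, w2 = 2, -3, -3
y = 1.73

# forward pass with the given values
dot = x0 * w0 + x1 * w1 + w2        # (-1)*2 + (-2)*(-3) + (-3) = 1.0
g = 1 / (1 + math.exp(-dot))        # sigmoid(1.0) ~= 0.731
f = y - g                           # initial loss ~= 1.0
print(dot, g, f)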


https://cs231n.github.io/optimization-2/

Example from the original notes:

For example, the sigmoid expression receives the input 1.0 and computes the output 0.73 during the forward pass. The derivation above shows that the local gradient would simply be (1 - 0.73) * 0.73 ~= 0.2, as the circuit computed before (see the image above), except this way it would be done with a single, simple and efficient expression (and with fewer numerical issues). Therefore, in any real practical application it would be very useful to group these operations into a single gate. Let's see the backprop for this neuron in code:

import math

w = [2, -3, -3] # assume some random weights and data
x = [-1, -2]

# forward pass
dot = w[0]*x[0] + w[1]*x[1] + w[2]
f = 1.0 / (1 + math.exp(-dot)) # sigmoid function

# backward pass through the neuron (backpropagation)
ddot = (1 - f) * f # gradient on dot variable, using the sigmoid gradient derivation
dx = [w[0] * ddot, w[1] * ddot] # backprop into x
dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot] # backprop into w
# we're done! we have the gradients on the inputs to the circuit
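
Tying this back to the task at the top, below is a plain gradient-descent loop built on the same forward/backward pass (a minimal sketch: the learning rate, step count, and updating only the weights w are assumptions, not part of the original notes). One caveat: because the sigmoid output stays below 1, the loss f = y - g cannot fall below y - 1 ≈ 0.73 for y = 1.73, so the loop also stops after a fixed number of steps.

import math

x = [-1, -2]
w = [2, -3, -3]
y = 1.73
lr = 0.1  # learning rate (assumed)

for step in range(100):
    # forward pass
    dot = w[0]*x[0] + w[1]*x[1] + w[2]
    g = 1.0 / (1 + math.exp(-dot))
    f = y - g  # loss as defined in the problem statement

    if f < 0.1:  # target from the problem statement
        break

    # backward pass: df/dg = -1, dg/ddot = (1 - g) * g
    ddot = -1.0 * (1 - g) * g
    dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]

    # gradient-descent update on the weights only
    w = [wi - lr * dwi for wi, dwi in zip(w, dw)]

print(step, f, w)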
Original post: https://www.cnblogs.com/hbuwyg/p/15550275.html