
Deep Learning from Scratch to GPU, Part 2: Bias and Activation Function


In its current state, the network combines all layers into a single linear transformation.
We can introduce basic decision-making capability by adding a cutoff to the output of each neuron.
When the weighted sum of its inputs is below that threshold, the output is zero;
when it is above, the output is one.

\begin{equation}
output = \left\{
  \begin{array}{ll}
    0 & W\mathbf{x} \leq threshold \\
    1 & W\mathbf{x} > threshold \\
  \end{array}
\right.
\end{equation}

Since we keep the current outputs in a (potentially) huge vector, it would be inconvenient to
write scalar-based logic for that. I prefer to use a vectorized function, or to create one if
there is not one that does exactly what we need.

Neanderthal does not have this exact cutoff function, but we can create one by taking the
element-wise maximum of the threshold and the signal, subtracting the threshold from it, and
then mapping the signum function over the result. There are simpler ways to compute this, but
I wanted to use the existing functions and do the computation in-place. It is of purely
educational value, anyway. We will soon see that there are better things for transforming the
output than the vanilla step function.

(defn step! [threshold x]
  ;; subtracting threshold from max(threshold, x) is positive exactly where x > threshold;
  ;; signum then turns that into 1.0, and everything else into 0.0. Mutates x in place.
  (fmap! signum (axpy! -1.0 threshold (fmax! threshold x x))))
(let [threshold (dv [1 2 3])
      x (dv [0 2 7])]
  (step! threshold x))
nil
#RealBlockVector[double, n:3, offset: 0, stride:1]
[   0.00    0.00    1.00 ]
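
To make the mechanics concrete, here is a quick walkthrough of that example, together with a
pure variant that works on a copy. This is only a sketch: step is a name I am introducing here,
and copy comes from uncomplicate.neanderthal.core. Since the in-place operations in step! write
into x, copying x is enough to leave the caller's data untouched.

(defn step [threshold x]
  ;; pure variant of step!: operates on a copy, so the original x is untouched
  (step! threshold (copy x)))

;; For the example above, with threshold [1 2 3] and x [0 2 7], step! computes:
;; (fmax! threshold x x)      => [1.0 2.0 7.0]  element-wise maximum
;; (axpy! -1.0 threshold ...) => [0.0 0.0 4.0]  subtract the threshold
;; (fmap! signum ...)         => [0.0 0.0 1.0]  1.0 where strictly positive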

I'm going to show you a few steps in the evolution of the code, so I will
reuse the weights and x. To simplify the example, we will use global def
and not care about properly releasing the memory.
That does not matter in a REPL session, but do not forget to do it in real code;
a with-release sketch follows the definitions below.
Continuing the example from Part 1:

(def x (dv 0.3 0.9))
(def w1 (dge 4 2 [0.3 0.6
                  0.1 2.0
                  0.9 3.7
                  0.0 1.0]
             {:layout :row}))
(def threshold (dv 0.7 0.2 1.1 2))
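
As mentioned above, in real code we would not leak these buffers through global defs. Here is a
minimal sketch of the same data scoped with with-release from uncomplicate.commons.core;
everything bound in it is released when the body exits, so the result is printed before it goes
away. The body just applies the step! function defined above to the product of w1 and x.

(with-release [x (dv 0.3 0.9)
               w1 (dge 4 2 [0.3 0.6
                            0.1 2.0
                            0.9 3.7
                            0.0 1.0]
                       {:layout :row})
               threshold (dv 0.7 0.2 1.1 2)
               h (mv w1 x)]
  ;; x, w1, threshold, and h are all released when this body exits
  (println (step! threshold h)))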

Since we do not care about extra instances at the moment, we'll use the pure mv
function instead of mv! for convenience. mv creates the resulting vector y,
instead of mutating the one that has to be provided as an argument.

(step! threshold (mv w1 x))
nil
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.00    1.00    1.00    0.00 ]
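
When those extra instances do matter, the destructive mv! can reuse a pre-allocated output
vector instead of creating a fresh one on every call. A minimal sketch, assuming the
three-argument arity (mv! a x y) that overwrites y with the product; h1 is just a name for
that destination vector.

(def h1 (dv 4))
;; mv! writes the product of w1 and x into h1; step! then mutates h1 in place
(step! threshold (mv! w1 x h1))
;; same result as above: [0.00 1.00 1.00 0.00]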

The bias is simply the threshold moved to the left side of the equation:

\begin{equation}
output = \left\{
  \begin{array}{ll}
    0 & W\mathbf{x} - bias \leq 0 \\
    1 & W\mathbf{x} - bias > 0 \\
  \end{array}
\right.
\end{equation}

(def bias (dv 0.7 0.2 1.1 2))
(def zero (dv 4))
(step! zero (axpy! -1.0 bias (mv w1 x)))
nil
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.00    1.00    1.00    0.00 ]

Remember that bias holds the same values as threshold, so there is no need for the extra
zero vector; we can pass bias to step! as the threshold directly.

(step! bias (mv w1 x))
nil
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.00    1.00    1.00    0.00 ]
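
To wrap up where we have arrived, here is a tiny hypothetical helper (my own wrapper, not
something from Neanderthal) that bundles the whole step: the weighted sums of the inputs,
shifted by the bias and cut off by the step function.

(defn layer-output [w x bias]
  ;; weighted sums of the inputs, cut off at the bias
  (step! bias (mv w x)))

(layer-output w1 x bias)
;; same result as above: [0.00 1.00 1.00 0.00]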