# Dropout Layer

In deep neural networks, we may encounter overfitting when the network is complex and has many parameters. In *Dropout: A Simple Way to Prevent Neural Networks from Overfitting*, N. Srivastava et al. proposed a simple technique named Dropout that helps prevent overfitting. It refers to randomly dropping out some neurons in a neural network during training. The mechanism is equivalent to training a different neural network with a different architecture on every batch.

The parameter for this layer is a preset probability $$p_0$$, which indicates the probability of dropping a neuron. For example, if $$p_0=0.5$$, every neuron in this layer has a $$0.5$$ chance of being dropped. With the given probability, we can define the dropout layer to be a function $$y_i=f(x_i)$$ such that

$$y_i = \begin{cases} 0 & \text{if } r_i < p_0 \\ x_i & \text{if } r_i \geq p_0 \end{cases}$$

where $$r_i$$ is drawn uniformly at random from $$[0,1)$$. However, if we use this function, the expectation of the output of the dropout layer is scaled by a factor of $$1-p_0$$. For example, if the original output is $$1$$ and $$p_0=0.5$$, the expected output becomes $$0.5$$. This is unsatisfactory because when we test the neural network, dropout is disabled and we do not want the training-time outputs to be scaled differently from the test-time ones. Thus, in practice we define the function to be

$$y_i = \begin{cases} 0 & \text{if } r_i < p_0 \\ \dfrac{x_i}{1-p_0} & \text{if } r_i \geq p_0 \end{cases}$$
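This rescaling trick is commonly known as inverted dropout: the kept activations are scaled up during training so that the layer can act as the identity at test time. To make the scaling argument concrete, here is a small Monte Carlo sketch (plain NumPy, independent of tinyml; variable names are illustrative only) comparing the unscaled rule with the rescaled rule above:

```python
import numpy as np

rng = np.random.default_rng(0)
p0 = 0.5          # drop probability
x = 1.0           # original activation
n = 100_000       # number of trials

r = rng.random(n)                                # r_i ~ Uniform[0, 1)
naive = np.where(r < p0, 0.0, x)                 # first definition
inverted = np.where(r < p0, 0.0, x / (1 - p0))   # rescaled definition

print(naive.mean())     # ~ (1 - p0) * x = 0.5, expectation is shrunk
print(inverted.mean())  # ~ x = 1.0, expectation is preserved
```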

Then the backward computation becomes straightforward:

$$\frac{\partial l}{\partial x_i} = \begin{cases} 0 \times \dfrac{\partial l}{\partial y_i} = 0 & \text{if } r_i < p_0 \\ \dfrac{\partial l}{\partial y_i} \times \dfrac{\partial y_i}{\partial x_i} = \dfrac{1}{1-p_0}\dfrac{\partial l}{\partial y_i} & \text{if } r_i \geq p_0 \end{cases}$$
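In other words, the backward pass simply reuses the mask built in the forward pass. A minimal sketch of the two passes (again plain NumPy, names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p0 = 0.5
x = rng.standard_normal(4)

# Forward: mask entries are 0 (dropped) or 1/(1 - p0) (kept and rescaled).
mask = (rng.random(x.shape) >= p0) / (1 - p0)
y = x * mask

# Backward: dl/dx_i = mask_i * dl/dy_i, i.e. 0 for dropped neurons
# and (1/(1 - p0)) * dl/dy_i for kept ones -- the case analysis above.
dl_dy = np.ones_like(y)   # pretend the upstream gradient is all ones
dl_dx = dl_dy * mask
```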

The implementation of the dropout layer in tinyml is shown below. Note that `np.random.binomial(1, p)` returns $$1$$ with probability $$p$$, so here `self.probability` is the probability of *keeping* a neuron, i.e. $$1-p_0$$ in the notation above:

```python
from tinyml.core import Backend as np

from .base import Layer


class Dropout(Layer):
    '''
    Dropout Layer randomly drops several nodes.
    '''
    def __init__(self, name, probability):
        super().__init__(name)
        # `probability` is the keep probability (1 - p_0 above).
        self.probability = probability
        self.type = 'Dropout'

    def forward(self, input):
        # Mask entries are 0 (dropped) or 1/probability (kept and rescaled),
        # since np.random.binomial(1, p) yields 1 with probability p.
        self.mask = np.random.binomial(
            1, self.probability, size=input.shape) / self.probability
        return (input * self.mask).reshape(input.shape)

    def backward(self, in_gradient):
        # Reuse the forward mask: dl/dx = dl/dy * mask.
        return in_gradient * self.mask
```
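A hedged usage sketch, assuming tinyml is installed and the class is importable as below (the import path is an assumption; the constructor signature is taken from the snippet above):

```python
import numpy as np

from tinyml.layers import Dropout  # assumed import path

layer = Dropout('dropout_1', probability=0.5)  # keep probability 0.5

x = np.ones((2, 4))
y = layer.forward(x)   # each entry is either 0.0 or 1/0.5 = 2.0

# The backward pass reapplies the same mask to the upstream gradient.
grad_x = layer.backward(np.ones_like(y))
```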