# Max Pooling Layer

The pooling layer is another important component of convolutional neural networks. There are many ways to perform pooling, such as max pooling and average pooling. In this part, we will only discuss the max-pooling layer, as it is the one most commonly used in convolutional neural networks.

In a max-pooling layer, we again have a spatially small sliding window called the kernel. Within the window, only the largest value is retained and all other values are dropped. For example, assume we have

$$A=\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$$

and a $$2\times 2$$ max-pooling kernel with stride 1. Then the output $$C$$ will be

$$C=\begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix}$$
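The example above can be checked with a short NumPy sketch. This is a deliberately naive sliding-window loop, not the tinyml implementation; the helper name `max_pool2d` is made up for illustration:

```python
import numpy as np

def max_pool2d(x, k=2, stride=1):
    # Slide a k x k window over x and keep the maximum in each window.
    out_h = (x.shape[0] - k) // stride + 1
    out_w = (x.shape[1] - k) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * stride:i * stride + k,
                          j * stride:j * stride + k].max()
    return out

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
C = max_pool2d(A)  # -> [[5, 6], [8, 9]]
```

Each output entry is the maximum of one $$2\times 2$$ window of $$A$$, matching the matrix $$C$$ above.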

With the given kernel size $$K_w$$ and $$K_h$$, we can formalize the max-pooling process as

$$f(x_{ij})= \begin{cases} x_{ij} & \text{if } x_{ij}\geq x_{mn},\ \forall m\in [i-K_w, i+K_w],\ n\in [j-K_h,j+K_h] \\ 0 & \text{otherwise} \end{cases}$$

Hence we can compute the derivative as below:

$$\frac{\partial l}{\partial x_{ij}}=\frac{\partial l}{\partial f}\frac{\partial f}{\partial x_{ij}}= \begin{cases} \frac{\partial l}{\partial f} & \text{if } x_{ij}\geq x_{mn},\ \forall m\in [i-K_w, i+K_w],\ n\in [j-K_h,j+K_h] \\ 0 & \text{otherwise} \end{cases}$$
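In other words, the upstream gradient is routed entirely to the position that held the window maximum in the forward pass, and every other position receives zero. A minimal sketch of this gradient routing (the function name `max_pool2d_backward` is hypothetical, not part of tinyml):

```python
import numpy as np

def max_pool2d_backward(x, grad_out, k=2, stride=1):
    # Route each upstream gradient entry to the position that held the
    # window maximum in the forward pass; all other positions get zero.
    grad_in = np.zeros_like(x, dtype=float)
    out_h, out_w = grad_out.shape
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + k,
                       j * stride:j * stride + k]
            m, n = np.unravel_index(np.argmax(window), window.shape)
            grad_in[i * stride + m, j * stride + n] += grad_out[i, j]
    return grad_in

x = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
g = np.ones((2, 2))
grad_in = max_pool2d_backward(x, g)
# The four window maxima are 5, 6, 8, and 9, so only their positions
# receive gradient: [[0, 0, 0], [0, 1, 1], [0, 1, 1]]
```

Note that when a value is the maximum of several overlapping windows, the gradients from those windows accumulate at its position, which is why the sketch uses `+=`.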

The implementation of the max-pooling layer in tinyml is as below:

```python
from tinyml.core import Backend as np

from .base import Layer
from .convolution import col2im_indices, im2col_indices


class MaxPool2D(Layer):
    '''
    Perform Max pooling, i.e. select the max item in a sliding window.
    '''
    def __init__(self, name, input_dim, size, stride, return_index=False):
        super().__init__(name)
        self.type = 'MaxPool2D'
        self.input_channel, self.input_height, self.input_width = input_dim
        self.size = size
        self.stride = stride
        self.return_index = return_index
        self.out_height = (self.input_height - size[0]) / stride + 1
        self.out_width = (self.input_width - size[1]) / stride + 1
        if not self.out_height.is_integer() or not self.out_width.is_integer():
            raise Exception("[tinyml] Invalid dimension settings!")
        self.out_width = int(self.out_width)
        self.out_height = int(self.out_height)
        self.out_dim = (self.input_channel, self.out_height, self.out_width)

    def forward(self, input):
        self.num_of_entries = input.shape[0]
        input_reshaped = input.reshape(input.shape[0] * input.shape[1], 1,
                                       input.shape[2], input.shape[3])
        self.input_col = im2col_indices(input_reshaped,
                                        self.size[0],
                                        self.size[1],
                                        padding=0,
                                        stride=self.stride)
        self.max_indices = np.argmax(self.input_col, axis=0)
        self.total_count = list(range(0, self.max_indices.size))
        output = self.input_col[self.max_indices, self.total_count]
        output = output.reshape(self.out_height, self.out_width,
                                self.num_of_entries,
                                self.input_channel).transpose(2, 3, 0, 1)
        indices = self.max_indices.reshape(self.out_height, self.out_width,
                                           self.num_of_entries,
                                           self.input_channel).transpose(
                                               2, 3, 0, 1)
        if self.return_index:
            return output, indices
        else:
            return output

    def backward(self, in_gradient):
        gradient_col = np.zeros_like(self.input_col)
        gradient_flat = in_gradient.transpose(2, 3, 0, 1).ravel()
        gradient_col[self.max_indices, self.total_count] = gradient_flat
        shape = (self.num_of_entries * self.input_channel, 1,
                 self.input_height, self.input_width)
        out_gradient = col2im_indices(gradient_col, shape, self.size[0],
                                      self.size[1], padding=0,
                                      stride=self.stride).reshape(
                                          self.num_of_entries,
                                          self.input_channel,
                                          self.input_height,
                                          self.input_width)
        return out_gradient
```
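The forward pass above relies on a small fancy-indexing trick: after `im2col_indices`, every pooling window becomes one column, so a per-column `argmax` paired with a running column index selects all window maxima in one shot, and the same index pair scatters the gradients back in `backward`. A standalone NumPy sketch of just that step, using a hand-written column matrix for the $$3\times 3$$ example above (window layout assumed row-major, as im2col typically produces):

```python
import numpy as np

# Each column is one flattened 2x2 pooling window of the example matrix A
# (im2col layout: windows become columns).
input_col = np.array([[1, 2, 4, 5],
                      [2, 3, 5, 6],
                      [4, 5, 7, 8],
                      [5, 6, 8, 9]])

max_indices = np.argmax(input_col, axis=0)   # row of the max in each column
total_count = list(range(max_indices.size))  # one column index per column
output = input_col[max_indices, total_count]  # -> [5, 6, 8, 9]

# backward() reuses the same index pair to scatter gradients: only the
# positions that won the argmax receive a nonzero entry.
grad_col = np.zeros_like(input_col, dtype=float)
grad_col[max_indices, total_count] = 1.0
```

This avoids any explicit loop over windows, which is the main reason the layer reshapes everything through the im2col representation instead of pooling directly on the 4-D input.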