Max Pooling Layer
The pooling layer is another important component of convolutional neural networks. There are many ways to perform pooling, such as max pooling and average pooling. In this part, we will only discuss the max-pooling layer, as it is the most commonly used in convolutional neural networks.
In the max-pooling layer, we also have a spatially small sliding window called the kernel. Within the window, only the largest value is retained; all other values are dropped. For example, assume we have the input

\[A = \begin{pmatrix} 1 & 3 & 2 & 4 \\ 5 & 6 & 7 & 8 \\ 3 & 2 & 1 & 0 \\ 1 & 2 & 3 & 4 \end{pmatrix}\]

and a \(2\times 2\) max-pooling kernel with stride \(2\). Then the output \(C\) will be

\[C = \begin{pmatrix} 6 & 8 \\ 3 & 4 \end{pmatrix}\]
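As a quick check of the example above, the following NumPy sketch (written here for illustration, not part of tinyml) reshapes the input into non-overlapping \(2\times 2\) blocks and takes the maximum of each block:

```python
import numpy as np

A = np.array([[1, 3, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])

# Split A into non-overlapping 2x2 blocks and take the max of each block.
C = A.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(C)  # [[6 8]
          #  [3 4]]
```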
With the given kernel width \(K_w\), kernel height \(K_h\), and stride \(s\), we can formalize the max-pooling process as

\[C_{i,j} = \max_{0 \le m < K_h,\ 0 \le n < K_w} A_{i \cdot s + m,\, j \cdot s + n}\]
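The formula translates directly into a naive double loop. The helper below is a hypothetical reference implementation (single-channel input, no padding), not part of tinyml:

```python
import numpy as np

def max_pool2d_naive(A, K_h, K_w, stride):
    """Reference max pooling over a 2-D array, following the formula above."""
    out_h = (A.shape[0] - K_h) // stride + 1
    out_w = (A.shape[1] - K_w) // stride + 1
    C = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Take the maximum inside the K_h x K_w window.
            C[i, j] = A[i * stride:i * stride + K_h,
                        j * stride:j * stride + K_w].max()
    return C
```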
Since only the maximum value in each window contributes to the output, the gradient flows back only to that position. Hence we can compute the derivative as below:

\[\frac{\partial C_{i,j}}{\partial A_{p,q}} = \begin{cases} 1 & \text{if } A_{p,q} \text{ is the maximum of the window producing } C_{i,j}, \\ 0 & \text{otherwise.} \end{cases}\]
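In other words, each incoming gradient is routed to the argmax position of its window, and every other position receives zero. A minimal sketch of this routing, matching the hypothetical `max_pool2d_naive` helper above:

```python
import numpy as np

def max_pool2d_backward_naive(A, in_gradient, K_h, K_w, stride):
    """Route each output gradient back to the argmax position of its window."""
    out_gradient = np.zeros_like(A, dtype=float)
    for i in range(in_gradient.shape[0]):
        for j in range(in_gradient.shape[1]):
            window = A[i * stride:i * stride + K_h,
                       j * stride:j * stride + K_w]
            # Index of the maximum inside the window (first one on ties).
            m, n = np.unravel_index(np.argmax(window), window.shape)
            out_gradient[i * stride + m, j * stride + n] += in_gradient[i, j]
    return out_gradient
```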
The implementation of the max-pooling layer in tinyml is shown below:
from tinyml.core import Backend as np
from .base import Layer
from .convolution import col2im_indices, im2col_indices

class MaxPool2D(Layer):
    '''
    Perform max pooling, i.e. select the maximum item in a sliding window.
    '''
    def __init__(self, name, input_dim, size, stride, return_index=False):
        super().__init__(name)
        self.type = 'MaxPool2D'
        self.input_channel, self.input_height, self.input_width = input_dim
        self.size = size
        self.stride = stride
        self.return_index = return_index
        # Output size is (input - kernel) / stride + 1; it must be an integer.
        self.out_height = (self.input_height - size[0]) / stride + 1
        self.out_width = (self.input_width - size[1]) / stride + 1
        if not self.out_height.is_integer() or not self.out_width.is_integer():
            raise Exception("[tinyml] Invalid dimension settings!")
        self.out_width = int(self.out_width)
        self.out_height = int(self.out_height)
        self.out_dim = (self.input_channel, self.out_height, self.out_width)

    def forward(self, input):
        self.num_of_entries = input.shape[0]
        # Treat every channel as a separate single-channel image, so that
        # im2col lays out each pooling window as one column.
        input_reshaped = input.reshape(input.shape[0] * input.shape[1], 1,
                                       input.shape[2], input.shape[3])
        self.input_col = im2col_indices(input_reshaped,
                                        self.size[0],
                                        self.size[1],
                                        padding=0,
                                        stride=self.stride)
        # Take the maximum of every column and remember where it came from,
        # since backward() needs to route gradients to these positions.
        self.max_indices = np.argmax(self.input_col, axis=0)
        self.total_count = list(range(0, self.max_indices.size))
        output = self.input_col[self.max_indices, self.total_count]
        output = output.reshape(self.out_height, self.out_width,
                                self.num_of_entries,
                                self.input_channel).transpose(2, 3, 0, 1)
        indices = self.max_indices.reshape(self.out_height, self.out_width,
                                           self.num_of_entries,
                                           self.input_channel).transpose(
                                               2, 3, 0, 1)
        if self.return_index:
            return output, indices
        else:
            return output

    def backward(self, in_gradient):
        # Scatter each output gradient back to the argmax position of its
        # window; every other position receives zero gradient.
        gradient_col = np.zeros_like(self.input_col)
        gradient_flat = in_gradient.transpose(2, 3, 0, 1).ravel()
        gradient_col[self.max_indices, self.total_count] = gradient_flat
        shape = (self.num_of_entries * self.input_channel, 1,
                 self.input_height, self.input_width)
        out_gradient = col2im_indices(gradient_col,
                                      shape,
                                      self.size[0],
                                      self.size[1],
                                      padding=0,
                                      stride=self.stride).reshape(
                                          self.num_of_entries,
                                          self.input_channel,
                                          self.input_height, self.input_width)
        return out_gradient
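A short usage sketch, assuming the NumPy backend (the layer name, shapes, and random input below are only for illustration):

```python
import numpy as np

# 2x2 max pooling with stride 2 over 8x8 single-channel feature maps.
pool = MaxPool2D('pool', input_dim=(1, 8, 8), size=(2, 2), stride=2)

x = np.random.randn(4, 1, 8, 8)      # batch of 4 inputs
y = pool.forward(x)                  # shape: (4, 1, 4, 4)
dx = pool.backward(np.ones_like(y))  # shape: (4, 1, 8, 8)
```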