Why Pooling?
A convolutional network increases the feature map space many times over, which makes the model harder to learn. An efficient way to reduce the feature space is required, and pooling is exactly that technique.
Pooling is usually a better choice than a larger convolution stride, because a larger stride simply skips positions and so removes a lot of information, while pooling summarises every value in its window.
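To make the difference concrete, here is a minimal sketch on a 1-D row of feature responses. The values and the names `responses`, `strided` and `max_pooled` are made up for illustration, and plain subsampling is used as a simplified stand-in for what a stride of 2 does.

```python
# Hypothetical feature responses along one row of a feature map.
responses = [0, 9, 0, 0, 8, 0]

# A stride of 2 only evaluates every second position; the rest are never seen.
strided = responses[::2]

# Max pooling with window 2 and stride 2 looks at every value
# and keeps the strongest one from each window.
max_pooled = [max(responses[i:i + 2]) for i in range(0, len(responses), 2)]

print("stride 2   :", strided)
print("max pooling:", max_pooled)
```

Both outputs have the same length, but the strong response at index 1 survives only in the pooled version.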
Different Types of Pooling Possible
- Max Pooling: In max pooling we take the maximum value inside each pooling window.
Adv: a. Parameter-free: pooling adds no learnable weights, so it does not add any over-fitting risk of its own
b. Often more accurate
Disadv: a. More expensive training, since the convolution before it runs at a lower stride
b. More hyper-parameters, i.e. pooling stride and pooling size
Fig: Max Pooling (Ref: Udacity)
- Average Pooling: Instead of using the max, we use the average of the pixels in each pooling window. It's almost equivalent to providing a lower-resolution, blurred version of the picture, since we are taking the average.
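Both pooling types can be sketched in a few lines of plain Python. This is only an illustration, not how a deep learning library implements it: the function name `pool2d`, its parameters and the 4x4 `feature_map` are assumptions made up for the example, and it uses a square window with no padding.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Pool a 2-D list of numbers with a square window and no padding."""
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect every value that falls inside the current window.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            else:  # any other mode is treated as average pooling
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Made-up 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [3, 4, 5, 6],
]

max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="avg")
print(max_pooled)  # [[6, 4], [7, 9]]
print(avg_pooled)  # [[3.75, 2.25], [4.0, 5.25]]
```

With pooling size 2 and pooling stride 2, the 4x4 map shrinks to 2x2. Max pooling keeps the strongest value in each window, while average pooling smooths the window into a single blurred value.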
Typical Convolutional Network Architecture:
A typical convolutional network architecture alternates a few convolution and pooling layers, followed by a few fully connected layers at the top. LeNet, developed in 1998, was one of the first image recognition architectures of this kind, and AlexNet, shown in the fig below, is a prize-winning image recognition architecture (it won the 2012 ImageNet challenge).
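To see how alternating convolution and pooling shrinks the feature map before the fully connected layers, the sketch below traces the spatial size through a small stack of layers. The layer sizes are an assumption: they follow the commonly cited LeNet-5 layout (32x32 grayscale input, 5x5 convolutions with 6 and 16 filters, 2x2 pooling) and are not taken from the figure. `output_size` is a hypothetical helper that applies the usual no-dilation output-size formula.

```python
def output_size(input_size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (input_size + 2 * padding - kernel) // stride + 1


# (layer kind, kernel size, stride, output channels); pooling keeps the channel count.
layers = [
    ("conv", 5, 1, 6),
    ("pool", 2, 2, None),
    ("conv", 5, 1, 16),
    ("pool", 2, 2, None),
]

size, channels = 32, 1  # assumed 32x32 single-channel input
trace = [("input", size, channels)]
for kind, kernel, stride, out_channels in layers:
    size = output_size(size, kernel, stride)
    if kind == "conv":
        channels = out_channels
    trace.append((kind, size, channels))

for kind, side, depth in trace:
    print(f"{kind:5s} -> {side}x{side}x{depth}")

# Number of values handed to the first fully connected layer.
flattened = size * size * channels
print("flattened features:", flattened)
```

Under these assumptions the 32x32 input ends up as a 5x5x16 volume, so the first fully connected layer sees only 400 values.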