Deep learning notes -- some little-known facts about the convolution layer and pooling layer
2022-07-21 03:43:00 【daimashiren】
One. Definition and calculation of the convolution layer
1. Definition of the convolution layer
Convolution layer = receptive field + weight sharing
Each receptive field is covered by multiple filters, and the parameters of each filter are different (generally they are randomly initialized and then learned autonomously by the neural network). Each filter is responsible for detecting a certain feature (pattern) within the receptive field. Since the same feature may recur in other parts of the image, different regions of the same image are detected with the same group of filters. This is weight sharing (parameter sharing), which enables efficient reuse of filters and reduces the number of parameters.
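To make the parameter saving from weight sharing concrete, here is a minimal sketch (plain Python, illustrative only) comparing the parameter count of the 64-filter, 3×3-kernel convolution layer from the example below with a fully connected layer mapping the same input to an output of the same size; the helper names `conv_params` and `fc_params` are hypothetical, not from any framework.

```python
# Illustrative parameter counting, assuming the (3, 6, 6) input and
# 64 filters of shape (3, 3, 3) used as the running example in the text.

def conv_params(in_ch, out_ch, k, bias=True):
    """Convolution weights are shared across spatial positions, so the
    count is independent of the input's height and width."""
    return out_ch * in_ch * k * k + (out_ch if bias else 0)

def fc_params(n_in, n_out, bias=True):
    """A fully connected layer needs one weight per input-output pair."""
    return n_in * n_out + (n_out if bias else 0)

print(conv_params(3, 64, 3))             # 64*3*3*3 + 64 = 1792
print(fc_params(3 * 6 * 6, 64 * 2 * 2))  # 108*256 + 256 = 27904
```

Even on this tiny 6×6 input, the shared-weight convolution uses roughly 15× fewer parameters than a dense mapping to the same output size, and the gap grows with image resolution because the convolution's count does not depend on H and W.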
2. Calculation of the convolution layer
Suppose the input is a (3, 6, 6) image, i.e. a color image with channel = 3 and H × W = 6 × 6, and we convolve it with a group of 64 filters of shape (3, 3, 3), i.e. channel = 3, kernel size = 3 × 3, with padding = 0 and stride = 2. The shape of the output feature map is determined by the following formula:

H_out = ⌊(H_i − H_k + 2P) / S + 1⌋

where H_i and H_k denote the heights of the input image and the convolution kernel respectively (replacing heights with widths gives the formula for the output width), S is the stride, and P is the padding (if any). Substituting the example above:

H_out = (6 − 3 + 2 × 0) / 2 + 1 = 2.5 → 2

Here we deliberately chose numbers that do not divide evenly, because many blogs only show examples where the result is an integer, while in practice it often is not. Looking at the PyTorch source code, this case is handled by rounding down (floor), i.e. 2.5 → 2; other frameworks should behave similarly. Since there are 64 different convolution kernels, the output after convolution has shape (64, 2, 2).
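The formula above, including the floor rounding for non-integer results, can be sketched as a one-line helper (the function name `conv_out_size` is ours, not a framework API):

```python
def conv_out_size(h_in, k, stride=1, padding=0):
    # floor((H_in - K + 2P) / S) + 1 -- integer floor division
    # reproduces the "round down" behavior described in the text.
    return (h_in - k + 2 * padding) // stride + 1

# The running example: 6x6 input, 3x3 kernel, padding 0, stride 2.
print(conv_out_size(6, 3, stride=2, padding=0))  # 2  (2.5 rounded down)
```

With stride 1 the same input would instead yield a 4×4 feature map, since (6 − 3 + 0) / 1 + 1 = 4 exactly.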
In addition, note the following. Everyone knows that a convolution is the elementwise product of the kernel with a region of the image (the receptive field), multiplying corresponding elements and then summing. But did you know that the number of channels of each convolution kernel must match the number of channels of the input image? There is an easily overlooked but very important point here: each kernel's convolution over a region of the image is carried out across the full depth of the image at once. In the example above, when a region of the three-channel color image is convolved with a kernel, the image information on all three channels is multiplied elementwise with the three-channel kernel simultaneously, and the three per-channel results are then summed to give the output of that single kernel. With 64 kernels in total, the output has 64 channels. In other words, the number of output channels of a convolution layer depends only on the number of convolution kernels and is independent of the number of channels of the input image.
(Note: the parameters on the three channels of the same convolution kernel are generally different.)
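The channel-summing behavior described above can be made explicit with a naive loop implementation (a pedagogical sketch with nested lists, not how any framework actually computes convolutions):

```python
# Naive multi-channel "valid" convolution: for each kernel, the
# elementwise products on ALL input channels are summed into a single
# number per spatial position, so each kernel produces exactly one
# output channel no matter how many channels the input has.

def conv2d_naive(image, kernels, stride=1):
    """image: [C][H][W] nested lists; kernels: [N][C][k][k] nested lists."""
    C, H, W = len(image), len(image[0]), len(image[0][0])
    k = len(kernels[0][0])
    h_out = (H - k) // stride + 1
    w_out = (W - k) // stride + 1
    out = []
    for ker in kernels:                      # one output channel per kernel
        fmap = [[0.0] * w_out for _ in range(h_out)]
        for i in range(h_out):
            for j in range(w_out):
                s = 0.0
                for c in range(C):           # sum over input channels
                    for di in range(k):
                        for dj in range(k):
                            s += image[c][i*stride+di][j*stride+dj] * ker[c][di][dj]
                fmap[i][j] = s
        out.append(fmap)
    return out
```

Feeding it a 3-channel 6×6 image and 64 kernels of shape (3, 3, 3) with stride 2 yields 64 feature maps of size 2×2, matching the (64, 2, 2) result derived above.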
Two. Pooling layer
Everyone knows that the pooling layer is used to reduce the spatial dimensions and, consequently, the number of parameters in subsequent layers. But one point that is easy to overlook is a big difference between the pooling layer and the convolution layer: the pooling layer has no parameters to learn. Moreover, the pooling layer does not change the number of channels of its input (the feature map produced by a convolution layer can, in a sense, also be regarded as an image), so when defining a pooling layer the channel count is omitted and only the kernel size and stride are specified. Accordingly, in PyTorch and other frameworks the pooling operation is applied directly to each channel of the input independently, producing an output with the same number of channels. The formula for the output size of the pooling layer is the same as that of the convolution layer.
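A minimal max-pooling sketch (again plain Python, the helper name is ours) makes both points visible: there is nothing to learn, and the loop over channels simply copies the channel count through to the output:

```python
# Naive 2D max pooling: no learnable parameters, applied to each
# channel independently, so output channels == input channels.

def max_pool2d_naive(image, k=2, stride=2):
    """image: [C][H][W] nested lists; returns [C][h_out][w_out]."""
    C, H, W = len(image), len(image[0]), len(image[0][0])
    h_out = (H - k) // stride + 1   # same size formula as convolution
    w_out = (W - k) // stride + 1
    out = []
    for c in range(C):              # one output channel per input channel
        fmap = [[max(image[c][i*stride+di][j*stride+dj]
                     for di in range(k) for dj in range(k))
                 for j in range(w_out)]
                for i in range(h_out)]
        out.append(fmap)
    return out
```

Note that the size computation on the `h_out`/`w_out` lines is literally the convolution formula from section One, as the text states.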
References
(Strongly recommended) Li Hongyi's 2021 Spring Machine Learning course, bilibili