
A detailed explanation of PyTorch fold and unfold

2022-07-21 03:44:00 【daimashiren】

The conclusion first: conv = unfold + matmul + fold. That is, a convolution is equivalent to first unfolding the input (unfold), then performing a matrix multiplication (matmul), and finally folding the result back (fold). The process in detail is as follows:

The unfold function takes an input Tensor of shape (N, C, H, W) and unfolds it into shape (N, C * K1 * K2, Blocks), where (K1, K2) is the kernel shape and Blocks is the total number of blocks. In other words, it slides the kernel window over the input and flattens each window into a column vector, producing Blocks vectors. The number of blocks is computed as follows (assuming a dilation of 1):
$Blocks = H_{blocks} \times W_{blocks}$
where:
$H_{blocks} = \lfloor \frac {H+2*padding[0]-kernel[0]}{stride[0]} \rfloor +1$

$W_{blocks} = \lfloor \frac {W+2*padding[1]-kernel[1]}{stride[1]} \rfloor +1$
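To make the block-count formula concrete, here is a minimal sketch that evaluates it with non-default padding and stride and compares the result with the shape returned by unfold. The helper name `num_blocks` and the chosen padding and stride values are illustrative assumptions, not part of the original example, and a dilation of 1 is assumed.

```python
import torch
import torch.nn.functional as F


def num_blocks(size, kernel, padding=0, stride=1):
    # Hypothetical helper: block count along one spatial dimension (dilation = 1).
    return (size + 2 * padding - kernel) // stride + 1


inp = torch.randn(1, 3, 10, 12)
kernel, padding, stride = (4, 5), (1, 2), (2, 3)  # illustrative values

h_blocks = num_blocks(10, kernel[0], padding[0], stride[0])
w_blocks = num_blocks(12, kernel[1], padding[1], stride[1])

cols = F.unfold(inp, kernel, padding=padding, stride=stride)
# The last dimension of cols should equal h_blocks * w_blocks.
print(cols.shape, h_blocks, w_blocks)
```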

For example:

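import torch

inp = torch.randn(1, 3, 10, 12)  # input: N=1, C=3, H=10, W=12
w = torch.randn(2, 3, 4, 5)  # conv weight: 2 output channels, 3 input channels, 4x5 kernel
inp_unf = torch.nn.functional.unfold(inp, (4, 5))  # shape of inp_unf is (1, 3*4*5, 7*8)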

Here the shape of inp_unf follows from (with padding 0 and stride 1):
$H_{blocks} = \frac {10-4}{1}+1 = 7$

$W_{blocks} = \frac {12-5}{1}+1 = 8$

out_unf = inp_unf.transpose(1, 2).matmul(w.view(w.size(0), -1).t()).transpose(1, 2)
#shape of out_unf is (1,2,56)

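In terms of shapes, the code above transposes inp_unf from (1, 60, 56) to (1, 56, 60), multiplies it by the flattened and transposed weight of shape (3 * 4 * 5, 2) = (60, 2) to get (1, 56, 2), and then transposes the result so that out_unf has shape (1, 2, 56).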
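As a small sketch of the same step, the product can also be written without the transposes by multiplying the flattened weight on the left and relying on matmul broadcasting over the batch dimension. The names `out_a`, `out_b`, and `same` are introduced here only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
inp = torch.randn(1, 3, 10, 12)
w = torch.randn(2, 3, 4, 5)
inp_unf = F.unfold(inp, (4, 5))  # (1, 60, 56)

# Original formulation: transpose, multiply, transpose back.
out_a = inp_unf.transpose(1, 2).matmul(w.view(w.size(0), -1).t()).transpose(1, 2)

# Alternative: (2, 60) @ (1, 60, 56) broadcasts to (1, 2, 56).
out_b = w.view(w.size(0), -1) @ inp_unf

same = torch.allclose(out_a, out_b, atol=1e-4)
print(out_a.shape, out_b.shape, same)
```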

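With unfold + matmul done, the last step is fold. fold is roughly the reverse of unfold: it assembles the column vectors back into a spatial tensor, summing values wherever blocks overlap. Here it is called with a (1, 1) kernel, so nothing overlaps and it simply rearranges out_unf from (1, 2, 56) into (1, 2, 7, 8).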

out = torch.nn.functional.fold(out_unf, (7, 8), (1, 1))
#out.size() = (1,2,7,8)
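The following sketch illustrates two properties of fold under the shapes used in this post. With a (1, 1) kernel it behaves like a plain reshape, and with an overlapping kernel, fold applied to the output of unfold returns the input scaled by how many blocks cover each pixel rather than the input itself. The variable names (`folded`, `reshaped`, `counts`, `refolded`) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Case 1: a (1, 1) kernel has no overlap, so fold acts like a reshape.
out_unf = torch.randn(1, 2, 56)
folded = F.fold(out_unf, (7, 8), (1, 1))
reshaped = out_unf.view(1, 2, 7, 8)
print(torch.allclose(folded, reshaped))

# Case 2: with an overlapping 4x5 kernel, fold sums the overlapping values.
x = torch.randn(1, 3, 10, 12)
refolded = F.fold(F.unfold(x, (4, 5)), (10, 12), (4, 5))

# Folding an unfolded tensor of ones counts how many blocks cover each pixel.
counts = F.fold(F.unfold(torch.ones_like(x), (4, 5)), (10, 12), (4, 5))
recovered = refolded / counts
print(torch.allclose(recovered, x, atol=1e-5))
```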

The whole procedure is equivalent to calling conv2d directly, which can be checked by comparing the two results:

(torch.nn.functional.conv2d(inp, w) - out).abs().max()
#tensor(1.9073e-06)

The maximum difference between the conv2d result and the unfold + matmul + fold result is on the order of 10^-6, which is floating-point rounding error, so the two can be considered equal.
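The same check can be extended to non-default stride and padding. The function below is a minimal sketch, not a replacement for conv2d: the name `conv2d_via_unfold` is hypothetical, and it assumes integer stride and padding, a dilation of 1, no bias, and no groups.

```python
import torch
import torch.nn.functional as F


def conv2d_via_unfold(x, weight, stride=1, padding=0):
    # Hypothetical helper: convolution as unfold + matmul + fold (dilation = 1, no bias).
    _, _, h, w = x.shape
    c_out, _, k1, k2 = weight.shape
    cols = F.unfold(x, (k1, k2), padding=padding, stride=stride)  # (N, C*k1*k2, Blocks)
    out = weight.view(c_out, -1) @ cols                           # (N, c_out, Blocks)
    h_out = (h + 2 * padding - k1) // stride + 1
    w_out = (w + 2 * padding - k2) // stride + 1
    return F.fold(out, (h_out, w_out), (1, 1))                    # (N, c_out, h_out, w_out)


torch.manual_seed(0)
x = torch.randn(2, 3, 10, 12)
weight = torch.randn(2, 3, 4, 5)

ref = F.conv2d(x, weight, stride=2, padding=1)
out = conv2d_via_unfold(x, weight, stride=2, padding=1)
max_diff = (ref - out).abs().max().item()
print(ref.shape, out.shape, max_diff)
```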

Summary

Combining fold and unfold in PyTorch gives a sliding-window operation similar to Conv. If every block of the same image uses the same parameters, the parameters are shared and the result is a standard convolution layer; if each block has its own parameters, there is no parameter sharing, and the result is generally called a locally connected layer.
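As a rough sketch of the distinction drawn here, a locally connected layer can be built on top of unfold by giving each block its own weight matrix. The tensor `w_local` and its layout (Blocks, out_channels, C * K1 * K2) are assumptions made for illustration, and the sanity check confirms that sharing a single weight across all blocks recovers the ordinary convolution.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
inp = torch.randn(1, 3, 10, 12)
cols = F.unfold(inp, (4, 5))  # (1, 60, 56): 56 blocks of 60 values each

# Hypothetical per-block weights: one (2, 60) matrix for each of the 56 blocks.
w_local = torch.randn(56, 2, 60)
# n: batch, k: flattened patch, l: block index, o: output channel
out_local = torch.einsum('nkl,lok->nol', cols, w_local).reshape(1, 2, 7, 8)

# Sanity check: using the same weight for every block gives a standard convolution.
w = torch.randn(2, 3, 4, 5)
w_shared = w.view(1, 2, 60).expand(56, 2, 60)
out_shared = torch.einsum('nkl,lok->nol', cols, w_shared).reshape(1, 2, 7, 8)
max_diff = (F.conv2d(inp, w) - out_shared).abs().max().item()
print(out_local.shape, max_diff)
```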