Detailed explanation of UNET (with graphics and code implementation)
2022-07-22 08:13:00 【Full stack programmer webmaster】
Hello everyone, nice to meet you again. I'm your friend Quan Jun.
Convolutional neural networks are widely used for classification tasks, where the output is a single class label for the whole image. UNet, by contrast, performs pixel-level classification: the output is a category for every pixel, and pixels of different categories are rendered in different colors. UNet is often used on biomedical images, a domain where training data is typically scarce. Ciresan et al. therefore trained a convolutional network that predicts the class label of each pixel by taking a sliding window (patch) of the surrounding region as input. This network has two advantages: (1) the output can localize the target category; (2) because the training inputs are patches, this acts as data augmentation, alleviating the shortage of biomedical images.
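To make the sliding-window idea concrete, here is a minimal NumPy sketch (not from the original post; the function name and patch size are illustrative) that builds one training patch per pixel:

```python
import numpy as np

def extract_patches(image, patch_size=5):
    """Extract a (patch_size x patch_size) neighborhood around every pixel.

    The image is zero-padded so border pixels also get full patches.
    Returns an array of shape (H*W, patch_size, patch_size): one training
    sample per pixel, as in the sliding-window approach described above.
    """
    pad = patch_size // 2
    padded = np.pad(image, pad, mode="constant")
    h, w = image.shape
    patches = np.empty((h * w, patch_size, patch_size), dtype=image.dtype)
    for i in range(h):
        for j in range(w):
            patches[i * w + j] = padded[i:i + patch_size, j:j + patch_size]
    return patches

img = np.arange(36, dtype=np.float32).reshape(6, 6)
patches = extract_patches(img, patch_size=5)
print(patches.shape)  # (36, 5, 5): 36 pixels, each with a 5x5 context patch
```

Note how a 6x6 image already yields 36 heavily overlapping patches; this overlap is exactly the redundancy criticized below.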
However, this method also has two obvious drawbacks. (1) It is very slow: the network must be trained on every patch, and because patches overlap heavily there is a lot of redundancy, so the same features are trained many times, wasting resources and lengthening training. One might argue that seeing a feature many times deepens the network's "impression" of it and thus improves accuracy. But consider copying one image 50 times and training on those 50 copies: the dataset grows, yet the network overfits, i.e., it becomes familiar with the training images but may fail to recognize a new one. (2) Localization accuracy and contextual information cannot both be achieved. Large patches require more max-pooling, which reduces localization accuracy because max pooling discards the spatial relationship between a target pixel and its neighbors, while small patches see only limited local information and do not carry enough context.
UNet's main contribution is its U-shaped structure, which lets it train on fewer images while still segmenting accurately. The UNet architecture is shown in the figure below:
(1) UNet is a fully convolutional network. (2) The left half is the feature-extraction (contracting) path, built from convolution and pooling layers. (3) The right half is the feature-fusion (expanding) path: feature maps produced by upsampling are concatenated with the corresponding feature maps from the left. (Pooling layers lose image information and permanently reduce resolution; this matters for segmentation but has little impact on classification. Why upsample, then? Upsampling lets a low-resolution map containing high-level abstract features regain high resolution while retaining those features, so it can be concatenated with the high-resolution, low-level feature map from the left.) (4) Finally, after two more convolutions produce the feature map, two 1*1 convolution kernels yield the last two heatmaps, e.g. the first representing the score of the first class and the second the score of the second class. These are fed into a softmax, which gives each pixel the class with the highest probability; the loss is then computed and backpropagated.
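Step (4) can be sketched in NumPy (the heatmap values here are made up for illustration): a pixel-wise softmax over the class axis turns the two score maps into per-pixel class probabilities.

```python
import numpy as np

# Two heatmaps of per-class scores for a 4x4 image, standing in for the
# output of the final 1x1 convolutions.
rng = np.random.default_rng(0)
heatmaps = rng.normal(size=(2, 4, 4))  # (num_classes, H, W)

# Pixel-wise softmax over the class axis: each pixel gets a probability
# distribution over the two classes (max subtracted for numerical stability).
exp = np.exp(heatmaps - heatmaps.max(axis=0, keepdims=True))
probs = exp / exp.sum(axis=0, keepdims=True)

# The predicted segmentation map is the per-pixel argmax over classes.
pred = probs.argmax(axis=0)
print(pred.shape)  # (4, 4): one class label per pixel
```

During training, these per-pixel probabilities are compared against the ground-truth mask to compute the loss that is backpropagated.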
UNet model implementation (based on Keras):
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate
from keras.optimizers import Adam

# img_rows, img_cols, dice_coef and dice_coef_loss are assumed to be
# defined elsewhere in the project.
def get_unet():
    inputs = Input((img_rows, img_cols, 1))

    # Contracting (feature-extraction) path: two 3x3 convolutions per level,
    # then 2x2 max pooling, doubling the number of filters at each level.
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    # pool1 = Dropout(0.25)(pool1)
    # pool1 = BatchNormalization()(pool1)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool1)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    # pool2 = Dropout(0.5)(pool2)
    # pool2 = BatchNormalization()(pool2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    # pool3 = Dropout(0.5)(pool3)
    # pool3 = BatchNormalization()(pool3)
    conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(pool3)
    conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)
    # pool4 = Dropout(0.5)(pool4)
    # pool4 = BatchNormalization()(pool4)

    # Bottleneck.
    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same')(pool4)
    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same')(conv5)

    # Expanding (feature-fusion) path: a 2x2 transposed convolution upsamples,
    # and the result is concatenated with the matching encoder feature map.
    up6 = concatenate([Conv2DTranspose(256, (2, 2), strides=(2, 2),
                                       padding='same')(conv5), conv4], axis=3)
    # up6 = Dropout(0.5)(up6)
    # up6 = BatchNormalization()(up6)
    conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(up6)
    conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv6)
    up7 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2),
                                       padding='same')(conv6), conv3], axis=3)
    # up7 = Dropout(0.5)(up7)
    # up7 = BatchNormalization()(up7)
    conv7 = Conv2D(128, (3, 3), activation='relu', padding='same')(up7)
    conv7 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv7)
    up8 = concatenate([Conv2DTranspose(64, (2, 2), strides=(2, 2),
                                       padding='same')(conv7), conv2], axis=3)
    # up8 = Dropout(0.5)(up8)
    # up8 = BatchNormalization()(up8)
    conv8 = Conv2D(64, (3, 3), activation='relu', padding='same')(up8)
    conv8 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv8)
    up9 = concatenate([Conv2DTranspose(32, (2, 2), strides=(2, 2),
                                       padding='same')(conv8), conv1], axis=3)
    # up9 = Dropout(0.5)(up9)
    # up9 = BatchNormalization()(up9)
    conv9 = Conv2D(32, (3, 3), activation='relu', padding='same')(up9)
    conv9 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv9)
    # conv9 = Dropout(0.5)(conv9)

    # 1x1 convolution with sigmoid: one probability per pixel (binary mask).
    conv10 = Conv2D(1, (1, 1), activation='sigmoid')(conv9)

    model = Model(inputs=[inputs], outputs=[conv10])
    # Note: newer Keras versions use learning_rate= instead of lr=.
    model.compile(optimizer=Adam(lr=1e-5),
                  loss=dice_coef_loss, metrics=[dice_coef])
    return model
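The model compiles with `dice_coef_loss` and `dice_coef`, which the post does not define. A common definition, shown here as a NumPy sketch (not necessarily the exact version the author used), measures the overlap between the predicted and ground-truth masks:

```python
import numpy as np

def dice_coef(y_true, y_pred, smooth=1.0):
    """Dice coefficient: 2*|A intersect B| / (|A| + |B|), smoothed to avoid 0/0."""
    y_true_f = y_true.ravel()
    y_pred_f = y_pred.ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (np.sum(y_true_f) + np.sum(y_pred_f) + smooth)

def dice_coef_loss(y_true, y_pred):
    # Dice is 1 for perfect overlap, so minimizing (1 - dice) maximizes overlap.
    return 1.0 - dice_coef(y_true, y_pred)

mask = np.array([[1, 1], [0, 0]], dtype=np.float32)
print(dice_coef(mask, mask))       # perfect overlap -> 1.0
print(dice_coef_loss(mask, mask))  # -> 0.0
```

Inside the Keras model these would be written with `keras.backend` tensor operations rather than NumPy, but the math is identical.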
Publisher: Full Stack Programmer. Please credit the source when reprinting: https://javaforall.cn/124855.html Original link: https://javaforall.cn