PyTorch Huawei Cloud Cup garbage classification summary (object detection)
2022-07-20 08:14:00 【Visual feast】
Abstract
Our team name is chongchong. We finished 8th on the A leaderboard, 11th on the B leaderboard, and 16th in the finals. The final scoring server had problems, and it happened that at our model's speed the 200-plus extra photos made us time out again, so we were eliminated outright. Below I will share some common techniques. Data download link: https://pan.baidu.com/s/1aCh_fIVsRBjKXQkFIpdvRQ
Extraction code: 1234
baseline
The top 20 teams basically all used mmdetection, which is essential for competitions and contains many algorithms; the current 2.0 version is mostly used with the Cascade R-CNN series.
The first-place team used Res2Net as the backbone, which we did not try at the time; Res2Net gives a base score of 70-plus. We used Cascade-RFP with a ResNet50 backbone, which gives a base score of only about 67, much lower. SENet154 gives a base score of 75. Base scores vary greatly across backbones, so test them before full training and find the best baseline. SENet154 starts very high but is much harder to improve on: the fourth-place player said multi-scale testing added nothing on top of it. Many strategies stop helping once the score is already high, so pushing further up gets difficult.
So learning to use mmdetection, and being able to swap the backbone flexibly, is the key. Find the right network and you start ahead of everyone else; adding tricks on top (see the sketch below) then won't be so hard.
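As a hedged illustration of how cheap it is to test backbones in mmdetection 2.x: you only override the backbone dict of a base config. This is a minimal sketch, not our exact config; the Res2Net field names and pretrained key follow the upstream res2net configs and should be checked against your mmdet version, and the _base_ path is hypothetical.
# swap_backbone.py - minimal sketch: override only the backbone of a base config
_base_ = './cascade_rcnn_r50_fpn_1x.py'  # hypothetical path to your base config

model = dict(
    pretrained='open-mmlab://res2net101_v1d_26w_4s',  # assumed checkpoint key
    backbone=dict(
        type='Res2Net',
        depth=101,
        scales=4,
        base_width=26,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'))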
trick
The commonly used ones right now are mixup and mosaic, plus the various albumentations transforms that mmdet exposes through its Albu wrapper; these need to be combined carefully. Different tricks can interact: in my training, for example, using mixup and cutout at the same time greatly reduced the gains, while each tested individually added about 2 points. Then there are class balancing and label smoothing. Beyond the class-balancing option mmdet ships with, there is offline augmentation: expand the under-represented categories with random transforms (a sketch follows this paragraph). The first-place code used ATSS for feature extraction, combining the ATSS code to construct a new network of their own; that is only possible if you are very familiar with mmdet's code. mmdet writes every network as composable components, so learning the source code takes real effort. The final point is machine configuration: with some tricks added, the model is hard to converge, so sometimes you need weights pre-trained on the COCO dataset (we couldn't afford that), or learning-rate annealing, which generally works better, or mmdet's built-in distributed training.
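Here is a minimal sketch of the offline class-balancing idea: write randomly jittered copies of rare-class images back into the training set. The rare_ids.txt list, the paths, and the choice of photometric-only transforms (so the VOC boxes stay valid without editing XML coordinates) are all assumptions of this sketch.
# offline_balance.py - sketch of offline class-balance augmentation
# (assumed layout: VOC JPEGImages/ plus a rare_ids.txt listing rare-class image ids)
import random
from pathlib import Path
from PIL import Image, ImageEnhance

IMG_DIR = Path('VOCdevkit/VOC2007/JPEGImages')   # assumed path
OUT_SUFFIX = '_aug'                              # new ids: <id>_aug1.jpg, ...

def random_photometric(img: Image.Image) -> Image.Image:
    """Photometric-only jitter: boxes in the copied XML stay valid unchanged."""
    img = ImageEnhance.Brightness(img).enhance(random.uniform(0.7, 1.3))
    img = ImageEnhance.Contrast(img).enhance(random.uniform(0.7, 1.3))
    img = ImageEnhance.Color(img).enhance(random.uniform(0.7, 1.3))
    return img

for img_id in Path('rare_ids.txt').read_text().split():
    src = IMG_DIR / f'{img_id}.jpg'
    for k in range(1, 3):                        # 2 extra copies per rare image
        aug = random_photometric(Image.open(src).convert('RGB'))
        aug.save(IMG_DIR / f'{img_id}{OUT_SUFFIX}{k}.jpg')
        # also copy Annotations/<img_id>.xml to the new id and append the new
        # id to ImageSets/Main/train.txt (omitted here for brevity)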
Part of the code
Model trimming: the uploaded file only needs the parameter information
import torch

# load the full training checkpoint
yuan = torch.load('./rs_cut_mix_pafpn_box.pth')
# keep only the metadata and the model weights; drop optimizer state etc.
new = {'meta': yuan['meta'], 'state_dict': yuan['state_dict']}
torch.save(new, './rs_cu4_4_min.pth')
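For SGD with momentum the optimizer state is roughly the same size as the model weights, so trimming it can roughly halve the checkpoint, which helps when the upload size is limited.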
coco_detection
# dataset settings
dataset_type = 'VOCDataset'
data_root = '/home/jmy/hjc/code/rubbish_classification/datasets/VOCdevkit/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
# TODO: augmentation from github
albu_train_transforms = [
dict(
type='MotionBlur',
blur_limit=(3, 7),
p=0.2),
#add two albu
dict(
type='ShiftScaleRotate',
shift_limit=0.0625,
scale_limit=0.0,
rotate_limit=[-10, 10],
interpolation=1,
p=0.5),
]
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
# TODO: augmentation from github
dict(type='Mixup', prob=0.4, lambd=0.5, mixup=True,
trainval_path='/home/jmy/hjc/code/rubbish_classification/mmdetection/augmentation_zx/ap_05_id_in_all.txt',
#trainval_path='/home/jmy/hjc/code/rubbish_classification/mmdetection/augmentation_zx/ap_05_id_in_trainset.txt',
img_path='/home/jmy/hjc/code/rubbish_classification/datasets/VOCdevkit/VOC2007/JPEGImages',
annotation_path='/home/jmy/hjc/code/rubbish_classification/datasets/VOCdevkit/VOC2007/Annotations'),
#dict(type='Resize', img_scale=(800, 480), keep_ratio=True),
dict(type='Resize', img_scale=[(800, 600), (800, 360)], keep_ratio=True, multiscale_mode='range'),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
# TODO: augmentation from github
dict(
type='Albu',
transforms=albu_train_transforms,
bbox_params=dict(
type='BboxParams',
format='pascal_voc',
label_fields=['gt_labels'],
min_visibility=0.0,
filter_lost_elements=True),
keymap={
'img': 'image',
'gt_masks': 'masks',
'gt_bboxes': 'bboxes'
},
update_pad_shape=False,
skip_img_without_anno=True),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
#img_scale=(1000, 600),
img_scale=(800, 480),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
#The default learning rate in config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16)
#e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu.
# samples_per_gpu=2,
# workers_per_gpu=2,
samples_per_gpu=4,
workers_per_gpu=0,
train=dict(
type='RepeatDataset',
#hjc: the stock VOC config repeats the dataset 3 times during training
#times=3,
times=1,
dataset=dict(
type=dataset_type,
# ann_file=[
# data_root + 'VOC2007/ImageSets/Main/trainval.txt',
# #data_root + 'VOC2012/ImageSets/Main/trainval.txt'
# ],
ann_file=[
data_root + 'VOC2007/ImageSets/Main/train.txt'
#'/home/jmy/hjc/code/rubbish_classification/mmdetection/data/lowap2000_grid/VOCdevkit/VOC2007/ImageSets/Main/train.txt'
],
img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'],
#img_prefix=[data_root + 'VOC2007/', '/home/jmy/hjc/code/rubbish_classification/mmdetection/data/lowap2000_grid/VOCdevkit/VOC2007'],
pipeline=train_pipeline)),
val=dict(
type=dataset_type,
# ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
ann_file=data_root + 'VOC2007/ImageSets/Main/val.txt',
img_prefix=data_root + 'VOC2007/',
pipeline=test_pipeline),
test=dict(
type=dataset_type,
#ann_file=data_root + 'VOC2007/ImageSets/Main/trainval.txt',
ann_file=data_root + 'VOC2007/ImageSets/Main/val.txt',
img_prefix=data_root + 'VOC2007/',
pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')
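Note that the Mixup step in train_pipeline above is not a built-in mmdet 2.0 transform; it has to be written and registered as a custom pipeline component. Below is a minimal sketch of the registration pattern with a simplified pixel-level blend: the real competition code also merges the partner image's boxes and labels from its annotation XML, which is omitted here, and the constructor arguments simply mirror the config above.
# mixup_pipeline.py - sketch of a custom mmdet 2.x pipeline step
import os.path as osp
import random

import mmcv
import numpy as np
from mmdet.datasets.builder import PIPELINES


@PIPELINES.register_module()
class Mixup:
    def __init__(self, prob=0.4, lambd=0.5, mixup=True,
                 trainval_path=None, img_path=None, annotation_path=None):
        self.prob = prob
        self.lambd = lambd
        self.mixup = mixup
        self.img_path = img_path
        self.annotation_path = annotation_path
        # candidate partner image ids to mix with
        with open(trainval_path) as f:
            self.ids = [line.strip() for line in f if line.strip()]

    def __call__(self, results):
        if not self.mixup or random.random() > self.prob:
            return results
        partner = mmcv.imread(
            osp.join(self.img_path, random.choice(self.ids) + '.jpg'))
        h, w = results['img'].shape[:2]
        partner = mmcv.imresize(partner, (w, h))
        # pixel-level blend; merging the partner's boxes/labels is omitted
        results['img'] = (self.lambd * results['img'] +
                          (1 - self.lambd) * partner).astype(np.uint8)
        return results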
cascade_rcnn_r50_fpn
# model settings
model = dict(
type='CascadeRCNN',
pretrained='torchvision://resnet50',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch'),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
#scales=[8],
scales=[7],
ratios=[0.5, 1.0, 2.0],
#ratios=[0.2, 0.4, 0.5, 0.6, 0.75, 17/20, 1.0, 20/17, 4/3, 5/3, 2.0, 2.5, 5.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
roi_head=dict(
type='CascadeRoIHead',
num_stages=3,
stage_loss_weights=[1, 0.5, 0.25],
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', out_size=7, sample_num=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=[
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
#num_classes=80,
#mmdet v2.0 does not need the N + 1 class count, just N
num_classes=44,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
#num_classes=80,
num_classes=44,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.05, 0.05, 0.1, 0.1]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
loss_weight=1.0)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
#num_classes=80,
num_classes=44,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.033, 0.033, 0.067, 0.067]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
]))
# model training and testing settings
train_cfg = dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=0,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_across_levels=False,
nms_pre=2000,
nms_post=2000,
max_num=2000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=[
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
match_low_quality=False,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
match_low_quality=False,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.7,
min_pos_iou=0.7,
match_low_quality=False,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
pos_weight=-1,
debug=False)
])
test_cfg = dict(
rpn=dict(
nms_across_levels=False,
nms_pre=1000,
nms_post=1000,
max_num=1000,
nms_thr=0.7,
min_bbox_size=0),
rcnn=dict(
score_thr=0.05, nms=dict(type='nms', iou_thr=0.5), max_per_img=100))
#score_thr=0.001, nms=dict(type='nms', iou_thr=0.5), max_per_img=100))
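#note: a lower score_thr (e.g. the commented 0.001) keeps more low-confidence
#boxes, which can raise mAP but makes NMS and post-processing slower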
schedule_lr
# optimizer
#optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
##note: linear scaling rule, lr = 0.00125 * total batch size;
##with samples_per_gpu=4 on a single GPU, batch size 4 gives 0.00125 * 4 = 0.005
optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001)
# optimizer_config = dict(grad_clip=None)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
#learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
#warmup_iters=4000,
warmup_ratio=0.001,
#step=[8, 11])
step=[9, 12])
total_epochs = 14
default_runtime
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
#interval=50,
interval=50,
hooks=[
dict(type='TextLoggerHook'),
# dict(type='TensorboardLoggerHook')
])
# yapf:enable
dist_params = dict(backend='nccl')
log_level = 'INFO'
#load_from = None
#load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco_20200316-3dc56deb.pth'
#load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_r50_coco_pretrained_weights_classes_45.pth'
#load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_r101_coco_pretrained_weights_classes_45.pth'
load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_x101_32x4d_coco_pretrained_weights_classes_45.pth'
resume_from = None
workflow = [('train', 1)]
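The load_from checkpoints above whose names end in classes_45 are the public COCO weights with their classification heads cut down to this competition's class count (44 foreground classes + 1 background = 45 outputs). Below is a hedged sketch of that conversion, assuming the standard mmdet 2.x cascade head key layout (roi_head.bbox_head.{0,1,2}.fc_cls) with background as the last channel; the regression heads are class-agnostic in this config, so they need no change.
# convert_coco_ckpt.py - sketch: shrink COCO cascade_rcnn heads to num_classes=44
import torch

NUM_CLASSES = 44  # foreground classes in this competition

ckpt = torch.load('cascade_rcnn_r50_fpn_1x_coco_20200316-3dc56deb.pth',
                  map_location='cpu')
state = ckpt['state_dict']
for key in list(state.keys()):
    # cascade heads: roi_head.bbox_head.{0,1,2}.fc_cls.{weight,bias}
    if 'fc_cls' in key:
        # keep the first NUM_CLASSES foreground rows plus the background row
        state[key] = torch.cat(
            [state[key][:NUM_CLASSES], state[key][-1:]], dim=0)
torch.save({'meta': ckpt.get('meta', {}), 'state_dict': state},
           'cascade_rcnn_r50_coco_pretrained_weights_classes_45.pth')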
summary
This is our early-stage code, meant mainly as a starting point for newcomers; the later strategies and data augmentation were code we wrote into mmdet ourselves. The most important things in the competition are good enough equipment and mastering most of mmdet's source code. For our team, 10 cards was about enough; at the top, some teams had 64 cards, or 8 V100s with 32 GB of memory each, enough to train their own COCO pre-training weights, with one model taking three days. More importantly, learn to modify the code yourself and make use of the components mmdet provides; you can look through all the files under configs yourself.