Processing imagenet2012 datasets
2022-07-21 13:01:00 【Fu_ Xingwen】
The downloaded ImageNet 2012 dataset consists of two archives:
ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar.
First, extract both archives:
mkdir train
mkdir val
tar xvf ILSVRC2012_img_train.tar -C ./train
tar xvf ILSVRC2012_img_val.tar -C ./val
For the train archive, extraction yields another 1000 tar archives (one per category), each of which must be extracted again:
import os
import tarfile

def un_tar(file_name):
    """Extract a tar archive into a directory named after the archive."""
    # Strip only the extension, so dots elsewhere in the path are preserved
    target_dir = os.path.splitext(file_name)[0]
    print(target_dir)
    if os.path.isdir(target_dir):
        print('Directory already exists')
    else:
        # Each archive holds many files, so create a directory with the same name first
        os.mkdir(target_dir)
        print('Created new directory')
    with tarfile.open(file_name) as tar:
        tar.extractall(target_dir)
    return target_dir

filePath = 'train'
for name in os.listdir(filePath):
    # Skip anything that is not a tar archive (e.g. directories from an earlier run)
    if not name.endswith('.tar'):
        continue
    archive = os.path.join(filePath, name)
    un_tar(archive)
    os.remove(archive)
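Once the per-class archives are extracted, a quick count can confirm that nothing was skipped. The following is a minimal sketch and not part of the original procedure: `summarize_split` is a hypothetical helper name, and the expected totals (1000 class folders and 1,281,167 training images) are the commonly cited ILSVRC2012 figures, so treat a mismatch as a hint to re-check rather than as proof of corruption.

```python
import os

def summarize_split(root):
    """Return (number of class folders, total number of files inside them)."""
    # Only subdirectories count as classes; stray files in root are ignored
    class_dirs = [entry.path for entry in os.scandir(root) if entry.is_dir()]
    n_files = 0
    for class_dir in class_dirs:
        n_files += sum(1 for entry in os.scandir(class_dir) if entry.is_file())
    return len(class_dirs), n_files

if __name__ == '__main__':
    # 'train' is the directory created in the steps above
    if os.path.isdir('train'):
        n_classes, n_images = summarize_split('train')
        print('classes: %d, images: %d' % (n_classes, n_images))
```

The same helper can be pointed at the val directory after the validation images have been sorted into class folders.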
Organize the validation set (requires ILSVRC2012_devkit_t12.tar.gz, extracted to ./ILSVRC2012_devkit_t12)
import os
import shutil
from scipy import io

def move_valimg(val_dir='./val', devkit_dir='./ILSVRC2012_devkit_t12'):
    """
    Move validation images into per-class folders.
    val_id (starts from 1) -> ILSVRC_ID (starts from 1) -> WNID
    Resulting layout:
        /val
            /n01440764
                images
            /n01443537
                images
            .....
    """
    # Load the synset table, the validation ground truth and the list of validation images
    synset = io.loadmat(os.path.join(devkit_dir, 'data', 'meta.mat'))
    gt_path = os.path.join(devkit_dir, 'data', 'ILSVRC2012_validation_ground_truth.txt')
    with open(gt_path) as ground_truth:
        labels = [int(line) for line in ground_truth if line.strip()]
    root, _, filenames = next(os.walk(val_dir))
    for filename in filenames:
        # val image name -> ILSVRC ID -> WNID
        val_id = int(filename.split('.')[0].split('_')[-1])
        ILSVRC_ID = labels[val_id - 1]
        WNID = synset['synsets'][ILSVRC_ID - 1][0][1][0]
        print("val_id:%d, ILSVRC_ID:%d, WNID:%s" % (val_id, ILSVRC_ID, WNID))
        # Move the image into the folder for its class
        output_dir = os.path.join(root, WNID)
        os.makedirs(output_dir, exist_ok=True)
        shutil.move(os.path.join(root, filename), os.path.join(output_dir, filename))

if __name__ == '__main__':
    move_valimg()
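The two 1-based lookups in the script above are easy to get off by one, so here is a small sketch of the chain in isolation. It assumes validation files are named like ILSVRC2012_val_00000001.JPEG; the helper names `val_id_from_filename` and `wnid_for` are hypothetical, and the label list and ID table are toy values for illustration only, not the real ground truth or the real meta.mat ordering.

```python
def val_id_from_filename(filename):
    """Parse the 1-based validation index from a name like ILSVRC2012_val_00000001.JPEG."""
    stem = filename.rsplit('.', 1)[0]
    return int(stem.split('_')[-1])

def wnid_for(filename, labels, id_to_wnid):
    """Follow val_id -> ILSVRC_ID -> WordNet ID; both ids start from 1."""
    val_id = val_id_from_filename(filename)
    ilsvrc_id = labels[val_id - 1]  # line N of the ground truth file belongs to image N
    return id_to_wnid[ilsvrc_id]

# Toy data (assumption): NOT the real ground truth or the real meta.mat order
toy_labels = [2, 1, 2]
toy_id_to_wnid = {1: 'n01440764', 2: 'n01443537'}

for name in ('ILSVRC2012_val_00000001.JPEG', 'ILSVRC2012_val_00000002.JPEG'):
    print(name, '->', wnid_for(name, toy_labels, toy_id_to_wnid))
```

In the real script the table comes from meta.mat rather than a dict, but the indexing is the same: subtract 1 when indexing the label list by val_id and the synset array by ILSVRC_ID.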