Downloading data by date range and interval with a shell script
2022-07-20 22:44:00 【lepton126】
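Background: a shell script and curl are used to download data from a given URL. The date range in this article is 2022-03-01 to 2022-06-01, and one file is written per target IP for each three-day window. The code is as follows: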
#!/bin/bash
mycurl="curl --insecure -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' -H 'Accept-Encoding: gzip, deflate, br' -H 'Accept-Language: zh-CN,zh;q=0.9' -H 'Cache-Control: max-age=0' -H 'Connection: keep-alive' -H 'Cookie: TH_AUTH_ONLINE=MTY1Nzg3MzIzOHxOd3dBTkZwS04wbGFTRmxWU0ZOYVExRklXVVpKUlVoQlFWTlRWelkxVWtSYVJsSlFSa1ZXTWs5U1RsWktVVk5RTTFwTVYwMURORUU9fC5OMQT5IFr5nL7LOBNyw4HwIhsXKJRI9Z_6Uzrr6abT' -H 'Host: ***.**' -H 'Sec-Fetch-Dest: document' -H 'Sec-Fetch-Mode: navigate' -H 'Sec-Fetch-Site: none' -H 'Sec-Fetch-User: ?1' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'"
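startDate=20220301
endDate=20220601
# Convert both bounds to epoch seconds (requires GNU date for -d)
startSec=$(date -d "$startDate" +"%s")
endSec=$(date -d "$endDate" +"%s")
j=0
# Store every third day (259200 s = 3 * 86400 s) as a window boundary
for ((i = startSec; i <= endSec; i += 259200))
do
    stepday=$(date -d "@$i" "+%Y-%m-%d")
    datearray[$j]=$stepday
    let j=$j+1
done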
#echo ${datearray[@]}
ipprefix='192.168.103.'
# n boundary dates delimit n-1 windows, so m runs from 0 to n-2
for ((m = 0; m < ${#datearray[@]} - 1; m += 1))
do
    start_time=${datearray[$m]}
    end_time=${datearray[$m+1]}
    # start_time2 is the day after start_time
    start_time_sec=$(date -d "$start_time" +%s)
    let start_time2_sec=start_time_sec+86400
    start_time2=$(date -d "@$start_time2_sec" "+%Y-%m-%d")
    #echo "start_time:$start_time start_time2:$start_time2 end_time:$end_time"
    for ipnum in $(seq 160 175)
    do
        ipstr=${ipprefix}${ipnum}
        echo "iptarget:${ipstr}"
        # The first window starts on startDate itself; every later window starts
        # one day after the previous window's end date
        if [ "$m" -eq 0 ]
        then
            query_start=$start_time
        else
            query_start=$start_time2
        fi
        outfile="${ipnum}_${query_start}_${end_time}_page1.json"
        echo "curl is running,iptarget:${ipstr},start_time:${query_start},end_time:${end_time}"
        curlcmd="${mycurl} -o ${outfile} \"https://wuji.su/api/bigdata/v1/ip/process?page=0&size=10000&start_time=${query_start}&end_time=${end_time}&condhash=&ip=${ipstr}\""
        echo "$curlcmd"
        # mycurl contains embedded quotes, so the command string is handed to sh for parsing
        echo "$curlcmd" | sh
        # Repeat the request every 5 seconds until the response contains "load_all":true
        until grep -q '"load_all":true' "$outfile" 2>/dev/null
        do
            echo "curl is running"
            echo "$curlcmd" | sh
            sleep 5s
        done
    done
done
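The retry loop in the script repeats the request until the response contains "load_all":true, with no upper limit on the number of attempts. If the server never returns a complete response, the script hangs on that file. A bounded variant might look like the sketch below. The names fetch_until_complete and fake_fetch are hypothetical and do not appear in the original script; fake_fetch only stands in for curl so that the example runs without network access.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): run a fetch command
# until the output file contains "load_all":true, or give up after
# max_tries attempts.
fetch_until_complete() {
    local outfile=$1 max_tries=$2 delay=$3
    shift 3                      # the remaining arguments are the command to run
    local try
    for ((try = 1; try <= max_tries; try++)); do
        "$@"                     # run the fetch command, which writes to $outfile
        if grep -q '"load_all":true' "$outfile" 2>/dev/null; then
            return 0
        fi
        echo "attempt $try incomplete, retrying in ${delay}s" >&2
        sleep "$delay"
    done
    return 1
}

# Demo without network access: fake_fetch stands in for curl and only
# writes a complete response on its third call.
calls=0
fake_fetch() {
    calls=$((calls + 1))
    if (( calls >= 3 )); then
        echo '{"load_all":true}' > "$1"
    else
        echo '{"load_all":false}' > "$1"
    fi
}

tmpfile=$(mktemp)
if fetch_until_complete "$tmpfile" 5 0 fake_fetch "$tmpfile"; then
    echo "complete after $calls calls"
else
    echo "gave up after $calls calls"
fi
rm -f "$tmpfile"
```

In the download script, the command passed after the delay would be the assembled curl call. Because mycurl is a single string with embedded quotes, it would have to be passed as sh -c "$curlcmd" rather than as separate words.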
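The first half of the download script writes the dates into an array, which is then used to determine the start and end date of each request. In addition, consider these lines: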
start_time=${datearray[$m]}
end_time=${datearray[$m+1]}
start_time_sec=`date -d $start_time +%s`
let start_time2_sec=start_time_sec+86400
start_time2=`date -d "@$start_time2_sec" "+%Y-%m-%d"`
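This snippet takes two adjacent array entries as the bounds of one window (three days apart in this article) and computes start_time2, the day after start_time, which is used as the query start date for every window except the first.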
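To see which date ranges this logic produces, the window computation can be isolated in a small function. This is only a sketch under a few assumptions: date_windows is a hypothetical helper that is not part of the original script, it requires GNU date, and it uses UTC (-u) so that every day is exactly 86400 seconds long.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print one
# "query_start end" pair per window. Requires GNU date for -d.
# Usage: date_windows START_DATE END_DATE STEP_DAYS
date_windows() {
    local start_sec end_sec step s query_start first=1
    start_sec=$(date -u -d "$1" +%s)   # -u keeps every day exactly 86400 s long
    end_sec=$(date -u -d "$2" +%s)
    step=$(( $3 * 86400 ))
    for ((s = start_sec; s + step <= end_sec; s += step)); do
        if (( first )); then
            query_start=$s                 # first window starts on the start date
            first=0
        else
            query_start=$(( s + 86400 ))   # later windows start one day after the boundary
        fi
        printf '%s %s\n' \
            "$(date -u -d "@$query_start" +%Y-%m-%d)" \
            "$(date -u -d "@$(( s + step ))" +%Y-%m-%d)"
    done
}

date_windows 2022-03-01 2022-03-13 3
# Prints:
# 2022-03-01 2022-03-04
# 2022-03-05 2022-03-07
# 2022-03-08 2022-03-10
# 2022-03-11 2022-03-13
```

Each output line corresponds to one request: the first field is the start_time query parameter and the second is end_time.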