Using FormRequest in Scrapy
2022-07-22 00:48:00 【Fan zhidu】
Scrapy's standard way to fetch a page is scrapy.Request. For POST requests that submit form data, use scrapy.FormRequest instead. The spider below logs in to GitHub this way.
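To make the difference concrete, here is a minimal standard-library sketch of what a form POST amounts to: the form fields are URL-encoded into the request body and sent with the POST method. This is roughly what FormRequest does with its formdata argument. The helper name build_form_post and the field values are placeholders for illustration, not part of Scrapy.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, formdata):
    """Build (but do not send) a POST request with a URL-encoded form body."""
    body = urlencode(formdata).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values, not real credentials.
req = build_form_post(
    "https://github.com/session",
    {"login": "user", "password": "secret", "commit": "Sign in"},
)
print(req.get_method())          # HTTP method of the prepared request
print(req.data.decode("utf-8"))  # URL-encoded form body
```

Nothing is sent over the network here; the request object is only constructed so the method and body can be inspected.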
import scrapy
import requests


class LogingitSpider(scrapy.Spider):
    name = 'logingit'
    allowed_domains = ['github.com']
    # URL of the login page
    login_url = 'https://github.com/login'
    # URL the login form is POSTed to
    post_url = 'https://github.com/session'
    # URL of a page that is only available after logging in
    logined_url = 'https://github.com/settings/profile'

    def start_requests(self):
        """
        Fetch the source of the login page.
        """
        return [scrapy.Request(url=self.login_url,
                               callback=self.login,
                               headers=self.settings.get('DEFAULT_REQUEST_HEADERS'))]

    def login(self, response):
        """
        Use FormRequest to simulate logging in to GitHub.
        """
        # Extract the authenticity_token that must be sent with the POST
        authcode = response.xpath('//*[@id="login"]/form/input[2]/@value').extract_first()
        if authcode:
            self.logger.debug("Auth Token: %s", authcode)
            post_data = {
                'commit': 'Sign in',
                'utf8': '*',
                'authenticity_token': authcode,
                # ACCOUNT and PASSWORD are read from the project settings
                'login': self.settings.get('ACCOUNT'),
                'password': self.settings.get('PASSWORD')
            }
            return [scrapy.FormRequest(url=self.post_url,
                                       formdata=post_data,
                                       headers=self.settings.get('DEFAULT_REQUEST_HEADERS'),
                                       callback=self.check)]
        else:
            # Token not found: request the login page again.
            # Note: Scrapy's duplicate filter may drop this repeat request
            # unless dont_filter=True is set.
            return [scrapy.Request(url=self.login_url, callback=self.login)]

    def check(self, response):
        """
        Verify whether the login succeeded.
        """
        # The avatar image only appears in the page header when logged in
        avatar = response.css('#user-links > li:nth-child(3) > details > summary > img::attr(src)').extract_first()
        if avatar:
            # Download the avatar (without its query string) and save it locally
            content = requests.get(url=avatar.split('?')[0]).content
            with open('./utils/acatar.jpg', 'wb') as f:
                f.write(content)
            print('Logged in successfully!')

    def parse(self, response):
        pass
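The login step selects the token by its position in the form (input[2]), which stops working if the page adds or reorders fields. A sturdier approach is to look the field up by name. The sketch below shows the idea with the standard library's html.parser on a made-up HTML sample; the class HiddenInputCollector, the function hidden_fields, and the sample markup are all hypothetical and only stand in for the real login page.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("type") == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenInputCollector()
    parser.feed(html)
    return parser.fields


# Made-up sample markup, not the real GitHub login page.
sample = """
<div id="login"><form action="/session" method="post">
  <input type="hidden" name="utf8" value="x">
  <input type="hidden" name="authenticity_token" value="abc123">
  <input type="text" name="login">
</form></div>
"""
fields = hidden_fields(sample)
print(fields["authenticity_token"])
```

Inside a Scrapy callback the same lookup can be written as the XPath //input[@name="authenticity_token"]/@value. Scrapy also provides FormRequest.from_response, which pre-fills a form's existing fields (hidden ones included) from the response, so only the login and password values need to be supplied.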