Shuffling pandas data and splitting it into training and test sets
2020-11-06 01:27:00 【Elementary school students in IT field】
Description
In machine learning, once we have a body of training data we usually need to split it into a training set and a test set, or into a training set, a cross-validation set, and a test set. To avoid a biased feature distribution in the resulting subsets, we first shuffle the data so that the row order is random, and only then split it.
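The methods used are as follows:
Note: df denotes a pd.DataFrame
df = df.sample(frac=1.0): Sampling 100% of the rows (without replacement, the default) returns every row once in random order, which shuffles the data.
df = df.reset_index(drop=True): After shuffling, the index is scrambled as well. If the index carries no meaning, reset it with drop=True to get a fresh 0..n-1 index. A plain df.reset_index() also produces a fresh index, but it keeps the old index as a new column, which is only worth doing when the old index carries information.
train = df.loc[0:a]: Perform the split; the proportions depend on your situation. Note that .loc slices by label and includes both endpoints, so this selects rows 0 through a.
cv = df.loc[a+1:b]: The cross-validation set, rows a+1 through b.
test = df.loc[b+1:]: The test set, every row after b. Leave the end of the slice open: .loc treats -1 as a label rather than as "the last row", so with the reset 0..n-1 index df.loc[b+1:-1] selects nothing.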
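The steps above can be put together in a short sketch. The function name shuffle_and_split, the 60/20/20 proportions, and the seed parameter are assumptions made for illustration and do not come from the original post. The sketch also slices with .iloc (position-based, end-exclusive) rather than .loc, which avoids the a+1 / b+1 bookkeeping.

```python
import pandas as pd


def shuffle_and_split(df, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle df, then cut it into train / cross-validation / test parts.

    The name, default fractions, and seed are illustrative assumptions.
    """
    # frac=1.0 samples every row exactly once, i.e. a random permutation.
    shuffled = df.sample(frac=1.0, random_state=seed)
    # drop=True discards the scrambled old index instead of keeping it as a column.
    shuffled = shuffled.reset_index(drop=True)

    n = len(shuffled)
    a = int(round(n * train_frac))      # number of training rows
    b = a + int(round(n * cv_frac))     # position where the test rows start

    # iloc slices by position and excludes the end, so the parts never overlap.
    train = shuffled.iloc[:a]
    cv = shuffled.iloc[a:b]
    test = shuffled.iloc[b:]
    return train, cv, test


# Toy data, made up for the example.
df = pd.DataFrame({"x": range(10), "y": [i % 2 for i in range(10)]})
train, cv, test = shuffle_and_split(df)
print(len(train), len(cv), len(test))
```

Passing a fixed random_state makes the shuffle reproducible between runs; omit it if a different ordering each time is wanted.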
Copyright notice
This article was written by [Elementary school students in IT field]. Please include a link to the original when reposting. Thank you.