Machine Learning - Ensemble Learning
2022-07-22 13:45:00 【InfoQ】
1. What is ensemble learning
- Bagging builds each training set by sampling with replacement (bootstrap sampling) and trains multiple classifiers independently on those sets. New data is predicted by a majority vote of the classifiers. The typical algorithm is the random forest.
- Boosting builds a strong classifier by iteratively reducing bias: each later weak classifier pays more attention to the samples that the earlier weak classifiers misclassified. Typical algorithms are AdaBoost and GBDT (gradient boosted decision trees).
- In Bagging the training sets, and therefore the base classifiers, are independent of one another; in Boosting each round's training set is adjusted based on the previous round's results, so training cannot be parallelized.
- In Bagging the base classifiers' predictions are weighted equally; in Boosting the predictions are weighted.
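The contrast above can be sketched with scikit-learn's generic ensemble wrappers: `BaggingClassifier` trains each tree on an independent bootstrap sample, while `AdaBoostClassifier` reweights the data after every round. This is a minimal illustration on the iris data; `n_estimators=10` and the depth-1 base trees are arbitrary choices, not values from the original article.

```python
# Bagging vs. Boosting with the same weak learner (decision stumps)
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Bagging: each stump sees an independent bootstrap sample,
# so the 10 stumps could in principle be trained in parallel
bagging = BaggingClassifier(DecisionTreeClassifier(max_depth=1),
                            n_estimators=10, random_state=0)
bagging.fit(X, y)

# Boosting: stumps are fit sequentially; each round upweights
# the samples the previous stumps misclassified
boosting = AdaBoostClassifier(n_estimators=10, random_state=0)
boosting.fit(X, y)

print(bagging.score(X, y), boosting.score(X, y))
```

On this tiny dataset both reach high training accuracy; the structural difference is in how the ensemble members are trained, not in the API.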

- Random forest
- AdaBoost
- GBDT (gradient boosted decision trees)
2. Random forests
- (1) Each decision tree is an expert in a narrow field (each tree learns from m features randomly chosen out of the M available), so a random forest contains many experts specialized in different fields.
- (2) A new problem (new input data) can then be viewed from these different perspectives: the experts each give an opinion, and the final answer is decided by their vote.
- (3) Key parameters: n_estimators, the number of decision trees; max_features, the number of randomly selected features per split.

# Train a random forest on the iris dataset
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X = iris.data
y = iris.target

model = RandomForestClassifier(max_depth=5, n_estimators=10, max_features=1)
model.fit(X, y)
print(model.score(X, y))         # accuracy on the training set
print(model.predict(X[[1, 2]]))  # predict two of the training samples
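Note that `model.score(X, y)` above measures accuracy on the same data the forest was trained on, which is optimistic. A quick cross-validated estimate gives a more honest number; this sketch uses `cross_val_score` with `cv=5`, a choice not taken from the original article.

```python
# Cross-validated accuracy of the same random forest configuration
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(max_depth=5, n_estimators=10,
                               max_features=1, random_state=0)

# 5-fold cross-validation: train on 4 folds, score on the held-out fold
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())
```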
3. AdaBoost

# Train an AdaBoost classifier on the iris dataset
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier

iris = load_iris()
X = iris.data
y = iris.target

model = AdaBoostClassifier(n_estimators=20)
model.fit(X, y)
print(model.score(X, y))           # accuracy on the training set
print(model.predict(X[[100, 2]]))  # predict two of the training samples
4. GBDT (gradient boosted decision trees)

# Train a gradient boosting classifier on the iris dataset
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

iris = load_iris()
X = iris.data
y = iris.target

model = GradientBoostingClassifier(n_estimators=100, learning_rate=1.0,
                                   max_depth=1, random_state=0)
model.fit(X, y)
print(model.score(X, y))           # accuracy on the training set
print(model.predict(X[[100, 2]]))  # predict two of the training samples