Preprocessing - outlier detection
2022-07-22 01:44:00 【Lu 727】
1. Purpose
Outlier detection finds and handles abnormal values in a variable's data. Each variable (analogous to a column of a data set) is checked against a configured threshold, the values that fall within the outlier detection range are flagged, and the flagged values are then replaced according to the chosen disposal method.
2. Input / output description
Input: raw data column
Output: data column after removing outliers
3. Modeling steps
Laida criterion (3σ rule)
The data needs to follow a normal distribution. The probability that a value lies within plus or minus 3σ of the mean is 99.7%, so the probability of a value lying more than 3σ from the mean is P(|x - μ| > 3σ) ≈ 0.003, which is a very low-probability event. If the data does not follow a normal distribution, an outlier can still be described by how many standard deviations it lies from the mean.
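The rule above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the original tool's implementation: the function names `three_sigma_outliers` and `replace_with_mean`, the multiplier `k`, and the choice of replacing outliers with the mean of the remaining values are all assumptions, since the text does not say which disposal methods are available.

```python
from statistics import fmean, pstdev


def three_sigma_outliers(values, k=3.0):
    """Flag values lying more than k standard deviations from the mean."""
    mu = fmean(values)
    sigma = pstdev(values, mu)  # population standard deviation
    # With zero spread, no value can be an outlier.
    if sigma == 0:
        return [False] * len(values)
    return [abs(x - mu) > k * sigma for x in values]


def replace_with_mean(values, flags):
    """Hypothetical disposal method: replace each flagged value with the
    mean of the values that were not flagged."""
    kept = [x for x, bad in zip(values, flags) if not bad]
    if not kept:
        # Nothing left to compute a replacement from; return a copy unchanged.
        return list(values)
    fill = fmean(kept)
    return [fill if bad else x for x, bad in zip(values, flags)]


data = [10.0] * 20 + [100.0]
flags = three_sigma_outliers(data)
cleaned = replace_with_mean(data, flags)
```

Note that the mean and standard deviation are themselves pulled toward an extreme value, so in a very small sample a single outlier may not exceed the 3σ threshold at all.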
IQR criterion
The interquartile range (IQR) is the difference between the upper quartile and the lower quartile. Using 1.5 times the IQR as the standard, any point above the upper quartile + 1.5 × IQR or below the lower quartile - 1.5 × IQR is treated as an outlier.
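As a rough sketch of the IQR criterion, the bounds can be computed with Python's standard library. The function names `iqr_bounds` and `iqr_outliers` are hypothetical, and the quartile convention is an assumption: `method="inclusive"` interpolates linearly between data points, while other tools may compute quartiles slightly differently and so produce slightly different bounds.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) bounds: Q1 - k * IQR and Q3 + k * IQR."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(values, k=1.5):
    """Flag values that fall outside the IQR bounds."""
    lower, upper = iqr_bounds(values, k)
    return [x < lower or x > upper for x in values]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
lower, upper = iqr_bounds(data)
flags = iqr_outliers(data)
```

Because quartiles are barely affected by a few extreme values, this criterion does not rely on the data being normally distributed.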