Intelligent O&M scenario analysis: how to detect abnormal business system states through anomaly detection
2022-07-22 11:08:00 [CloudWise AIOps Community]
Usually, the most direct and intuitive sign that a business system is abnormal is abnormal fluctuation in its key business metrics. Take the insurance industry as an example: when the business system is abnormal, its capacity to process insurance policies drops significantly. In terms of business metrics, this means that when there is a problem with the business system, "policy volume" falls.
How can we correctly judge that "policy volume" has fallen? The traditional way is to set a fixed threshold. For example, define that under normal circumstances the number of policies the system handles per minute should be between 200 and 600. When the number of policies monitored in real time falls outside this range, the policy volume is considered abnormal. Fixed-threshold alerting in traditional monitoring systems works exactly this way: it sets a fixed alarm threshold, compares it against real data, and generates alarm information.
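The fixed-threshold rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `fixed_threshold_alert` is hypothetical, and the bounds are the 200~600 policies per minute used in the example.

```python
def fixed_threshold_alert(policies_per_minute, low=200, high=600):
    """Return True when the value falls outside the fixed [low, high] range.

    The bounds are the example values from the text, not recommended settings.
    """
    return not (low <= policies_per_minute <= high)


# Three sample readings: one inside the range, one below it, one above it.
for reading in (450, 150, 700):
    print(reading, fixed_threshold_alert(reading))
```

The same two bounds apply to every reading, regardless of when the reading was taken.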
This logic seems fine on the surface, but think about it: how many new policies are submitted to the system in the early hours of each morning (assuming the insurance company only accepts domestic business)? Obviously, the number of new policies submitted between 10:00 and 12:00 each morning is far greater than the number submitted in the early hours.
By the same reasoning, the number of policies the business system processes also differs significantly between holidays and working days. Analyzing further along these lines, we find that it is difficult for an enterprise to judge whether the system's policy-volume metric is abnormal using preset rules (thresholds).
To solve these problems, DOEM, the digital operations event management product on CloudWise's DOCP platform, adopts a multi-algorithm ensemble learning approach and introduces 3 anomaly detection methods for time-series monitoring metrics: dynamic baseline, year-on-year / month-on-month comparison, and metric anomaly detection.
Dynamic baseline
Based on historical data, an intelligent algorithm learns the metric's behavior and predicts the value at each future time point. The predicted value is taken as the baseline, and monitoring and alerting are driven by the deviation (percentage difference) between the actual value and the baseline.
A dynamic baseline suits scenarios where a metric is known to change periodically, but the exact value for each cycle cannot be given in advance or the data varies too much within a cycle. Taking the insurance business scenario as an example, we learn from the historical policy volume, identify the trend and periodic changes in the historical data, and predict how policy volume will change in the future. At the same time, based on the distribution of the historical data, we derive how the upper and lower limits will change in the future. When the metric under test rises above the upper limit or falls below the lower limit, it is judged abnormal. In this scenario, monitoring found that the actual values were frequently lower than the predicted values; we detected this anomaly effectively and traced the root cause of the incident.
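The dynamic baseline idea can be sketched as follows. Note the substitution: instead of a learned prediction model, this sketch uses the per-slot mean of past days as the baseline and the mean plus or minus `k` standard deviations as the upper and lower limits. The function names, the width `k`, and the sample data are all hypothetical; this is not DOEM's actual algorithm.

```python
from statistics import mean, pstdev


def build_baseline(history, k=3.0):
    """Build one (baseline, lower, upper) band per time slot.

    history: list of past days, each a list of values for the same time slots.
    k: assumed band width in standard deviations (illustrative only).
    """
    bands = []
    for slot_values in zip(*history):  # all past values for one time slot
        base = mean(slot_values)
        spread = pstdev(slot_values)
        bands.append((base, base - k * spread, base + k * spread))
    return bands


def check(value, band):
    """Compare an actual value against the band for its time slot."""
    base, lower, upper = band
    deviation_pct = (value - base) * 100.0 / base if base else None
    return {"anomaly": value < lower or value > upper,
            "deviation_pct": deviation_pct}


# Made-up history: three days, three slots (early hours, late morning, afternoon).
history = [[20, 400, 300], [22, 420, 310], [18, 380, 290]]
bands = build_baseline(history)

print(check(21, bands[0]))   # a low value in a slot that is normally quiet
print(check(150, bands[1]))  # a sharp drop in a slot that is normally busy
```

Because each time slot has its own band, a low reading in the early hours and the same reading in late morning are judged differently, which is what a single fixed threshold cannot do.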
Year-on-year / month-on-month anomaly detection
This method is used to find out whether the trend of a monitored metric keeps improving or deteriorating. The target monitoring value is compared with the distribution of historical data from the same period (year-on-year) and with the preceding period (month-on-month). Whether the new data is abnormal, and whether to raise an alarm, is judged from the absolute or percentage difference.
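A minimal sketch of the percentage-difference comparison might look like this. The function names, the 30% tolerance, and the sample values are assumptions for illustration; the source does not state which reference periods or thresholds DOEM uses.

```python
def pct_change(current, reference):
    """Percentage change of current relative to a non-zero reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (current - reference) * 100.0 / reference


def period_comparison(current, same_period_value, previous_period_value,
                      max_change_pct=30.0):
    """Flag the value when either comparison moves more than max_change_pct.

    same_period_value: value from the same period in an earlier cycle
        (the year-on-year style comparison).
    previous_period_value: value from the immediately preceding period
        (the month-on-month style comparison).
    max_change_pct: assumed tolerance, chosen only for this example.
    """
    same_pct = pct_change(current, same_period_value)
    prev_pct = pct_change(current, previous_period_value)
    return {
        "same_period_pct": same_pct,
        "previous_period_pct": prev_pct,
        "anomaly": abs(same_pct) > max_change_pct or abs(prev_pct) > max_change_pct,
    }


# Made-up numbers: 240 policies now, 400 in the same period earlier, 380 just before.
result = period_comparison(240, same_period_value=400, previous_period_value=380)
print(result)
```

Using the absolute value of the change flags both surges and drops; a deployment that only cares about falling policy volume could test for negative changes alone.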
Single- / multi-metric anomaly detection
To cope with the differing data characteristics of different business models, DOEM uses an unsupervised ensemble learning algorithm to detect metric anomalies. There is no need to manually set a fixed threshold or define a baseline deviation: depending on the characteristics of the data, the system chooses different algorithms for targeted detection, makes an overall assessment of the anomaly, and generates an alarm after automatically identifying data that does not meet expectations.
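As a rough illustration of combining several unsupervised detectors into one verdict, the sketch below lets three simple statistical rules (z-score, median absolute deviation, interquartile range) vote on a new value. The choice of detectors, their cut-offs, and the majority-vote rule are assumptions made for this example; the source does not say which algorithms DOEM combines or how it selects among them.

```python
from statistics import mean, median, pstdev, quantiles


def zscore_flag(history, value, k=3.0):
    """Flag values more than k standard deviations from the historical mean."""
    sd = pstdev(history)
    return sd > 0 and abs(value - mean(history)) > k * sd


def mad_flag(history, value, k=3.5):
    """Flag values whose modified z-score (based on the median) exceeds k."""
    med = median(history)
    mad = median([abs(x - med) for x in history])
    return mad > 0 and 0.6745 * abs(value - med) / mad > k


def iqr_flag(history, value, k=1.5):
    """Flag values outside the quartiles widened by k times the IQR."""
    q1, _, q3 = quantiles(history, n=4)
    iqr = q3 - q1
    return value < q1 - k * iqr or value > q3 + k * iqr


def ensemble_anomaly(history, value, min_votes=2):
    """Report an anomaly when at least min_votes detectors agree."""
    votes = [zscore_flag(history, value),
             mad_flag(history, value),
             iqr_flag(history, value)]
    return sum(votes) >= min_votes


# Made-up recent readings for one metric.
recent = [400, 410, 395, 405, 398, 402, 407, 399]
print(ensemble_anomaly(recent, 403))  # close to recent behavior
print(ensemble_anomaly(recent, 250))  # far below recent behavior
```

None of the three rules needs labeled data or a hand-set threshold on the metric itself; each derives its limits from the history it is given.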
CloudWise DOEM (short for Digital Operation Event Management) is a digital operations event management product oriented to both technology and management. It is centered on events and provides global control over the whole life cycle of problem events. Built on big data technology and machine learning algorithms, DOEM ingests and processes alarm messages and metric data from various monitoring systems in a unified way, and supports filtering, notification, response, handling, grading, tracking, and multidimensional analysis of alarm events. Using algorithms such as dynamic baseline, DOEM provides alarm convergence, anomaly detection, root cause analysis, and intelligent prediction for events. It helps enterprises break down data silos, unify operations standards and management norms, reduce the interference of routine tasks in operations work, and raise the overall level of operations management.
Open source benefits
CloudWise has open-sourced its data visualization platform FlyFish. By configuring data models, it gives users hundreds of visual chart components, so a polished large-screen dashboard that meets your business needs can be built with zero coding. FlyFish also offers flexible extensibility: it supports component development, custom functions, and global event configuration, so that efficient development and delivery remain possible even in complex scenarios.
Click the links below, and feel free to give FlyFish a Star.
GitHub address: https://github.com/CloudWise-OpenSource/FlyFish
Gitee address: https://gitee.com/CloudWise/fly-fish