Anomaly Detection in Supervised ML
The boom of analytics across industries beyond technology has led to a love affair with machine learning – and in particular with what is known as “supervised” machine learning. Supervised machine learning is the heart and soul of most predictive analytics applications. These algorithms rely on labeled training data to identify patterns and relationships in the data that have been correlated with a measured outcome in the past and apply those patterns in a forward-looking manner to predict that outcome in the future. Think of a marketing application that uses data on past customers and their responses to a previous marketing offer, and builds a model that can use current customer data to predict their responses to a similar future offer. Many of the most widely known classes of machine learning models are in fact supervised learning models: linear and logistic regression, classification trees and random forests, naïve Bayes classifiers, etc. Even deep learning models – structurally complex artificial neural networks – grew primarily as supervised learning tools, though recent applications have quickly broadened to include other objectives.
In many cases, the biggest challenge in building a predictive modeling solution around supervised learning is not picking the right model or tuning it to the particular use case, but collecting a big enough and broad enough labeled dataset in the first place. This problem is compounded when the range of outcomes to be predicted is broader than what could fit into a single modeling objective. Two examples illustrate this well.
Two Sample Anomaly Detection in ML Use Cases
Predictive Maintenance: Any of our customers operating an expensive piece of machinery – whether a jet engine, a medical imaging tool, or a component of a manufacturing line – has a strong interest in minimizing machine downtime and efficiently scheduling and performing maintenance. Similarly, if your business sells appliances such as refrigerators or printers, you can build a strategic advantage over your competitors by helping your customers avoid an unexpected breakdown that results in a week’s worth of spoiled food or a sprint to the print shop right before that meeting with the board. With the rapid introduction of sensors tracking the behavior of machines large and small – part of the growth of the “internet of things” (IoT), to use the latest term-of-choice – predictive maintenance solutions can use past data to identify patterns that led up to machine degradation or failure and apply those insights to real-time data streams to predict problems that are likely to surface in the near future. This allows decision makers to replace parts before they fail, perform proactive maintenance during scheduled downtimes, or alert customers that they should replace that compressor before their chocolate chip ice cream turns to soup.
The biggest challenge to deploying effective predictive maintenance models: the nearly infinite range of possible failure patterns. Predictive models target particular failure types and degradation patterns. They require a substantial amount of labeled data that matches sensor streams to the timing of an identified failure type. That can be a difficult and time-consuming data set to generate even for the most frequent failure types, which might account for only 15% of your unplanned downtime. What can you do to start addressing that other 85%?
Predictive Medicine: The analytics revolution in healthcare is in full swing. One key area of focus is using the flood of data now available from smartphones and smartwatches, other wearables such as continuous blood pressure or heart rate monitors, and connected monitoring tools such as mobile glucose testers (there’s that IoT again…) to diagnose the precursors to life-threatening events. The initial damage from a heart attack can occur within a few short minutes, but the medical care received in the 1-2 hours following the event has a significant positive effect on a patient’s prognosis. Imagine if a predictive model could identify the patterns that indicate an oncoming heart attack 30 minutes in advance, so that the patient could be in an ambulance on the way to the hospital before it hits. Even if you could collect a labeled data set of streaming vitals leading up to heart attacks or other catastrophic health events, two patients may be starting from such different baselines that it is difficult to build a single model that can predict these events across a diverse set of patients. How can we generate meaningful alerts for a wider range of the population?
Anomaly Detection Development
Anomaly detection can be deployed alongside supervised machine learning models to fill an important gap in both of these use cases. Anomaly detection automates the process of determining whether the data that is currently being observed differs in a statistically meaningful and potentially operationally meaningful sense from typical data observed historically. This goes beyond simple thresholding of data. Anomaly detection models can look across multiple sensor streams to identify multi-dimensional patterns over time that are not typically seen.
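To make the contrast with simple thresholding concrete, here is a minimal sketch in Python. It assumes two hypothetical sensor streams (temperature and vibration) that normally move together, and it uses Mahalanobis distance as one possible multi-dimensional anomaly score; the article does not prescribe a particular technique. The synthetic baseline, the names fit_baseline, mahalanobis, and z_scores, and the cutoff of 4.0 are all illustrative assumptions.

```python
import math
import random

def fit_baseline(rows):
    """Estimate the mean and 2x2 covariance of two sensor streams from baseline data."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(point, center, cov):
    """Distance of a reading from the baseline, accounting for sensor correlation."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    # Closed-form inverse of the 2x2 covariance matrix applied to (dx, dy).
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

def z_scores(point, center, cov):
    """Per-sensor z-scores, i.e. what independent thresholds would look at."""
    return ((point[0] - center[0]) / math.sqrt(cov[0]),
            (point[1] - center[1]) / math.sqrt(cov[2]))

# Synthetic baseline (made-up data): vibration normally tracks temperature.
rng = random.Random(0)
baseline = []
for _ in range(500):
    temp = rng.gauss(70.0, 5.0)
    vib = 0.1 * temp + rng.gauss(0.0, 0.1)
    baseline.append((temp, vib))

center, cov = fit_baseline(baseline)
THRESHOLD = 4.0  # hypothetical cutoff; a real deployment would tune this

typical = (75.0, 7.5)  # warm, and vibration is high to match
suspect = (78.0, 6.2)  # warm, but vibration is unexpectedly low

for name, point in (("typical", typical), ("suspect", suspect)):
    score = mahalanobis(point, center, cov)
    print(name, [round(z, 2) for z in z_scores(point, center, cov)],
          round(score, 2), score > THRESHOLD)
```

In this sketch each value in the suspect reading sits well inside a three-sigma band for its own sensor, so per-sensor thresholds would not fire. Only the combination is unusual, and the joint score picks that up.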
Rather than needing to label data for rarely observed machine failures or medical events, customers need only identify much more widely available baseline data for anomaly detection models to learn from. While the supervised learning models will be able to identify patterns in the data streams that indicate a high likelihood of a known machine failure type or catastrophic medical event, anomaly detection models can alert human experts to patterns that require closer inspection. This may trigger additional testing during the next scheduled downtime for a piece of equipment or may automatically send a patient’s data to a doctor on call to determine if immediate intervention is required. In the most efficient integrations with supervised learning models, tagged anomalies can be fed into a case management process, and in cases where follow-up identifies the anomaly as a true maintenance or medical event, the data is labeled appropriately and fed into the supervised learning framework to improve and expand the predictive models over time. A neighborhood search can automatically surface past cases whose data patterns resemble those of a newly tagged anomaly, helping the human decision makers investigate the case more efficiently.
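The neighborhood search over past cases can be sketched as a simple nearest-neighbor lookup. The example below is a hypothetical illustration in Python: the Case record, its fields, the feature vectors, and the resolutions are invented for the sketch, and it assumes each case has already been summarized as a standardized numeric feature vector so that Euclidean distance is meaningful.

```python
import math
from dataclasses import dataclass

@dataclass
class Case:
    case_id: str
    features: tuple   # standardized summary of the anomaly's data pattern (assumed)
    resolution: str   # what follow-up investigation concluded

def nearest_cases(new_features, past_cases, k=3):
    """Return the k past cases whose feature vectors are closest to the new anomaly."""
    ranked = sorted(past_cases, key=lambda c: math.dist(c.features, new_features))
    return ranked[:k]

# Made-up case history for illustration only.
past_cases = [
    Case("A", (2.9, -0.2, 3.1), "bearing wear confirmed"),
    Case("B", (0.1, 2.8, -0.3), "sensor fault, no action"),
    Case("C", (3.2, 0.1, 2.7), "bearing wear confirmed"),
    Case("D", (-2.5, -2.9, 0.2), "false alarm"),
]

new_anomaly = (3.0, 0.0, 3.0)
similar = nearest_cases(new_anomaly, past_cases, k=2)
for case in similar:
    print(case.case_id, case.resolution)
```

A linear scan like this is fine for a small case history. A larger one would likely call for an indexed nearest-neighbor structure, which is outside the scope of this sketch.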
As an added benefit, because they learn efficiently from easily acquired data, anomaly detection models can quickly customize to individuals – whether patients or pieces of industrial equipment. Each individual is likely to have its own normal data patterns. Anomalies are much more likely to be relevant if they are based on individualized baseline data. Similarly, customized anomaly detection for different operational modes can result in a higher rate of relevant anomalies.
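As a rough illustration of why individualized baselines matter, the Python sketch below fits a separate baseline per individual and scores the same new reading against each. The IndividualBaseline class, the z-score rule, the threshold of 3.0, and the heart rate values are all assumptions made up for this example. They are not clinical guidance and not a method described in the article.

```python
from statistics import mean, stdev

class IndividualBaseline:
    """Baseline learned from one individual's own historical readings."""

    def __init__(self, readings):
        self.mu = mean(readings)
        self.sigma = stdev(readings)

    def z_score(self, value):
        return (value - self.mu) / self.sigma

    def is_anomalous(self, value, z_threshold=3.0):
        # Flag readings far from this individual's own normal range.
        return abs(self.z_score(value)) > z_threshold

# Made-up resting heart rate histories (beats per minute) for two patients.
baselines = {
    "patient_a": IndividualBaseline([58, 60, 62, 59, 61, 60, 57, 63]),
    "patient_b": IndividualBaseline([88, 92, 96, 90, 94, 91, 97, 93]),
}

new_reading = 95
flags = {pid: b.is_anomalous(new_reading) for pid, b in baselines.items()}
for pid, flagged in flags.items():
    print(pid, round(baselines[pid].z_score(new_reading), 2), flagged)
```

The same reading is far outside one patient's normal range and unremarkable for the other. A single population-wide baseline would miss that distinction. The same pattern could be keyed by operational mode instead of by individual.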
Supervised learning is still the gold standard of predictive maintenance and predictive medicine, but the challenge of collecting and labeling the necessary data, together with the relatively high level of effort needed to build supervised learning models, means that it is often too expensive – or impossible – to build a truly robust solution based entirely around supervised learning. Anomaly detection is an essential complement to supervised learning in these domains.