*This is Part 1 of a two-part series examining human acceptance of machine learning model outputs.
posted by Mosaic Data Science
‘Big Data’ and ‘Data Science’ are the new buzzwords creating a significant amount of excitement in the world of business today. We now experience the results of machine learning models on a frequent basis through online interaction with news sites that learn our interests, retail websites that provide automated offers that are customized to our buying habits, and credit card fraud detection that warns us when a transaction occurs that is outside of our normal purchasing pattern. Many such applications of machine learning can be performed using an automated approach where human interpretation of the recommendations is not necessary. However, in many other applications, there would be great benefit from the ability of the machine learning model to provide an explanation of its output recommendation or other result.
Although some machine learning models can provide limited insight into and explanation of their outputs, most machine learning model output is opaque. In many decision support tools for military and other safety- or life-critical applications, it is necessary and appropriate for humans to be involved in decisions that use the recommendations and guidance of computer automation and information systems. However, this opacity can lead users of the technology to doubt the reliability of the information or recommendation provided. That lack of understanding can result in distrust, and eventually in the failure of the technology to gain acceptance and use in its intended operational domain. Even if the technology does gain acceptance and operational use, a machine learning-based information system that can explain its output and recommendations may allow the technology to be used more efficiently and effectively.
In this two-part blog series we will study two novel techniques to generate explanations of machine learning model results for use in advanced automation-human interaction. The first technique is “Explainable Principal Components Analysis,” and the second is our “Gray-Box Decision Characterization” approach.
The Explainable Principal Components Analysis technique and Gray-Box Decision Characterization approach apply across a broad range of different types of machine learning models, and to essentially any application domain of machine learning. We will utilize two large datasets from two distinct application domains to demonstrate the generic applicability of our proposed techniques. The two application domains are resume matching to job requirements, and air traffic flow management (ATFM), both of which involve resource allocation.
For Part 1 we will focus on the use case of Air Traffic Flow Management (ATFM).
What is Explainable Principal Components Analysis (EPCA)?
Principal Components Analysis (PCA) is a technique used extensively in machine learning model development for dimensionality reduction. From an information-theoretic perspective, regular PCA aggregates the information contained in a high-dimensional space into a lower-dimensional vector representation that can capture an arbitrarily large portion of the information in the data. This aggregation process identifies the orthogonal dimensions in the data over which the greatest explanation of the variance can be achieved. However, the “explanation” of the variance in PCA is maximized from a statistical perspective, not from the perspective of understandability by a human. In fact, the basis vectors created by PCA are one of the primary sources of opacity in many practical machine learning applications. A variant of PCA, which we call Explainable Principal Components Analysis (EPCA), computes basis vectors of the problem space with understandability as a primary objective.
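To make the regular PCA step concrete, here is a minimal sketch in Python with NumPy. It illustrates standard PCA only, not the EPCA formulation, which this post does not specify. The function name pca and the synthetic data are assumptions made for illustration, and the sketch uses the covariance matrix of the centered features, which is one common formulation.

```python
import numpy as np

def pca(X, k):
    """Return the top-k principal directions and the variance fraction each explains."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = np.cov(Xc, rowvar=False)          # feature-by-feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric input; eigenvalues come back ascending
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    components = eigvecs[:, order].T        # rows are orthonormal basis vectors
    explained = eigvals[order] / eigvals.sum()
    return components, explained

rng = np.random.default_rng(0)
# Synthetic stand-in data (an assumption): 200 samples, 5 features,
# with most of the variance living in a 2-D subspace.
latent = rng.normal(size=(200, 2)) * np.array([5.0, 2.0])
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

components, explained = pca(X, k=2)
projected = (X - X.mean(axis=0)) @ components.T  # lower-dimensional representation
```

Each row of components is a weighted mix of all five input features. That mixing is what makes the basis statistically efficient and, at the same time, hard for a person to read.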
In Figure 1, which shows aircraft arrival tracks for flights entering the Atlanta terminal area from the northwest and landing on Runway 8L at Atlanta Hartsfield International Airport, some modes of variation are clearly evident through visual analysis of the scatter plot. The points of the scatter plot show individual surveillance positions for 246 flights on a single day. The legend indicates the altitude (in 100s of feet) associated with each color. The figure also shows two primary modes of variation of the data (one in yellow lines and one in black lines) that are clear and understandable to human observation. A third mode of variation is not marked, but can be observed as the parallel flight pattern to the south of (below in the figure) the primary flight patterns.
In Figure 2 we show the results of regular PCA on this dataset. The three plots show the first three eigenvectors of the cross-correlation matrix, which is the mathematical approach used in PCA to determine the principal components. Note that the three modes of variation determined by PCA do not closely resemble the three primary modes that can be identified through visual inspection and understood by a human. Although not apparent in Figure 2, additional eigenvectors determined through regular PCA intermix changes through the entire length of the flight pattern, whereas a human observer would be more likely to focus on one segment of the flight pattern at a time.
Figure 2. Three Primary Modes of Variation from Regular PCA
What is Gray-Box Decision Characterization (GBDC)?
As implied by the name, Gray-Box Decision Characterization (GBDC) uses some knowledge of the inner workings of the machine learning approach, but does not make changes to the machine learning algorithm itself. Thus, the approach lies between a black-box and a white-box approach. It is important to note that we refer to this approach as ‘decision’ characterization, not ‘model’ characterization. The objective of this technique is to explain one specific output of the machine learning model at a time, not to provide an explainable characterization of the entire model’s behavior.
The GBDC approach uses the results of the EPCA algorithm (or regular PCA, if sufficiently explainable) to create an orthogonal basis for sensitivity analysis of the machine learning model’s output around the input data vector for a single decision. Thus, the only part of the problem space that the GBDC approach must know is the input feature representation, and it also needs access to a large set of training data samples.
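The idea of sensitivity analysis along an orthogonal basis around a single input can be sketched as follows. This is only an illustration under stated assumptions, not the GBDC algorithm itself, whose details this post does not give. The names decision_sensitivity and toy_model are hypothetical, the central-difference scheme and step size are our choices, and the identity matrix stands in for a basis that would in practice come from EPCA or PCA on training data.

```python
import numpy as np

def decision_sensitivity(model, x, basis, step=1e-2):
    """Central-difference sensitivity of model(x) along each basis direction."""
    sens = []
    for v in basis:                       # each v is one unit-length basis vector
        up = model(x + step * v)
        down = model(x - step * v)
        sens.append((up - down) / (2 * step))
    return np.array(sens)

# Hypothetical stand-in for a trained model: any callable that maps
# a feature vector to a scalar score would work here.
def toy_model(x):
    return 3.0 * x[0] - 0.5 * x[1] ** 2

# Placeholder orthonormal basis (an assumption); in practice this would be
# the EPCA or PCA basis computed from training data.
basis = np.eye(3)
x0 = np.array([1.0, 2.0, 0.5])            # input vector for the single decision to explain

sens = decision_sensitivity(toy_model, x0, basis)
ranking = np.argsort(-np.abs(sens))       # directions ordered by influence on this decision
```

Because the model is only called, never modified, the sketch stays on the gray-box side of the line: it needs the input feature representation and a basis, but no access to the model’s internals. The ranking describes this one decision at x0 and would generally differ at another input.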
A black-box model characterization approach would suffer from the lack of a framework for the external tests that must be conducted on the machine learning model to derive the characterization. Two primary problems exist with the lack of a clear framework:
The GBDC approach uses the basis of the input feature space as the framework for sensitivity analysis and explanation of the response of the model.
Even though there have been significant advances in machine learning and artificial intelligence, thanks to leading analytics consulting companies, the state of the art in explainability of machine learning remains in its infancy. Computer automation systems are able to provide support for rigorously defined situations in which the desired outcomes can be clearly specified and tested, and the performance against those outcomes can be assured. However, when an automation system of today’s capability level is placed in an off-nominal situation, its behavior can become unpredictable and unreliable.
Tragic examples of such circumstances are the crash of Asiana flight 214 at San Francisco airport and the crash of Air France 447 over the Atlantic Ocean en route from Brazil to Paris. In these situations, the human operators either did not understand what the computer automation was doing, or the computer automation deactivated itself because it was not able to handle a significantly off-nominal situation. Both cases clearly demonstrate the need to improve advanced automation/human interaction for safety-critical decision support and automation.
Improvements in advanced automation/human interaction through explainable machine learning can improve performance outcomes of complex operations in military, medical, transportation, financial, emergency response, and many other domains. The EPCA and GBDC techniques have the potential to achieve a quantum step in the acceptance and performance of advanced decision support and resource allocation systems, and should be adopted by all data science consultants.
How does this affect my business?
If you have built a machine learning model and are not sure how much it is being used, you can implement these two techniques to explain the outputs of your model to the people who rely on them. Even an advanced machine learning model is of little use if decision makers do not adopt the recommendations it produces.
Mosaic, a leading data science consulting company, can bring these capabilities to your organization. Contact us and mention this blog post!