Responsible AI Series: Part 2

In this mini-series, we are investigating three different model-agnostic machine learning interpretability techniques offered by the MathWorks product suite. These techniques are:

  1. Local Interpretable Model-Agnostic Explanations (LIME)
  2. Partial Dependence and Individual Conditional Expectation Plots (PDP and ICE)
  3. Shapley Values (SHAP)

Interpretability of Machine Learning Models using PDP and ICE

Why Interpretable Machine Learning?

As artificial intelligence becomes more prevalent in the financial industry, its use presents considerable opportunities but also challenges. AI models can price options through deep hedging rather than the Black-Scholes equations, neural networks can make minute-ahead forex predictions, and trading algorithms can cut slippage in stock trading to improve execution. But how sure are we that a model will perform as expected, especially given the trade-off between model simplicity, interpretability and predictive power (Figure 1)?

Figure 1: Trade-off between model simplicity and predictive power

Unexplainable models pose risks to those who use them. How is a credit-scoring AI model supposed to be probed when it discriminates against certain demographics? What happens when the model appears to be ‘sexist’?

There is an explicit requirement, from both responsible business practitioners and regulatory bodies, for black-box AI models to be made explainable. There is no industry standard on how to achieve this, but there have been significant developments in improving the transparency and explainability of models. In this article, we will investigate two global-level interpretability techniques. The global level refers to investigating the behaviour of the model over an entire training or test dataset.

Partial Dependence Plots and Individual Conditional Expectation Plots

Two global model explanation techniques, along with their MATLAB implementations, will be discussed: the partial dependence plot (PDP) and the individual conditional expectation (ICE) plot. Both methods examine the effect of one or two predictors on the model's prediction by varying those predictors across their range of values and averaging the model's output over the observed values of the remaining features.
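To make the averaging explicit, the sketch below shows how a one-dimensional partial dependence curve could be computed by hand for a regression model. The function name and the inputs, mdl (a fitted model with a predict method), X (a table of predictors) and featureName, are illustrative placeholders rather than part of any toolbox API.

```matlab
function [grid, pd] = manualPartialDependence(mdl, X, featureName, numPoints)
% Brute-force partial dependence of a regression model on one predictor:
% fix the predictor at each grid value, predict for every row of X, and
% average the predictions.
    x = X.(featureName);
    grid = linspace(min(x), max(x), numPoints)';   % values to sweep over
    pd = zeros(numPoints, 1);
    for k = 1:numPoints
        Xk = X;
        Xk.(featureName) = repmat(grid(k), height(X), 1);  % fix the feature for all rows
        pd(k) = mean(predict(mdl, Xk));                    % average prediction = partial dependence
    end
end
```

In practice you do not need to write this loop yourself: the Statistics and Machine Learning Toolbox provides partialDependence and plotPartialDependence, which also cover the two-predictor and ICE variants discussed below.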

To explain how these plots are used, let's imagine we are developing a model that predicts the probability of a customer defaulting on a loan, and we want to understand how a single variable, for example customer age, influences the model's prediction. The PDP shows the marginal relationship between one or two predictor variables and the response variable. An example of a PDP is shown in Figure 2 (left). The rationale is that customer age is varied while the predictions are averaged over the other variables, which isolates the marginal effect of age on the prediction. From Figure 2 (left), we can see that age has a greater impact on the model's output as the age of the loan applicant increases. In addition to a single predictor, the PDP allows an investigator to interrogate two predictors at once, as in the 3D plot in Figure 2 (right). We can now see the joint effect that the two predictors, customer income and age, have on the model and how they influence the score as they both increase or decrease.
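Figure 2 can be reproduced with the Statistics and Machine Learning Toolbox function plotPartialDependence. The sketch below trains a model on synthetic data; the predictor names (CustAge, CustIncome), the class labels and the toy data-generating rule are assumptions made for illustration, not the data behind Figure 2.

```matlab
% Synthetic credit data (illustrative assumption, not the data in Figure 2)
rng('default')                               % for reproducibility
n = 1000;
CustAge    = randi([21 75], n, 1);           % customer age in years
CustIncome = 20 + 80*rand(n, 1);             % annual income, in thousands
% Toy rule: default risk rises with age and falls with income
p = 1 ./ (1 + exp(-(-2 + 0.05*(CustAge - 40) - 0.03*(CustIncome - 50))));
Default = repmat({'Good'}, n, 1);
Default(rand(n, 1) < p) = {'Default'};
tbl = table(CustAge, CustIncome, Default);

% Fit a boosted classification ensemble to predict default status
mdl = fitcensemble(tbl, 'Default', 'Method', 'LogitBoost');

% Partial dependence of the 'Default' score on customer age (cf. Figure 2, left)
figure
plotPartialDependence(mdl, 'CustAge', 'Default')

% Joint partial dependence on age and income (cf. Figure 2, right)
figure
plotPartialDependence(mdl, {'CustAge', 'CustIncome'}, 'Default')
```

Passing a single predictor name gives the one-dimensional PDP, while passing two predictor names produces the surface plot.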

The ICE plot, Figure 3, offers a similar concept but with a more granular view. Instead of showing only the average effect of customer age, it draws a separate curve for each data point (the grey lines), superimposed on the PDP (the red line). Its use case is therefore to check whether the effect on each individual data point is consistent with the average marginal effect shown by the PDP, or whether the average masks heterogeneous behaviour across individuals.
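Continuing with the hypothetical model above, plotPartialDependence produces ICE curves when the 'Conditional' name-value argument is set to 'absolute':

```matlab
% ICE curves for customer age: one curve per sampled observation, with the
% averaged partial dependence curve overlaid
figure
plotPartialDependence(mdl, 'CustAge', 'Default', ...
    'Conditional', 'absolute', 'NumObservationsToSample', 100)
```

Setting 'Conditional' to 'centered' instead anchors every curve at a common starting point, which can make heterogeneous individual effects easier to spot.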

Talk to us if you are interested in how you can apply these techniques in your machine learning projects and create Responsible AI models.

What’s to come?

In the next part of the series, we will look at how you can interpret your model's predictions at a local level using Shapley values.
