Model Explainers Require Thoughtful Interpretation
In this post, I compare model explainability techniques for feature interactions. In a surprising twist, two commonly used tools, SHAP and ALE, produce opposing results.
Perhaps I shouldn’t have been surprised. After all, explainability tools measure specific responses in distinct ways. Interpretation requires understanding test methodologies, data characteristics, and problem context. Just because something is called an explainer doesn’t mean it produces an explanation, if you define an explanation as a human understanding how a model works.
This post focuses on explainability techniques for feature interactions. I use a typical project dataset derived from real loans [1] and a common model type (a boosted tree model). Even in this everyday scenario, explanations require thoughtful interpretation.
If methodology details are overlooked, explainability tools can impede understanding and even undermine efforts to ensure model fairness.
Below, I show disparate SHAP and ALE curves and demonstrate that the disagreement between the techniques arises from differences in the measured responses and the feature perturbations performed by the tests. But first, I’ll introduce some concepts.
Feature interactions occur when two variables act in concert, producing an effect that differs from the sum of their individual contributions. For instance, the impact of a poor night’s sleep on a test score could be greater the next day than a week later. In this case, a feature representing time would interact with, or modify, a sleep quality feature.
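To make the sleep example concrete, here is a minimal sketch with entirely made-up coefficients; the time feature modifies the sleep effect through a product term:

```python
# Toy linear model with made-up coefficients, illustrating the sleep example.
# The modifying effect of time enters as a product (interaction) term.
B0, B_SLEEP, B_DAYS, B_INTERACT = 70.0, 5.0, 0.5, -0.5

def predicted_score(sleep_quality, days_since):
    """sleep_quality: 0 = poor night, 1 = good night; days_since: days until the test."""
    return (B0
            + B_SLEEP * sleep_quality
            + B_DAYS * days_since
            + B_INTERACT * sleep_quality * days_since)

# Effect of a good vs. poor night's sleep, measured the next day and a week later
next_day = predicted_score(1, 1) - predicted_score(0, 1)   # 5.0 - 0.5*1 = 4.5
one_week = predicted_score(1, 7) - predicted_score(0, 7)   # 5.0 - 0.5*7 = 1.5
print(next_day, one_week)  # the sleep effect shrinks as time passes
```

Because the interaction coefficient is nonzero, the contribution of sleep quality depends on the value of the time feature, which is exactly what a purely additive view of the features would miss.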
In a linear model, an interaction is expressed as the product of two features. Nonlinear machine learning models typically contain numerous interactions. In fact, interactions are fundamental to the logic of sophisticated machine learning models, yet many common explainability techniques focus on the contributions of isolated features. Methods for examining interactions include 2-way ALE plots, Friedman’s H statistic, partial dependence plots, and SHAP interaction values [2]. This blog explores…
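To give a feel for how two of the methods above might be invoked on a boosted tree, here is a rough sketch. The synthetic data, the feature names (loan_amount, interest_rate, annual_income), and the commented-out PyALE call are illustrative assumptions, not the dataset or code behind the results in this post.

```python
import numpy as np
import pandas as pd
import shap
from xgboost import XGBClassifier

# Synthetic stand-in for a loan dataset (hypothetical features)
rng = np.random.default_rng(0)
n = 2_000
X = pd.DataFrame({
    "loan_amount": rng.uniform(1_000, 40_000, n),
    "interest_rate": rng.uniform(5, 30, n),
    "annual_income": rng.uniform(20_000, 200_000, n),
})
# Default made more likely by a large loan relative to income (an implicit interaction)
p_default = 1 / (1 + np.exp(-(10 * X["loan_amount"] / X["annual_income"] - 2)))
y = rng.binomial(1, p_default)

model = XGBClassifier(n_estimators=200, max_depth=3).fit(X, y)

# SHAP interaction values: an (n_samples, n_features, n_features) array whose
# off-diagonal entries attribute pairwise interaction effects.
explainer = shap.TreeExplainer(model)
shap_inter = explainer.shap_interaction_values(X)
shap.dependence_plot(("loan_amount", "annual_income"), shap_inter, X)

# A 2-way ALE plot could be produced with a package such as PyALE, roughly:
# from PyALE import ale
# ale(X=X, model=model, feature=["loan_amount", "annual_income"], grid_size=40)
```

The key design point is that both outputs are pairwise: they attempt to separate the joint effect of two features from what each feature contributes on its own, rather than reporting a single per-feature importance.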