
Explainable AI: Cooking Up Deep Learning Models Like Gourmet Recipes

  • Writer: Ghaith Sankari
  • Oct 27, 2024
  • 10 min read

Since I was 14, food has been my love language. Growing up in Aleppo, Syria, I learned that food isn’t just about flavors; it’s a symphony of ingredients, methods, and that elusive touch of artistry. In cooking (and data modeling), the recipe is everything, and things get interesting only when the outcome is… unexpected. You don’t ask for a recipe if the dish tastes exactly like you’d expect, right? The same goes for deep learning models: if they do something extraordinary—or don’t do what they’re supposed to—that’s when we ask, What went wrong (or right)?




| | Cooking Recipe | Deep Learning Modelling |
| --- | --- | --- |
| Raw materials | Dish ingredients: simple and complex materials | The dataset and the features that shape the output |
| Data generation | Choosing the ingredients (feature selection); cutting, cleaning, and spicing (feature engineering); adding extras, which in our cuisine usually means "fat" (data augmentation); and resampling, which matters for some ingredients but is harder in the kitchen than in deep learning modelling | Feature selection, feature engineering, data augmentation, resampling |
| Main actions | There is always a person telling you what to do | Supervised learning: labels, forward propagation, back propagation |
| Negative feedback concern | The dish doesn't taste as expected | Low accuracy |
| Positive feedback concern | The dish is extraordinarily good | Overfitting |

I could spend all night dissecting that last table, but let’s step out of the kitchen for a moment and dive deeper into Explainable AI. Here’s a quick hint: the challenge with ready-made meals is that we just accept them without questioning the ingredients or the cooking methods. This mirrors how we sometimes approach pre-built neural networks, like ResNet, Inception, or BERT.

Using these models can feel like being someone with specific dietary restrictions—whether for health, personal, or religious reasons—dining out in the U.S. You have to ask what’s in the food and how it’s prepared because there are particular requirements and limitations you’re working with.

So, as we step out of the kitchen, remember this: companies that develop data science tools and AI frameworks are like chefs, the clients are the eager diners, and the “waiter” explaining the dish? Well, that’s our data scientist!

Basics

Now that we’ve covered who needs Explainable AI and why, let’s dig into some foundational methods for explaining models and look at how some models are already inherently explainable. Before we do, here’s a bit of intuition to help clarify why this article is focused on deep learning.

Most of the manual or algorithmic techniques for dimensionality reduction—like PCA, ICA, NMF, or SVD, and even feature selection through positive or negative correlations—play a key role in helping us interpret how a model is structured and functions. In traditional machine learning, these methods have helped make sense of models by reducing complexity. However, with the arrival of deep learning, we’re now building robust models with high-dimensional datasets. Thanks to large datasets, deep learning allows us to sidestep the “curse of dimensionality,” creating models that are powerful but complex.
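To make this concrete, here is a minimal sketch (assuming scikit-learn; the data is a synthetic placeholder) of how a classical technique like PCA exposes structure by reporting how much variance each component captures and which original features load onto it:

```python
# Minimal sketch: classical dimensionality reduction as a window into structure.
# Assumes scikit-learn; X is a synthetic placeholder feature matrix.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                             # placeholder dataset
X[:, 0] = 3 * X[:, 1] + rng.normal(scale=0.1, size=500)    # inject a correlation

pca = PCA(n_components=5).fit(X)
print(pca.explained_variance_ratio_)   # how much variance each component explains
print(np.abs(pca.components_[0]))      # which original features load on component 1
```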

This is where Explainable AI comes in, giving us a way to decode this complexity and understand what’s happening under the hood. And as we’ll see, Explainable AI offers some positive side effects that we’ll discuss further along in this article.


Explainable AI for Tree-Based Models

When we look at models like decision trees or random forests, they naturally lend themselves to explanation. These models can be represented as nested "if-then" instructions, which (though sometimes extensive) allow us to clearly identify which features are driving each decision.

Because of this, we can, for now, define Explainable AI (XAI) as the process of understanding which features play the most significant roles in a model's output. This makes tree-based models a great starting point for exploring explainability.
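As a minimal sketch of this transparency (assuming scikit-learn, with the Iris dataset standing in for any tabular problem), the nested "if-then" rules and the feature importances can be read straight off a fitted tree:

```python
# Minimal sketch: a decision tree is its own explanation.
# Assumes scikit-learn; Iris is used only as a stand-in dataset.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

# Nested if-then rules, readable by a human
print(export_text(tree, feature_names=list(iris.feature_names)))

# Which features drive the decisions
for name, importance in zip(iris.feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.2f}")
```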




Methods for Explainable AI for Deep Neural Networks

There is a well-known trade-off between model accuracy and interpretability: as models become more accurate, they generally become harder to interpret.



The Structure of Neural Networks

To break down the architecture of a neural network, we have:

  • Input Layer: Contains nodes representing the input data.

  • Hidden Layers: These consist of multiple nodes and may include structures like convolutional layers, residual blocks, or fully connected layers.

  • Output Layer: Represents the final output.

Ultimately, this setup is a complex mathematical function.
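To make the "complex mathematical function" point concrete, here is a minimal sketch in plain NumPy (layer sizes and weights are arbitrary placeholders) of what a small fully connected network reduces to: matrix multiplications composed with nonlinearities.

```python
# Minimal sketch: a small fully connected network is just a composed function.
# All shapes and weights here are arbitrary placeholders.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)    # input layer -> hidden layer
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)    # hidden layer -> output layer

def relu(z):
    return np.maximum(z, 0.0)

def network(x):
    h = relu(x @ W1 + b1)       # hidden layer
    return h @ W2 + b2          # output layer (logits)

x = rng.normal(size=4)          # one input sample with 4 features
print(network(x))               # the "final output" is just f(x)
```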

This basic layout helps us understand that if we want to explain or interpret the behavior of a deep neural network, we need to work with one or more of these components. For example, one interpretability method might be to structure the output to provide more task-specific details—yet, as we get more technical, it takes us back into the world of ML engineering. So, this approach doesn’t exactly appeal to the “restaurant-goer” audience looking for simple explanations.

Interpretability vs. Explainability

Before categorizing different explanation methods, it’s useful to clarify:

  • Interpretability: A human can understand the outcomes and, to some extent, the inner workings of the model.

  • Explainability: This involves using a tool, or “explainer,” to access comprehensible information.

The main difference? Explainability needs a dedicated tool to make sense of the model, which is especially important when we think of transparency. Tree-based models are easy to interpret due to their transparent nature. In contrast, deep neural networks are often labeled “black boxes,” as they offer minimal visibility into their workings.

Local vs. Global Explanations

Since XAI is about understanding how features influence the output, the next question is: Which features should we study? 



This leads us to two types of explanations:

  • Global Explanations: These cover the entire solution hyperspace (𝓗), offering a comprehensive but complex understanding.

  • Local Explanations: These focus on specific sub-regions within 𝓗, making analysis simpler but more limited in scope.

One of the key goals of XAI is to determine which features are most influential in producing the output, known as feature importance.

High-Level Categories of Explanation Methods

We can broadly categorize explanation methods as follows:

  1. Perturbation-Based Explainers: These involve modifying inputs (e.g., adding noise or testing across a data range) and assessing the impact on the model to determine feature importance.

  2. Function-Based Explainers: Here, we rely on mathematical functions to create explainers, using techniques like Taylor expansion or Gradient x Input methods.

  3. Structure-Based Explainers: This approach, similar to function-based explainers, examines the neural network’s structure post-training (i.e., after weights and biases have been adjusted). Layer-wise relevance propagation (LRP) is a popular example.

  4. Sampling-Based Explainers: An extension of perturbation-based methods, these use permutation rather than perturbation. SHAP and LIME are standout examples.

Let’s dive deeper into each of these explaining methods to understand their nuances and applications.


Perturbation-Based Explainers

The concept is straightforward: we assess feature relevance by observing how the model responds when specific features are removed or altered.

Imagine trying to identify which part of an image leads the model to classify it as a "castle." One common technique is to apply a noisy mask to portions of the image, moving the mask around to see which areas impact the model’s output. By analyzing how different parts influence the result, we can pinpoint the most influential features.
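Here is a minimal sketch of that occlusion idea (assuming PyTorch; `model`, `image`, and `target_idx` are hypothetical names for a trained classifier, a preprocessed image tensor of shape (1, 3, H, W), and the class of interest):

```python
# Minimal sketch: occlusion-based perturbation.
# `model`, `image`, and `target_idx` are assumed to exist already.
import torch

def occlusion_map(model, image, target_idx, patch=32, stride=16, fill=0.5):
    model.eval()
    _, _, H, W = image.shape
    heat = torch.zeros((H - patch) // stride + 1, (W - patch) // stride + 1)
    with torch.no_grad():
        base = torch.softmax(model(image), dim=1)[0, target_idx]
        for i, y in enumerate(range(0, H - patch + 1, stride)):
            for j, x in enumerate(range(0, W - patch + 1, stride)):
                occluded = image.clone()
                occluded[:, :, y:y + patch, x:x + patch] = fill   # mask one region
                score = torch.softmax(model(occluded), dim=1)[0, target_idx]
                heat[i, j] = base - score   # a big drop means an important region
    return heat
```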



However, this method is not as widely used today due to several limitations:

  • Slow Processing: Repeatedly applying perturbations to get a clear picture of important features is time-consuming.

  • Potential Artifacts: Depending on the dataset and how the perturbations are created, this method may introduce artifacts, distorting the results.

  • Local Assumptions: This technique often assumes that the importance of features is local, which may not always hold across the entire dataset.


Function-Based Explainers

As mentioned earlier, our goal is to create explainers based on mathematical functions. Since neural networks are essentially multidimensional functions, we can use their components—constants and variables—to build these explainers. Common methods here include Taylor expansion and Gradient x Input, which help us understand how different elements contribute to the model’s output.


Taylor Expansion:

In mathematics, the Taylor series represents a function as an infinite sum of terms based on its derivatives at a specific point. For most common functions, the function and its Taylor series are nearly identical in the vicinity of this point. This concept, introduced by Brook Taylor in 1715, is called a Maclaurin series if the derivatives are taken around zero—a variation named after Colin Maclaurin, who popularized it in the 18th century.
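For reference, the expansion of a function f around a point a is

f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n,

and the Maclaurin series is simply the special case a = 0.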

Now, how can we use this to explain deep neural networks (DNNs)?



The idea behind using Taylor expansion is to redistribute relevance from the model's output back to its input features. It’s a fast and mathematically sound approach, but applying it is challenging because it requires identifying a meaningful root point—which isn’t easy to determine.
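In symbols, one common first-order formulation looks like this: if \tilde{x} is a root point where the output is (approximately) zero, then

f(x) \approx \sum_i \left.\frac{\partial f}{\partial x_i}\right|_{x=\tilde{x}} (x_i - \tilde{x}_i), \qquad R_i = \left.\frac{\partial f}{\partial x_i}\right|_{x=\tilde{x}} (x_i - \tilde{x}_i),

so the output is redistributed into one relevance term R_i per input feature, and the quality of the explanation hinges on choosing a sensible \tilde{x}.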

Why does this method provide a clear explanation of a model? And why is finding a suitable root point so difficult? We won’t fully dive into these questions here, but let’s go back to the kitchen for an analogy: imagine trying to capture the complete flavor of a dish in a single bite. When there’s a long list of ingredients, each with its own preparation, it’s impossible to pinpoint the essence with just one taste.

To make this clearer: the challenge only grows when we think about high-dimensional data, like images. Here, the root point would be another image of the same size, which is a daunting thing to find. This complexity is why Taylor expansion isn't always straightforward in practice.



Gradient x Input 

Gradient x Input is one of the simplest and quickest explanation methods to apply. It doesn't require a root point; instead, it computes contribution values by multiplying the gradient of the output with respect to the input by the input itself. However, raw gradients are noisy and can saturate, so the resulting explanations often lack precision.
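A minimal sketch (assuming PyTorch; `model`, `x`, and `target_idx` are hypothetical names for a trained classifier, an input tensor, and the class of interest):

```python
# Minimal sketch: Gradient x Input attribution.
# `model`, `x`, and `target_idx` are assumed to exist already.
import torch

def gradient_x_input(model, x, target_idx):
    model.eval()
    x = x.detach().clone().requires_grad_(True)
    score = model(x)[0, target_idx]      # scalar score for the target class
    score.backward()                     # gradient of the score w.r.t. the input
    return (x.grad * x).detach()         # element-wise contribution estimate
```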


Structure-Based Explainers

We expanded on Taylor expansion above because it is important for understanding structure-based explainers, especially Layer-wise Relevance Propagation (LRP), which we present next.


We can simplify the explanation process for deep neural networks by breaking down the task into smaller, manageable parts. Full-function explanations for DNNs are complex, so instead, we focus on the network’s structure. Since a DNN consists of layers, each with multiple neurons, we can look at each neuron individually to find its contribution. By examining each neuron separately, we can locate root points more easily for each one. Then, by combining these smaller explanations, we get a comprehensive view of model performance.

This approach works by back-propagating the output through the layers toward the input, generating a layered explanation. For tasks like image classification, this results in a set of values that can be visualized as a heatmap, highlighting which parts of the input image most strongly influenced the model’s decision.



This method not only highlights feature importance through heatmap values but also provides insights into which neurons are activated and how much each one contributes to the model’s output. Known as Layer-wise Relevance Propagation (LRP), this technique expands on the Taylor series or Deep Taylor Decomposition. Many tools and libraries have been built around LRP, and most are generalized to work with various DNN layers, including convolutional and dense layers. However, LRP is not a model-agnostic method, as it relies on the model’s structure. In contrast, model-agnostic explainers, like those in sampling-based methods, focus on feature importance and output without considering the underlying structure.
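As a hedged illustration of how such a structure-based redistribution can look, here is a minimal sketch of one common LRP variant, the epsilon rule, applied to a tiny fully connected NumPy network (the weights, the input, and the choice of rule are all illustrative assumptions, not a recipe from a specific library):

```python
# Minimal sketch: Layer-wise Relevance Propagation (epsilon rule) for dense layers.
# Network weights and the input are illustrative placeholders.
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def lrp_epsilon(a_in, W, b, R_out, eps=1e-6):
    """Redistribute the relevance R_out of a dense layer back onto its inputs."""
    z = a_in @ W + b                                  # pre-activations, shape (out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)         # stabilizer
    s = R_out / z                                     # shape (out,)
    return a_in * (W @ s)                             # shape (in,)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

x = rng.normal(size=4)
a1 = relu(x @ W1 + b1)                                # hidden activations
out = a1 @ W2 + b2                                    # network output

R2 = out                                              # start: relevance = output
R1 = lrp_epsilon(a1, W2, b2, R2)                      # hidden-layer relevance
R0 = lrp_epsilon(x, W1, b1, R1)                       # input (feature) relevance
print(R0, R0.sum(), out)                              # relevance is roughly conserved
```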


Sampling-Based Explainers

Two of the most important explanation tools fall into this category: SHAP and LIME. We will explain how SHAP works here and keep LIME for a future blog.

SHAP

It is a tool that has proven effective at finding feature importance for DNNs, where by feature importance we mean the contribution of each feature to the output. SHAP is built on Shapley values, but what are Shapley values?

Let's go deeper into this concept...


The Shapley value is a concept from cooperative game theory, introduced by Lloyd Shapley in 1951, for which he won the Nobel Prize in Economics in 2012. It provides a way to fairly distribute a total surplus (or “payoff”) generated by a coalition of players based on each player’s contribution. The Shapley value is valued for its desirable fairness properties.



To explain this with a story: imagine Ann, Bob, and Cindy working together to hammer a 38-inch “error” wood log into the ground. Afterward, they go to a bar, where I, a mathematician, ask a curious question: “How much did each of you contribute to this effort?”

To figure this out, I consider all possible orders of contributions (permutations). When the sequence is Ann, Bob, Cindy, their contributions measure 2, 32, and 4 inches, respectively. By calculating each player’s marginal contribution across all permutations, we can determine a fair distribution of credit—this is the essence of the Shapley value.

For example, when Ann and Bob work together (in either order), they drive the log 34 inches, so Cindy's marginal contribution to that coalition is 4 inches. By averaging each person's marginal contribution across all possible orders, we get each individual's fair share: Ann 2 inches, Bob 32 inches, and Cindy 4 inches. This averaging is exactly how the Shapley value is calculated: the average of a player's marginal contributions across all permutations.
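A minimal sketch of this brute-force averaging in Python, using the story's numbers and the simplifying assumption that the worth of a coalition is just the sum of its members' individual contributions (so the calculation recovers exactly 2, 32, and 4 inches):

```python
# Minimal sketch: Shapley values by brute-force enumeration of permutations.
# Assumes an additive "worth" function built from the story's numbers.
from itertools import permutations

inches = {"Ann": 2, "Bob": 32, "Cindy": 4}            # individual contributions
players = list(inches)

def worth(coalition):
    """Total inches the log is driven by a coalition (additive assumption)."""
    return sum(inches[p] for p in coalition)

shapley = {p: 0.0 for p in players}
orders = list(permutations(players))
for order in orders:
    so_far = []
    for p in order:
        marginal = worth(so_far + [p]) - worth(so_far)   # marginal contribution
        shapley[p] += marginal
        so_far.append(p)

shapley = {p: v / len(orders) for p, v in shapley.items()}
print(shapley)   # {'Ann': 2.0, 'Bob': 32.0, 'Cindy': 4.0}
```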



In machine learning, this concept translates to model interpretation. Imagine the wood log as the “error” log (or loss function): it represents the difference between the model’s actual and predicted values. The “hammers” are the predictors working to minimize this error. The Shapley values then measure each predictor’s contribution to reducing the error.

Since this blog offers a broad overview of XAI tools, we won’t dive into the math just yet. Instead, let’s highlight the key benefits of using SHAP for model interpretation.

Global interpretability

The collective SHAP values show how much each predictor contributes, positively or negatively, to the target variable. This is like a variable importance plot, but it also shows whether each variable's relationship with the target is positive or negative.


Local interpretability

Each observation gets its own set of SHAP values, which greatly increases transparency. We can explain why a case receives its prediction and what each predictor contributed. Traditional variable importance algorithms only report results across the entire population, not for each individual case. Local interpretability lets us pinpoint and contrast the impacts of the factors.


Tree-based models

SHAP values can be calculated efficiently for any tree-based model, whereas some other methods rely on linear or logistic regression surrogate models.
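To ground these three points, here is a minimal usage sketch with the shap library (assuming shap and scikit-learn are installed; the diabetes dataset and the random forest are placeholders for any tree-based model and data):

```python
# Minimal sketch: SHAP on a tree-based model.
# Assumes the `shap` and `scikit-learn` packages are installed.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

data = load_diabetes()
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(data.data, data.target)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data)     # shape: (n_samples, n_features)

# Global view: summary of how each feature pushes predictions up or down
shap.summary_plot(shap_values, data.data, feature_names=data.feature_names)

# Local view: contribution of each feature to one individual prediction
print(dict(zip(data.feature_names, shap_values[0])))
```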


Dimensionality Reduction & Feature Engineering vs. XAI

We’ve noted that both manual and algorithmic dimensionality reduction highlight feature importance, similar to what Explainable AI (XAI) does. But what’s the difference?

Dimensionality reduction is typically used by model builders during the model-building phase to identify and retain only the most essential features, refining the model's structure. XAI, on the other hand, operates primarily in the testing phase and sometimes in the production environment, aiming to increase our trust in the model by explaining its decisions. Beyond this, XAI often provides deeper insights into the structure and feature importance within deep neural networks (DNNs). In some cases, XAI findings may even prompt a redesign or retraining of the model, or adjustments to the data generation process.


Summary

In this blog we explained the importance of XAI and highlighted some explanation methods and libraries that can give us a solid understanding of a target model, without turning this into a full coding tutorial and without covering the relationship between interpretability and causality. We proposed a high-level categorization of explanation methods and explained some of them in detail.

 
 
 
