Explainable artificial intelligence (XAI) has shed light on enormous applications by clarifying why neural models make specific decisions. However, it remains challenging to measure how sensitive XAI solutions are to the explanations of neural models. Although different evaluation metrics have been proposed to measure sensitivity, the main focus has been on the visual and textual data. There is insufficient attention devoted to the sensitivity metrics tailored for time series data. In this paper, we formulate several metrics, including max short-term sensitivity (MSS), max long-term sensitivity (MLS), average short-term sensitivity (ASS) and average long-term sensitivity (ALS), that target the sensitivity of XAI models with respect to the generated and real time series. Our hypothesis is that for close series with the same labels, we obtain similar explanations. We evaluate three XAI models, LIME, integrated gradient (IG), and SmoothGrad (SG), on CN-Waterfall, a deep convolutional network. This network is a highly accurate time series classifier in affect computing. Our experiments rely on data-, metric- and XAI hyperparameter- related settings on the WESAD and MAHNOB-HCI datasets. The results reveal that (i) IG and LIME provide a lower sensitivity scale than SG in all the metrics and settings, potentially due to the lower scale of important scores generated by IG and LIME, (ii) the XAI models show higher sensitivities for a smaller window of data, (iii) the sensitivities of XAI models fluctuate when the network parameters and data properties change, and (iv) the XAI models provide unstable sensitivities under different settings of hyperparameters.
- deep convolutional neural network
- Explainable AI
- time series data