TY - JOUR
T1 - Hydrogen Adsorption on Defective Nitrogen-Doped Carbon Nanotubes Explained via Machine Learning Augmented DFT Calculations and Game-Theoretic Feature Attributions
AU - Kronberg, Rasmus
AU - Lappalainen, Heikki
AU - Laasonen, Kari
PY - 2021/7/29
Y1 - 2021/7/29
N2 - Complex machine learning (ML) models applied within computational chemistry and materials science tend to be seen as black boxes, yielding property predictions given some input features. While the purpose of ML methods is often to circumvent computationally expensive first-principles calculations, the fact that the inner workings of the models are not understood conceals chemical insight and knowledge regarding the underlying data and physical correlations within it. Knowing what a model is learning from the data and how outputs are formed is also useful in facilitating the justification and wider adoption of ML solutions. Here, we present an important contribution in this direction by exploring and explaining the hydrogen adsorption properties of defective nitrogen-doped carbon nanotubes (NCNTs) through density functional theory simulations and machine learning-based data analysis. As the main highlight, we demonstrate the application of a recent game-theoretic approach to deconvolute and interrogate the trained ML models, revealing how various structural, chemical, and electronic features contribute toward the hydrogen affinities of roughly 6500 different NCNT adsorption sites. The employed method of Shapley additive explanations (SHAP) attributes locally accurate importances to the investigated features, unraveling high spin polarization, narrow highest occupied molecular orbital–lowest unoccupied molecular orbital (HOMO–LUMO) gap, small dopant–adsorption site separation, and diverse angle and coordination effects as particularly impactful for increasing hydrogen adsorption strengths. The SHAP method is shown capable of promoting a deep understanding of complex feature–activity relationships, facilitating research efforts such as rational catalyst design for energy conversion applications.
UR - https://github.com/rkronberg/ncnt-random-forest/
UR - http://www.scopus.com/inward/record.url?scp=85111515019&partnerID=8YFLogxK
U2 - 10.1021/acs.jpcc.1c03858
DO - 10.1021/acs.jpcc.1c03858
M3 - Article
VL - 125
SP - 15918
EP - 15933
JO - Journal of Physical Chemistry C
JF - Journal of Physical Chemistry C
SN - 1932-7447
IS - 29
ER -