TY - CHAP
T1 - CalCleaveMKL
T2 - a Tool for Calpain Cleavage Prediction
AU - duVerle, David A.
AU - Mamitsuka, Hiroshi
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Calpain, an intracellular Ca2+-dependent cysteine protease, is known to play a role in a wide range of metabolic pathways through limited proteolysis of its substrates. However, only a limited number of these substrates are currently known, and the exact mechanism of substrate recognition and cleavage by calpain remains largely unknown. Current sequencing technologies have made it possible to compile large amounts of cleavage data and have brought greater understanding of the underlying protein interactions. However, the practical impossibility of exhaustively retrieving substrate sequences through experimentation alone has created the need for efficient computational prediction methods. Such methods must be able to quickly mark substrate candidates and putative cleavage sites for further analysis. While many methods exist for both calpain and other types of proteolytic action, the expected reliability of these methods depends heavily on the type and complexity of proteolytic action, as well as on the availability of well-labeled experimental datasets, both of which vary greatly across enzyme families. This chapter introduces CalCleaveMKL: a tool for calpain cleavage prediction based on multiple kernel learning, an extension of the classic support vector machine framework that can train complex models on rich, heterogeneous feature sets, leading to significantly improved prediction quality. Along with its improved accuracy, the method used by CalCleaveMKL provided numerous insights into the respective importance of sequence-related features, such as solvent accessibility and secondary structure. It notably demonstrated that significant specificity differences exist across calpain subtypes, despite previous assumptions to the contrary. An online implementation of this prediction tool is available at http://calpain.org.
KW - Calpain
KW - Cleavage prediction
KW - Multiple kernel learning
KW - Support vector machines
UR - http://www.scopus.com/inward/record.url?scp=85059926403&partnerID=8YFLogxK
U2 - 10.1007/978-1-4939-8988-1_11
DO - 10.1007/978-1-4939-8988-1_11
M3 - Chapter
C2 - 30617801
AN - SCOPUS:85059926403
SN - 978-1-4939-8987-4
T3 - Methods in molecular biology
SP - 121
EP - 147
BT - Calpain
PB - Springer
ER -