Machine Learning Models
Machine learning methods make it possible to exploit the information contained in protein datasets. We introduced a general-purpose procedure for the numerical encoding of polypeptides. Using this encoding, we developed a support vector machine model, PPI-Detect, which predicts whether two proteins will interact. PPI-Detect outperforms state-of-the-art sequence-based predictors of protein-protein interactions (PPI). Using PPI-Detect, we designed a peptide whose biological activity was subsequently confirmed experimentally. PPI-Detect is freely available at https://ppi-detect.zmb.uni-due.de.
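To illustrate what a numerical encoding of a protein pair can look like, the following minimal Python sketch turns two sequences into one fixed-length vector using plain amino acid composition. This is an assumption made for illustration only: it is not the descriptor set used by PPI-Detect, and the function names `composition` and `encode_pair` and the example sequences are hypothetical. A vector of this kind is the sort of input a support vector machine classifier can be trained on.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard residue in `sequence` (20 values)."""
    counts = Counter(sequence.upper())
    # Non-standard symbols (e.g. X) are ignored when computing the total.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def encode_pair(seq_a, seq_b):
    """Concatenate both composition vectors into one 40-dimensional vector."""
    return composition(seq_a) + composition(seq_b)


# Hypothetical example sequences, not taken from any dataset.
features = encode_pair("MKTAYIAKQR", "GSHMLEDPVA")
print(len(features))
```

Because the vector length does not depend on sequence length, proteins and short peptides can be encoded in the same way. Note that this simple concatenation is order-dependent: swapping the two sequences gives a different vector.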
We also introduced ProtDCal-Suite, a web server comprising a set of tools for studying proteins. Its main module is the program ProtDCal, which encodes the structural information of proteins as vectors suitable for machine learning. The secondary modules are protein analysis tools built on ProtDCal’s descriptors: PPI-Detect, for predicting the interaction likelihood of protein-protein and protein-peptide pairs; Enzyme Identifier, for identifying enzymes from amino acid sequences or 3D structures; and Pred-Nglyco, for predicting N-glycosylation sites. ProtDCal-Suite is freely available at https://protdcal.zmb.uni-due.de.
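As background for the N-glycosylation task, the short Python sketch below lists candidate sites by scanning for the canonical N-X-S/T sequon, where X is any residue except proline. This only enumerates candidates that a predictor would then have to score; it is not the method implemented in Pred-Nglyco, and the function name `candidate_sites` and the example sequence are hypothetical.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead consumes only the N, so overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def candidate_sites(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical example sequence, not taken from any dataset.
print(candidate_sites("MNGTANPSNNTS"))
```

Not every sequon is glycosylated in practice, which is why a trained model is needed to rank these candidates.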