Signal peptide and cleavage site prediction is an important area of bioinformatics because of its contributions to modern cell biology research, the study of molecular mechanisms of disease, and drug discovery. In this paper, we present results for signal peptide and cleavage site prediction using a weight matrix approach with genetic algorithm (GA)-optimized position weight matrix (PWM) profiles, one each for eukaryotic, gram-negative prokaryotic, and gram-positive prokaryotic organisms. Consistency tests yielded overall performance ratings of roughly 97% for signal peptide prediction and approximately 77% for cleavage site prediction at position 0. Cross-validation showed that the GA-optimized profile matrices predicted the presence of signal peptides with an overall accuracy of around 95%. For cleavage site prediction, however, the three optimized profile matrices achieved an overall accuracy of about 72%-74% in predicting the actual cleavage site location. For prokaryotic protein sequences not labeled as gram-negative or gram-positive, the GA-optimized gram-negative PWM profile consistently gave higher success rates in predicting the correct cleavage site location. A comparison with the latest existing profile matrices used for signal peptide and cleavage site prediction showed only a slight improvement in overall performance. Although the improvement is small, it can make a substantial difference when analyzing large datasets or genomic protein sequences.
Proceedings of the IEEE Region 10 International Conference (IEEE TENCON 2012). DOI: 10.1109/TENCON.2012.6412173
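
The weight matrix approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the training windows, the five-residue window width, the uniform background, the pseudocount, and the score threshold are all hypothetical values chosen for illustration, and the GA optimization step is not shown. The sketch builds a log-odds PWM from windows aligned on a cleavage site, slides it along a sequence, and reports the best-scoring position as the predicted cleavage site.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical aligned windows: 3 residues before the cut, 2 after.
TOY_WINDOWS = ["ALAAE", "AQADT", "ASAQP", "VHAAE"]
SITE_OFFSET = 3  # index within the window of the first residue after the cut


def build_pwm(windows, pseudocount=1.0):
    """Build a log-odds PWM from equal-length windows aligned on the cut."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pwm = []
    for pos in range(len(windows[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for window in windows:
            counts[window[pos]] += 1.0
        total = sum(counts.values())
        pwm.append({aa: math.log2((counts[aa] / total) / background)
                    for aa in AMINO_ACIDS})
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores of one window."""
    return sum(column[aa] for column, aa in zip(pwm, window))


def predict_cleavage(pwm, sequence, site_offset, threshold):
    """Slide the PWM along the sequence.

    Returns (has_signal, cut_index, best_score), where cut_index is the
    0-based index of the first residue after the predicted cut.
    """
    width = len(pwm)
    best_score, best_start = float("-inf"), None
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if score > best_score:
            best_score, best_start = score, start
    if best_start is None:  # sequence shorter than the window
        return False, None, best_score
    return best_score >= threshold, best_start + site_offset, best_score


pwm = build_pwm(TOY_WINDOWS)
has_signal, cut, score = predict_cleavage(
    pwm, "MKRLLLLGLVSAQADTPK", SITE_OFFSET, threshold=3.0)
if has_signal:
    print("predicted signal peptide:", "MKRLLLLGLVSAQADTPK"[:cut])
```

In this sketch a single maximum score serves both decisions: whether a signal peptide is present (score at or above the threshold) and where the cleavage site lies (position of the maximum). A GA-optimized profile would replace the count-derived matrix entries with values tuned against labeled data.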