RESEARCH ARTICLE


Protein Secondary Structure Prediction Using RT-RICO: A Rule-Based Approach



Leong Lee1,*, Jennifer L. Leopold1, Cyriac Kandoth1, Ronald L. Frank2
1 Department of Computer Science, Missouri University of Science and Technology, Rolla, MO, USA
2 Department of Biological Sciences, Missouri University of Science and Technology, Rolla, MO, USA





Creative Commons License
© 2010 Lee et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Computer Science, Missouri University of Science and Technology, USA; Tel: +1 (573) 341-4491; E-mail: llkr4@mail.mst.edu


Abstract

Protein structure prediction has long been an important research area in biochemistry, and the prediction of protein secondary structure in particular is a well-studied topic. The experimental methods currently used to determine protein structure are accurate but costly in both equipment and time. Despite the recent breakthrough of combining multiple sequence alignment information with artificial intelligence algorithms to predict protein secondary structure, the Q3 accuracy of computational prediction methods has rarely exceeded 75%. This paper presents a newly developed rule-based data-mining approach called RT-RICO (Relaxed Threshold Rule Induction from Coverings). The method identifies dependencies between amino acids in a protein sequence and generates rules that can be used to predict secondary structure. RT-RICO achieved a Q3 score of 81.75% on the standard test dataset RS126 and a Q3 score of 79.19% on the standard test dataset CB396, an improvement over comparable computational methods.
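The Q3 measure cited above is conventionally defined as the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The following is a minimal Python sketch of that calculation, not the authors' evaluation code. The function name `q3_score` and the example strings are hypothetical and are not drawn from RS126 or CB396.

```python
# Illustrative sketch only: the function name and example data are assumptions,
# not part of the RT-RICO implementation.

VALID_STATES = frozenset("HEC")  # helix, strand, coil


def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose three-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    if not (set(predicted) <= VALID_STATES and set(observed) <= VALID_STATES):
        raise ValueError("labels must be drawn from H, E, C")

    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Made-up six-residue example: five of six labels agree.
    print(f"Q3 = {q3_score('HHHECC', 'HHHEEC'):.2f}%")
```

Under this definition, a dataset-level Q3 is usually obtained by pooling residues across all test proteins or by averaging per-protein scores. The two conventions can give slightly different values, so it matters which one a reported figure uses.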

Keywords: Data mining, Protein secondary structure prediction, Parallelization.