Prediction of splice site using adaBoost with a new sequence encoding approach
Biological sequence data are increasing rapidly, so there is a vital need for effective gene detection methods. Splice site prediction is an important part of gene finding, and attempts to improve the prediction accuracy of computational methods for splice site detection therefore continue. In this paper we propose a hybrid algorithm for splice site prediction that combines an AdaBoost classifier with a novel nucleotide encoding method, FDDM. The encoding provides the frequency difference between true sites and false sites (FD) along with a distance measure (DM). The proposed method improves on the results of current methods such as MM1-SVM, Reduced MM1-SVM, SVM-B, LVMM, DM-SVM, DM2-AdaBoost and MSC+ Pos(+APR)-SVM when applied to the HS3D dataset with repeated 10-fold cross-validation. In addition, to demonstrate the stability of the method, we also applied it to the NN269 dataset. The results obtained indicate that the new method is practicable and efficient.
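The abstract does not spell out how FDDM is computed, so the following is only a minimal sketch of the frequency-difference (FD) idea under a stated assumption: each nucleotide at each position is replaced by the difference between its relative frequency in true sites and in false sites. The distance measure (DM) component is omitted because the abstract does not define it. All function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def position_frequencies(seqs):
    """Relative frequency of each nucleotide at each position of aligned sequences."""
    length = len(seqs[0])
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({n: counts[n] / len(seqs) for n in NUCLEOTIDES})
    return freqs


def fd_table(true_seqs, false_seqs):
    """Per-position frequency difference: freq in true sites minus freq in false sites.

    Assumption: this is one plausible reading of "frequency difference";
    the paper's exact definition may differ.
    """
    f_true = position_frequencies(true_seqs)
    f_false = position_frequencies(false_seqs)
    return [
        {n: t[n] - f[n] for n in NUCLEOTIDES}
        for t, f in zip(f_true, f_false)
    ]


def encode(seq, table):
    """Map a sequence to a numeric vector by looking up each nucleotide's FD value."""
    return [table[i][n] for i, n in enumerate(seq)]


# Toy aligned windows (made up for illustration, not real splice site data).
true_sites = ["AGGT", "AGGT", "CGGT", "AGGA"]
false_sites = ["ATGC", "CTGA", "AGCT", "TTGA"]

table = fd_table(true_sites, false_sites)
features = encode("AGGT", table)
print(features)
```

Vectors produced this way are fixed-length and numeric, so they could be passed to any AdaBoost implementation as training features; positive values mark nucleotides that are more common in true sites at that position, negative values the opposite.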