An algorithm for rapid noncoding RNA sequence-structure alignment
1. College of Electrical Engineering, Zhejiang University, Hangzhou, Zhejiang 310027, China; 2. College of Electronic and Information Engineering, Suzhou Vocational University, Suzhou, Jiangsu 215104, China
Abstract: To improve the computational efficiency of noncoding RNA (ncRNA) sequence search software, the alignment results of family members and the covariance model (CM) were analyzed. The length distribution of structure units in the ncRNA secondary structure was introduced, and a structure unit length limit algorithm was proposed. Based on the length distribution of the structure units in the secondary structure, the numbers of insertions and deletions allowed during evolution were restricted. The proposed approach was implemented in C++, and its performance was tested by searching for several ncRNAs in genomes. The experimental results show that, compared with CM-based search, the new model significantly speeds up the search for ncRNAs in genomes while achieving comparable search accuracy. The speedup is significant for sequences with a large number of nucleotides. On Lin_4, the new approach is 90.76 times faster than the CM-based search algorithm, with the same search accuracy in terms of sensitivity and specificity.
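The idea of limiting structure unit lengths can be illustrated with a minimal C++ sketch. It is not the authors' implementation. It assumes that the limit for one structure unit is taken as the minimum and maximum length observed among family members, widened by a slack value. The names `LengthLimit`, `deriveLimit`, `withinLimit`, `maxInsertions`, `maxDeletions`, and the `slack` parameter are hypothetical and do not appear in the paper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical length bounds for one structure unit (e.g. a stem or a loop).
struct LengthLimit {
    int minLen;  // shortest admissible length
    int maxLen;  // longest admissible length
};

// Derive bounds from the lengths of one structure unit observed across
// family members. The slack term is an assumption of this sketch: it
// widens the observed range to tolerate unseen variation.
LengthLimit deriveLimit(const std::vector<int>& observed, int slack) {
    assert(!observed.empty());
    auto mm = std::minmax_element(observed.begin(), observed.end());
    LengthLimit lim;
    lim.minLen = std::max(0, *mm.first - slack);
    lim.maxLen = *mm.second + slack;
    return lim;
}

// True if a candidate unit length falls inside the admissible range.
bool withinLimit(const LengthLimit& lim, int len) {
    return len >= lim.minLen && len <= lim.maxLen;
}

// Largest number of insertions allowed relative to a consensus length.
int maxInsertions(const LengthLimit& lim, int consensusLen) {
    return std::max(0, lim.maxLen - consensusLen);
}

// Largest number of deletions allowed relative to a consensus length.
int maxDeletions(const LengthLimit& lim, int consensusLen) {
    return std::max(0, consensusLen - lim.minLen);
}
```

For example, with observed lengths 5, 7, and 6 and a slack of 1, the sketch admits lengths 4 through 8, so a unit with consensus length 6 may gain or lose at most two nucleotides. A search procedure could skip any candidate alignment outside these bounds, which is the kind of pruning that would reduce search time.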