Abstract
Translation initiation sites (TISs) are important signals in cDNA sequences. In many previous attempts to predict TISs in cDNA sequences, three major factors affect the prediction performance: the nature of the cDNA sequence sets, the relevant features selected, and the classification methods used. In this paper, we examine different approaches to select and integrate relevant features for TIS prediction. The top selected significant features include the features from the position weight matrix and the propensity matrix, the number of nucleotide C in the sequence downstream ATG, the number of downstream stop codons, the number of upstream ATGs, and the number of some amino acids, such as amino acids A and D. With the numerical data generated from these features, different classification methods, including decision tree, naive Bayes, and support vector machine, were applied to three independent sequence sets. The identified significant features were found to be biologically meaningful, while the experiments showed promising results.
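To make three of the count-based features named in the abstract concrete, the following is a minimal sketch of how they might be computed for one candidate ATG. It is not the authors' implementation. The function name `tis_features`, the `window` length of 99 nucleotides, and the choice to count only in-frame stop codons are all assumptions made for illustration; the abstract does not specify them.

```python
# Illustrative sketch only; names and parameters are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def tis_features(seq, atg_pos, window=99):
    """Return simple count features for the candidate ATG at index atg_pos.

    window is an assumed downstream length in nucleotides, not a value
    taken from the paper.
    """
    seq = seq.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("atg_pos must point at an ATG")

    upstream = seq[:atg_pos]
    downstream = seq[atg_pos + 3:atg_pos + 3 + window]

    # Split the downstream window into codons in the candidate ATG's frame
    # (assumption: only in-frame stop codons are counted).
    codons = [downstream[i:i + 3] for i in range(0, len(downstream) - 2, 3)]

    return {
        "upstream_atgs": upstream.count("ATG"),
        "downstream_c": downstream.count("C"),
        "downstream_inframe_stops": sum(c in STOP_CODONS for c in codons),
    }


if __name__ == "__main__":
    # Toy sequence; the candidate ATG starts at index 5.
    print(tis_features("ATGCCATGGCCTAAC", 5))
```

A vector of such counts per candidate ATG, combined with scores from a position weight matrix and the other features listed above, could then be passed to any of the classifiers the abstract mentions.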
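Publication date
February 12, 2005 (date first posted online on the China Journal Net (中国期刊网) platform; this does not represent the paper's publication date)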