Abstract
The main challenges of data stream classification include infinite length, concept drift, the arrival of novel classes, and a lack of labeled instances. Most existing techniques address only some of these challenges and ignore the others. This paper therefore presents an ensemble classification model based on decision-feedback (ECM-BDF) to address all of them. First, a data stream is divided into sequential chunks, and a classification model is trained from each labeled data chunk. To address the infinite-length and concept-drift problems, a fixed number of such models constitute an ensemble model E, and subsequent labeled chunks are used to update E. To deal with the appearance of novel classes and the limited number of labeled instances, the model incorporates a novel-class detection mechanism that detects the arrival of a novel class without training E on labeled instances of that class. Meanwhile, unsupervised models are trained from unlabeled instances to provide useful constraints for E. With these constraints as feedback information, an extended ensemble model Ex can be obtained, and unlabeled instances can then be classified more accurately by satisfying the maximum consensus of Ex. Experimental results demonstrate that the proposed ECM-BDF outperforms traditional techniques in classifying data streams with limited labeled data.
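The chunk-based ensemble described in the abstract can be illustrated with a minimal sketch. It covers only the fixed-size ensemble E and its update from labeled chunks. Novel-class detection and the constraint feedback that yields Ex are not shown. The abstract does not specify the base learner, the replacement policy, or the voting rule, so everything here is an assumption made for illustration: `CentroidModel`, `ChunkEnsemble`, `max_models`, the "drop the model that is least accurate on the newest chunk" rule, and majority voting are all hypothetical.

```python
from collections import Counter


class CentroidModel:
    """Hypothetical base learner: stores one mean vector (centroid) per class."""

    def __init__(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(self, x):
        # Return the class whose centroid is closest in squared Euclidean distance.
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist2)


def accuracy(model, X, y):
    return sum(model.predict(x) == t for x, t in zip(X, y)) / len(y)


class ChunkEnsemble:
    """Fixed-size ensemble E, updated with one new model per labeled chunk."""

    def __init__(self, max_models=3):
        self.max_models = max_models
        self.models = []

    def update(self, X, y):
        # Train a model on the new labeled chunk and add it to the ensemble.
        self.models.append(CentroidModel(X, y))
        if len(self.models) > self.max_models:
            # Assumed policy: discard the model that scores worst on the
            # newest chunk, so models fitted to an outdated concept drop out.
            worst = min(self.models, key=lambda m: accuracy(m, X, y))
            self.models.remove(worst)

    def predict(self, x):
        # Assumed combination rule: simple majority vote.
        votes = Counter(m.predict(x) for m in self.models)
        return votes.most_common(1)[0][0]


# Usage: one chunk from an old concept, then two chunks after the labels drift.
X = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [4.9, 5.2]]
ens = ChunkEnsemble(max_models=2)
ens.update(X, [1, 1, 0, 0])  # old concept
ens.update(X, [0, 0, 1, 1])  # drifted concept
ens.update(X, [0, 0, 1, 1])  # ensemble is full, the stale model is dropped
```

Bounding the number of models keeps memory constant on an unbounded stream, and replacing the weakest model on each new chunk lets the ensemble follow a changing concept.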
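Publication date
January 11, 2014 (date first posted online on the China Journal Network (中国期刊网) platform; this does not represent the paper's publication date)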