Data streams classification with ensemble model based on decision-feedback

(Whole-issue priority) Online publication date: 2014-01-11
The main challenges of data stream classification include infinite length, concept drift, the arrival of novel classes, and a lack of labeled instances. Most existing techniques address only some of these challenges and ignore the others. This paper therefore presents an ensemble classification model based on decision-feedback (ECM-BDF) to address all of them. First, a data stream is divided into sequential chunks, and a classification model is trained from each labeled data chunk. To address the infinite-length and concept-drift problems, a fixed number of such models constitute an ensemble model E, and subsequent labeled chunks are used to update E. To deal with the appearance of novel classes and the limited number of labeled instances, the model incorporates a novel class detection mechanism that detects the arrival of a novel class without training E with labeled instances of that class. Meanwhile, unsupervised models are trained from unlabeled instances to provide useful constraints for E. An extended ensemble model Ex can be acquired with the constraints as feedback information, and unlabeled instances can then be classified more accurately by satisfying the maximum consensus of Ex. Experimental results demonstrate that the proposed ECM-BDF outperforms traditional techniques in classifying data streams with limited labeled data.
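The chunk-based ensemble idea can be illustrated with a minimal Python sketch. It is not the ECM-BDF implementation: the abstract does not specify the base learner, the rule for updating E, or the novel class detector, so everything below is an assumption made for illustration. The names `CentroidModel`, `ChunkEnsemble`, `max_models`, and `novelty_radius` are hypothetical, the base learner is a toy nearest-centroid classifier, the update rule simply drops the model with the most errors on the newest labeled chunk, and an instance is flagged as a possible novel-class candidate when it lies far from every centroid of every model. The unsupervised constraints and the extended ensemble Ex are not modeled here.

```python
import math
from collections import Counter


class CentroidModel:
    """Toy base learner (assumption): one centroid per class in a labeled chunk."""

    def __init__(self, chunk):
        # chunk is a list of (feature_tuple, label) pairs
        sums, counts = {}, Counter()
        for x, y in chunk:
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(self, x):
        # Label of the nearest class centroid
        return min(self.centroids, key=lambda y: math.dist(x, self.centroids[y]))

    def min_distance(self, x):
        # Distance to the closest centroid of any class
        return min(math.dist(x, c) for c in self.centroids.values())


class ChunkEnsemble:
    """Fixed-size ensemble E updated with each new labeled chunk (hypothetical rule)."""

    def __init__(self, max_models=3, novelty_radius=2.0):
        self.max_models = max_models          # fixed ensemble size (assumed value)
        self.novelty_radius = novelty_radius  # outlier threshold (assumed value)
        self.models = []

    def update(self, labeled_chunk):
        # Train a model on the new chunk, then keep the best max_models models,
        # judged by error count on that newest chunk.
        candidates = self.models + [CentroidModel(labeled_chunk)]
        if len(candidates) > self.max_models:
            def errors(model):
                return sum(model.predict(x) != y for x, y in labeled_chunk)
            candidates.remove(max(candidates, key=errors))
        self.models = candidates

    def is_outlier(self, x):
        # Candidate for a novel class: far from every known class in every model
        return all(m.min_distance(x) > self.novelty_radius for m in self.models)

    def predict(self, x):
        # Majority vote over the ensemble members
        votes = Counter(m.predict(x) for m in self.models)
        return votes.most_common(1)[0][0]
```

Bounding the number of models keeps memory use constant on an unbounded stream, and discarding the model that fits the newest chunk worst lets the ensemble follow a drifting concept. A real detector would also require several mutually close outliers before declaring a novel class, since a single distant point may only be noise.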