Abstract: This paper proposes a compression algorithm for reducing the size of sensor data. Using a dictionary-based lossless compression algorithm, sensor data can be compressed efficiently and interpreted without decompression. The correlation between the redundancy of sensor data and the compression ratio is explored. Further, a parallel compression algorithm based on MapReduce [1] is proposed. The data partitioner, which plays an important role in the performance of MapReduce applications, is also discussed, along with the performance evaluation criteria proposed in this paper. Experiments demonstrate that a random sampler is suitable for highly redundant sensor data and that the proposed compression algorithms can compress such data efficiently.
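As a rough illustration of the dictionary-based idea in the abstract above, the following minimal sketch maps each distinct sensor reading to a small integer code and then answers a simple query directly on the encoded form. It is not the paper's algorithm: the function names (`dict_encode`, `dict_decode`, `count_value`) and the choice of a per-value dictionary are assumptions made only for illustration.

```python
def dict_encode(readings):
    """Map each distinct reading to an integer code, in first-seen order.

    Hypothetical helper: illustrates dictionary encoding in general,
    not the specific scheme proposed in the paper.
    """
    dictionary = {}
    codes = []
    for reading in readings:
        # A new reading gets the next unused code; a repeat reuses its code.
        code = dictionary.setdefault(reading, len(dictionary))
        codes.append(code)
    return dictionary, codes


def dict_decode(dictionary, codes):
    """Recover the original readings exactly (the encoding is lossless)."""
    inverse = {code: value for value, code in dictionary.items()}
    return [inverse[code] for code in codes]


def count_value(dictionary, codes, value):
    """Count occurrences of a reading by looking only at the codes."""
    code = dictionary.get(value)
    return 0 if code is None else codes.count(code)


readings = ["21.5", "21.5", "21.6", "21.5"]
dictionary, codes = dict_encode(readings)
print(codes)
print(count_value(dictionary, codes, "21.5"))
```

Highly redundant data yields a small dictionary and many repeated codes, which is one intuition for why redundancy and compression ratio would be related.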
Abstract: Privacy-preserving data mining algorithms are crucial for the analysis of personal data, such as medical and financial records. This paper focuses on feature selection and proposes a new privacy-preserving distributed algorithm that selects features based on differential privacy and the Gini index under the MapReduce framework. A theoretical analysis of the privacy guarantee is also presented. Experiments are conducted on benchmark datasets; the simulation results indicate that, when selecting important features, the proposed algorithm preserves private information to a certain extent, with lower time cost than its centralized counterpart.
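To make the combination of the Gini index and differential privacy in the abstract above more concrete, here is a minimal single-machine sketch. It assumes one common approach, adding Laplace noise of scale 1/epsilon to per-class counts before computing the weighted Gini index; the abstract does not say whether the paper does this, and the names `laplace_noise`, `gini_impurity`, and `noisy_gini_index` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def gini_impurity(counts):
    """Gini impurity 1 - sum(p_k^2) for a list of class counts."""
    total = sum(counts)
    if total <= 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def noisy_gini_index(table, epsilon, rng):
    """Weighted Gini index of a feature, computed from noisy counts.

    table maps each feature value to a list of per-class counts.
    Assumption: each count receives Laplace noise of scale 1/epsilon,
    and negative noisy counts are clamped to zero.
    """
    noisy = {
        value: [max(0.0, c + laplace_noise(1.0 / epsilon, rng)) for c in counts]
        for value, counts in table.items()
    }
    n = sum(sum(counts) for counts in noisy.values())
    if n <= 0:
        return 0.0
    return sum(sum(counts) / n * gini_impurity(counts) for counts in noisy.values())


rng = random.Random(0)
table = {"yes": [40, 10], "no": [5, 45]}  # made-up counts for illustration
print(noisy_gini_index(table, epsilon=1.0, rng=rng))
```

In a MapReduce setting, the per-value class counts would plausibly be produced by mappers and summed in reducers before noise is added, but that division of work is also an assumption here. A lower index indicates a feature that separates the classes better, so features could be ranked by this score.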