Automatic Arabic Document Classification via kNN

Abstract: Many algorithms have been implemented for the problem of document categorization. The majority of the work in this area has been done for English text, while very few approaches have been introduced for Arabic text. The nature of Arabic text differs from that of English text, and preprocessing Arabic text is more challenging. This is because Arabic is a highly inflectional and derivational language, which makes document mining a hard and complex task. In this paper, we present an automatic Arabic document classification system based on the kNN algorithm. We also develop an approach to the keyword extraction and reduction problems using the Document Frequency (DF) threshold method. The results indicate that the kNN-based system outperforms the other existing systems in dealing with Arabic text. The proposed system reached a micro-recall score of 0.95 on 850 Arabic texts in 6 different categories.
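The pipeline named in the abstract (DF-threshold term reduction followed by kNN classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, raw term-frequency weights, cosine similarity, the values `min_df=2` and `k=3`, the function names, and the six-document toy corpus are all assumptions made for illustration, and no Arabic-specific preprocessing (stemming, normalization, stop-word removal) is attempted.

```python
from collections import Counter
import math


def select_terms_by_df(tokenized_docs, min_df=2):
    """Keep only terms whose document frequency is at least min_df."""
    df = Counter()
    for tokens in tokenized_docs:
        df.update(set(tokens))  # count each term once per document
    return {term for term, count in df.items() if count >= min_df}


def to_vector(tokens, vocabulary):
    """Raw term-frequency vector restricted to the selected vocabulary."""
    return Counter(t for t in tokens if t in vocabulary)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query_tokens, train_vectors, train_labels, vocabulary, k=3):
    """Label a document by majority vote among its k most similar neighbours."""
    query = to_vector(query_tokens, vocabulary)
    scored = sorted(
        ((cosine(query, vec), label) for vec, label in zip(train_vectors, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus (not the paper's 850-document data set).
train = [
    ("كرة القدم مباراة فريق", "sports"),
    ("فريق كرة مباراة هدف", "sports"),
    ("لاعب فريق مباراة", "sports"),
    ("بنك سوق اقتصاد", "economy"),
    ("سوق اقتصاد أسعار بنك", "economy"),
    ("اقتصاد بنك استثمار", "economy"),
]
tokenized = [text.split() for text, _ in train]  # naive whitespace tokenizer
labels = [label for _, label in train]

vocabulary = select_terms_by_df(tokenized, min_df=2)
train_vectors = [to_vector(tokens, vocabulary) for tokens in tokenized]

sports_guess = knn_classify("مباراة كرة فريق".split(), train_vectors, labels, vocabulary)
economy_guess = knn_classify("بنك اقتصاد".split(), train_vectors, labels, vocabulary)
print(sorted(vocabulary), sports_guess, economy_guess)
```

In this toy run the DF threshold discards every term that occurs in only one document, so the neighbours are compared on the six remaining terms. On a real corpus the threshold and `k` would need tuning, and the choice of similarity measure and term weighting would affect the result.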
Institution: not stated
Publication date: 12 February 2008 (the date the record first went online on the 中国期刊网 platform, not the date the paper was published)