Supervised by: Ministry of Industry and Information Technology of the People's Republic of China
Sponsored by: Harbin Institute of Technology
Editor-in-chief: Yu Zhou
ISSN: 1005-9113
CN: 23-1378/T

Related citation: CONG Shuai, ZHANG Ji-bin, XU Zhi-ming, WANG Yu-ying. Feature selection algorithm for text classification based on improved mutual information[J]. Journal of Harbin Institute of Technology (New Series), 2011, 18(3): 144-148. DOI: 10.11916/j.issn.1005-9113.2011.03.027.
Feature selection algorithm for text classification based on improved mutual information
Author Name     Affiliation
CONG Shuai      School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
ZHANG Ji-bin    School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
XU Zhi-ming     School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
WANG Yu-ying    School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract:
To address the poor text classification performance obtained with the traditional mutual information (MI) formula, a feature selection algorithm based on improved mutual information is proposed. The algorithm builds on earlier improved MI methods, which enhance the MI value of negative features and account for feature frequency, and adds the concepts of concentration degree and dispersion degree. Formulas that embody concentration degree and dispersion degree are constructed, and the improved mutual information is implemented on their basis. The resulting feature selection algorithm is applied to a text classifier based on Biomimetic Pattern Recognition and compared with several other feature selection methods. The experimental results show that the improved method greatly outperforms traditional mutual information feature selection and also performs better than information gain. By introducing concentration degree and dispersion degree, the improved mutual information feature selection method greatly improves the performance of the text classification system.
Key words: text classification; feature selection; improved mutual information; Biomimetic Pattern Recognition
DOI: 10.11916/j.issn.1005-9113.2011.03.027
CLC number: TP391.1
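
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of traditional mutual information feature selection, not the authors' improved algorithm. The abstract does not give the concentration degree or dispersion degree formulas, so they are not reproduced here. The function names `mutual_information` and `select_features`, the document-presence counting, and the max-over-classes ranking are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def mutual_information(docs, labels):
    """Score each (term, class) pair with traditional pointwise MI.

    docs   : list of token lists (hypothetical input format)
    labels : list of class labels, one per document
    Only pairs that co-occur at least once are scored.
    """
    n = len(docs)
    class_count = Counter(labels)   # documents per class
    term_count = Counter()          # documents containing each term
    joint_count = Counter()         # documents of a class containing a term
    for tokens, cls in zip(docs, labels):
        for term in set(tokens):    # presence, not frequency
            term_count[term] += 1
            joint_count[(term, cls)] += 1

    scores = {}
    for (term, cls), a in joint_count.items():
        # MI(t, c) = log( P(t, c) / (P(t) * P(c)) )
        scores[(term, cls)] = math.log(
            a * n / (term_count[term] * class_count[cls])
        )
    return scores


def select_features(docs, labels, k):
    """Keep the k terms with the highest MI over any class (assumed ranking rule)."""
    best = defaultdict(lambda: float("-inf"))
    for (term, _cls), score in mutual_information(docs, labels).items():
        best[term] = max(best[term], score)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return ranked[:k]
```

Because this score ignores how often a term occurs, a term seen in a single document of a class scores as high as one seen in every document of that class. This is one commonly cited weakness of traditional MI, and it is consistent with the abstract's mention of feature frequency in earlier improved methods.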