Two New Feature Extraction Methods for Text Classification: TESDF and SADF


Kılıç E., Ateş N., Karakaya A., Şahin D. Ö.

23rd Signal Processing and Communications Applications Conference (SIU), Malatya, Türkiye, 16 - 19 May 2015, pp.475-478

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1109/siu.2015.7129862
  • City of Publication: Malatya
  • Country of Publication: Türkiye
  • Page Numbers: pp.475-478
  • Keywords: text classification, term weighting, inverse document frequency
  • Affiliated with Ondokuz Mayıs University: Yes

Abstract

In this study, two new document weighting methods are proposed, based on the term frequency-inverse document frequency (TF-IDF) scheme commonly used in text mining. In addition, the insignificance of verbs in text classification is put forward and tested as a new pre-processing step. The proposed methods gave better results than the other method they were compared with. It was also observed that omitting the verbs from the texts hardly changed the performance rate while reducing the size of the data processed.
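
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of standard TF-IDF weighting only, not of the TESDF or SADF methods, whose formulas are not given in this record. The function name `tf_idf` and the `drop` parameter are hypothetical. `drop` stands in for the verb-omission step and assumes the verbs have already been identified by some other means, such as a part-of-speech tagger.

```python
import math
from collections import Counter


def tf_idf(docs, drop=frozenset()):
    """Standard TF-IDF weights for a list of tokenized documents.

    docs: list of token lists.
    drop: tokens to omit before weighting (hypothetical stand-in for
          the verb-removal pre-processing step described in the abstract).
    Returns one {term: weight} dict per document.
    """
    # Pre-processing: remove the tokens listed in `drop`.
    kept = [[t for t in doc if t not in drop] for doc in docs]
    n_docs = len(kept)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in kept:
        df.update(set(doc))

    weights = []
    for doc in kept:
        tf = Counter(doc)
        # Relative term frequency times log inverse document frequency.
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


docs = [["cat", "sat", "mat"], ["dog", "sat", "log"]]
plain = tf_idf(docs)
no_verbs = tf_idf(docs, drop={"sat"})
```

In this toy corpus the term "sat" occurs in every document, so its standard TF-IDF weight is zero. Passing it in `drop` removes it from the vocabulary altogether, which shrinks the data to be processed.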