Comparison of Term Weighting Techniques in Spam SMS Detection


Kural O. E., Demirci S.

28th Signal Processing and Communications Applications Conference (SIU), ELECTR NETWORK, 5-7 October 2020

  • Publication Type: Conference Paper / Full-Text Paper
  • Country of Publication: ELECTR NETWORK
  • Keywords: term reweighting, spam analysis
  • Affiliated with Ondokuz Mayıs Üniversitesi: Yes

Abstract

Short message services (SMS) are among the most widely used communication services. The growing use of mobile devices and the lowering of SMS costs by operators keep them popular. However, this popularity causes tens of users to be exposed to spam SMS every day. Spam can be defined simply as messages that users do not want. Although organizations take measures against spam SMS and spam SMS filtering systems are widely used, the problem is becoming more widespread. Many studies in the literature address the detection of spam SMS, but new and efficient methods are still needed. In this study, the TF-IDF and RF term weighting methods, both frequently used in text mining applications, were compared for classifying spam SMS and for making more meaningful use of the limited content of SMS messages. The vectors obtained from the data set were weighted with each of the two methods and then classified with five classifiers that are popular in this field.
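
To make the difference between the two weighting schemes concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that RF stands for relevance frequency in its commonly cited form, log2(2 + a / max(1, c)), where a and c count the positive-class and other-class messages that contain the term, and it uses raw term frequency with log(N / df) as one common TF-IDF variant. The abstract does not state which variants were used. The toy corpus and the function names doc_freq, tf_idf and tf_rf are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus for illustration; not taken from the paper's data set.
docs = [
    ("win a free prize now", "spam"),
    ("free entry win cash", "spam"),
    ("are you free for lunch", "ham"),
    ("free to see you at lunch", "ham"),
]

tokenized = [(text.split(), label) for text, label in docs]
n_docs = len(tokenized)


def doc_freq(term, label=None):
    """Count messages (optionally of one class) that contain the term."""
    return sum(
        1
        for tokens, lab in tokenized
        if term in tokens and (label is None or lab == label)
    )


def tf_idf(term, tokens):
    """Raw term frequency times log(N / df); an assumed TF-IDF variant."""
    tf = Counter(tokens)[term]
    df = doc_freq(term)
    return tf * math.log(n_docs / df) if df else 0.0


def tf_rf(term, tokens, positive="spam"):
    """Raw term frequency times relevance frequency (assumed meaning of RF)."""
    tf = Counter(tokens)[term]
    a = doc_freq(term, positive)  # positive-class messages containing the term
    c = doc_freq(term) - a        # other-class messages containing the term
    return tf * math.log2(2 + a / max(1, c))


first_message = tokenized[0][0]
for term in ("win", "free"):
    print(term, tf_idf(term, first_message), tf_rf(term, first_message))
```

In this toy corpus "free" occurs in every message, so its TF-IDF weight is zero, while the class-aware RF weight stays positive; "win" occurs only in spam messages and receives the highest RF weight. Under these assumptions, that is the kind of contrast the comparison in the study is about: TF-IDF ignores class labels, whereas RF uses them.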