Journal ID : TRKU-10-12-2020-11359
[This article belongs to Volume - 63, Issue - 01]


Abstract :

The popularity of the Short Message Service (SMS) has created a propitious environment for spamming. Unfortunately, SMS spam filters are not as easy to develop as email filters, because text messages have a fixed, short length and are laden with noisy elements such as slang and symbols. These properties inhibit effective training and classification by the machine learning algorithms deployed for spam filtering. This study therefore proposes an enhanced SMS spam filter model that selects the best features from a text pre-processing module. The module uses lexicographic and semantic dictionaries to normalize and expand incoming messages, with a view to minimizing noise and countering the brevity of short text messages. A hybrid SMS spam filter model was developed, comprising a text pre-processing section, a feature selection section, and a machine training and classification section. The model was simulated using the Scikit-learn library on the Python programming platform. Evaluation was done using a confusion matrix, and the Wilcoxon signed-rank test was used to determine the superiority of the proposed technique. A combination of ten machine learning algorithms was employed for validation. The study concluded that applying feature selection techniques to normalized and expanded SMS messages enhanced the performance of machine learning algorithms in classifying SMS messages as either ham or spam.
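
The pre-processing and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. This is not the study's implementation: the `SLANG` and `SYNONYMS` tables are small hypothetical stand-ins for the lexicographic and semantic dictionaries, the function names are invented for illustration, and the feature selection and Scikit-learn classification stages are omitted.

```python
import re

# Hypothetical lookup tables; the study's actual dictionaries are not given here.
SLANG = {"u": "you", "2": "to", "txt": "text", "pls": "please", "msg": "message"}
SYNONYMS = {"free": ["complimentary"], "win": ["gain"], "prize": ["award"]}


def normalize(message):
    """Lowercase, drop symbols, and replace slang tokens with dictionary forms."""
    tokens = re.findall(r"[a-z0-9]+", message.lower())
    return [SLANG.get(tok, tok) for tok in tokens]


def expand(tokens):
    """Append related terms for each known token to counter message brevity."""
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(SYNONYMS.get(tok, []))
    return expanded


def confusion_matrix(actual, predicted, positive="spam"):
    """Count true/false positives and negatives for a binary ham/spam task."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts
```

In a fuller pipeline, the expanded token lists would presumably be vectorized, passed through a feature selector, and fed to each classifier, with the confusion matrix computed on held-out predictions.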
