Journal ID : TRKU-28-06-2020-10833
[This article belongs to Volume - 62, Issue - 06]

Title : Dimension Reduction Using Core and Reduct to Improve Fuzzy C-Means Clustering Performance

Abstract :

Finding hidden patterns in large-volume data is very difficult. The complexity and computational time of analyzing large volumes of data to obtain important information depend heavily on the number of records and variables in the dataset. Big data also overlaps with the problem of incomplete data. This study aims to develop a fast and efficient data clustering method for big data that is sensitive to missing values. The clustering is built on the fuzzy c-means method, which can accommodate incomplete data by calculating the datum expertise in the dataset. Dimension reduction is applied to reduce the number of dimensions in a dataset while retaining its important information. Core and Reduct, a concept from rough set theory, was chosen to reduce the dataset and leave only its core. Core and Reduct are applied to find core data patterns and to select the important variables in the data. The results show that applying Core and Reduct in fuzzy c-means clustering can shorten the computational time and reduce the objective function value until only 43.49% of it remains. At the same time, the quality of the resulting clusters improves, with relatively unchanged purity and far better accuracy. Combined, these advantages give the method better performance than standard fuzzy c-means clustering.
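For readers unfamiliar with the baseline, the following is a minimal sketch of standard fuzzy c-means on complete numeric data, written with NumPy. It is not the authors' implementation: it omits both the missing-value handling and the Core and Reduct step described in the abstract, and the function name `fuzzy_c_means` and its parameters (`m`, `n_iter`, `tol`, `seed`) are assumptions made for illustration. It shows the objective function whose value the abstract reports as reduced, namely the sum of memberships raised to the fuzzifier `m` times squared distances to the cluster centers.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative standard fuzzy c-means (complete data, Euclidean distance).

    Returns (centers, U, objective), where U[i, k] is the membership of
    sample i in cluster k and every row of U sums to 1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Random initial membership matrix, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    objective = np.inf
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Squared Euclidean distances, shape (n_samples, n_clusters).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero

        # Objective J = sum_i sum_k u_ik^m * d_ik^2, evaluated with the
        # current memberships and the freshly updated centers.
        new_objective = float((W * d2).sum())

        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)

        converged = abs(objective - new_objective) < tol
        objective = new_objective
        if converged:
            break

    return centers, U, objective
```

Under this sketch, a variable-selection step such as Core and Reduct would act before the call, for example by passing only the selected columns of `X`. Because the objective sums squared distances over all retained variables, comparisons of its value between the full and the reduced dataset should be read with that in mind.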
