Comparison of the Support Vector Machine and K-Nearest Neighbor Classification Methods for Predicting Diabetes Mellitus
Abstract
This study compares the performance of the Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) classification methods in predicting diabetes mellitus. It takes a machine learning-based experimental approach, using a secondary dataset from Kaggle that comprises 70,000 patient records and 34 variables, most of which are clinical measurements and individual health characteristics. The data were split into training and test sets in an 80:20 ratio so that the models received sufficient training data while the test set stayed held out for objective evaluation. Model performance was evaluated using accuracy, precision, recall, and F1-score. SVM delivered the better performance, with an accuracy of 78.21%, whereas KNN reached only 47.23%. SVM also showed more stable predictions across most classes. These findings indicate that SVM is more effective for predicting diabetes mellitus on datasets with a relatively large number of features and a complex class structure.
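The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and shows an 80:20 split and the four reported metrics. The helper names `split_80_20` and `classification_metrics` are hypothetical, and the macro averaging over classes is an assumption, since the abstract does not state how per-class scores were combined. The labels in the example are toy values, not data from the study.

```python
import random
from collections import Counter


def split_80_20(records, seed=0):
    """Shuffle a copy of the records and split it 80:20 (hypothetical helper)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the abstract does not specify it.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Placeholder row indices standing in for the 70,000 patient records.
train_idx, test_idx = split_80_20(range(70_000))
print(len(train_idx), len(test_idx))

# Toy three-class labels, for illustration only.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
print(classification_metrics(toy_true, toy_pred))
```

In a setup like this, each classifier would be fitted on the training portion only, and its predictions on the held-out portion would be passed to `classification_metrics`, so that both models are scored on the same unseen records.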