Using Supervised and Unsupervised Machine Learning Models to Analyze Students’ Academic Performance
Osondu Everestus Oguike1, Emmanuel Chukwudi Ukekwe2, Gabriel Abiodun Elufidodo3

1Osondu Everestus Oguike, Department of Computer Science, University of Nigeria, Nsukka, Nigeria.

2Emmanuel Chukwudi Ukekwe, Department of Computer Science, University of Nigeria, Nsukka, Nigeria.

3Gabriel Abiodun Elufidodo, Department of Computer Science, University of Nigeria, Nsukka, Nigeria.

Manuscript received on 24 July 2024 | Revised Manuscript received on 05 August 2024 | Manuscript Accepted on 15 September 2024 | Manuscript published on 30 September 2024 | PP: 1-6 | Retrieval Number: 100.1/ijsce.D36401404092 | DOI: 10.35940/ijsce.D3640.14040924

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The examination result repositories that most universities maintain can serve as machine learning datasets for training models that yield insights into student performance. These datasets can train multiple linear regression models to predict a student’s cumulative grade point average (CGPA) or the score that a student will obtain in specific courses. Classification-based supervised machine learning models can use the same datasets to predict the class result that a student will obtain. These insights can be invaluable for academic advising and early intervention. The datasets can also train clustering-based unsupervised machine learning models, such as the K-means clustering model, to show how student results group into clusters. This information can be crucial for planning and for evaluating the quality of the university. This paper uses the dataset of undergraduate students’ examination results from the Department of Computer Science at the University of Nigeria, Nsukka, to train three supervised machine learning models and one unsupervised machine learning model, using Jupyter Notebook as the Python development environment. The training results showed acceptable accuracies of 91.5% for the Naïve Bayes model and 95.1% for the Decision Tree model. The linear regression model demonstrated a negligible root mean square error of 8.23 × 10⁻¹⁸, while the K-means clustering model exhibited an acceptable Silhouette metric of 0.12.

Keywords: Naïve Bayes Model, Decision Tree Model, K-means Clustering Model, Linear Regression, Students’ Academic Performance.
Scope of the Article: Artificial Intelligence
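
The abstract reports a root mean square error for the regression model and a Silhouette metric for the K-means model. As a reading aid, the following is a minimal sketch of how these two quantities are commonly defined, written in plain Python. The function names (`rmse`, `silhouette_score`) and the sample points are assumptions made for illustration; they are not taken from the paper, and the paper's own computation may differ (for example, it may rely on a library implementation).

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance.

    For each point:
      a = mean distance to the other members of its own cluster
      b = smallest mean distance to the members of any other cluster
      s = (b - a) / max(a, b)
    Points in singleton clusters are given a score of 0.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # The point's distance to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical two-feature points in two well-separated groups.
    sample_points = [(1.0, 1.2), (1.1, 0.9), (4.0, 4.1), (4.2, 3.9)]
    sample_labels = [0, 0, 1, 1]
    print("silhouette:", silhouette_score(sample_points, sample_labels))
    print("rmse:", rmse([3.2, 4.1, 2.8], [3.0, 4.3, 2.9]))
```

Under this definition the silhouette coefficient ranges from -1 to 1: values close to 1 indicate compact, well-separated clusters, and values close to 0 indicate clusters that overlap.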