- Research
- Open access
Predictive models and determinants of mortality among T2DM patients in a tertiary hospital in Ghana, how do machine learning techniques perform?
BMC Endocrine Disorders volume 25, Article number: 9 (2025)
Abstract
Background
The increasing prevalence of type 2 diabetes mellitus (T2DM) in low- and middle-income countries calls for preventive public health interventions. Studies from Africa, including those from Ghana, consistently reveal high T2DM-related mortality rates. While previous research in the Ho municipality has primarily examined risk factors, comorbidity, and quality of life of T2DM patients, this study specifically investigated mortality predictors among these patients.
Method
The study was a retrospective review of the medical records of T2DM patients. Data extracted included mortality outcome (dead or alive), sociodemographic characteristics (age, sex, marital status, educational level, occupation and location), family history of diseases (diabetes, cardiovascular disease (CVD), or asthma), lifestyle (smoking and alcohol intake), comorbidities (such as skin infections, sickle cell disease, urinary tract infections, and pneumonia) and complications of diabetes (CVD, nephropathy, neuropathy, foot ulcers, and diabetic ketoacidosis). The data were analyzed using Stata version 16.0 and the Python 3.6.1 programming language. Descriptive statistics were used to summarize the data and inferential statistics were used to build predictive models. The performance of machine learning (ML) techniques, namely support vector machine (SVM), decision tree, k nearest neighbor (kNN), eXtreme Gradient Boosting (XGBoost) and logistic regression, was evaluated using the best-fitting predictive model for T2DM mortality.
Results
Of the 328 participants, 183 (55.79%) were female, and mortality was 11.28%. All T2DM patients with sepsis died (100% mortality; p-value = 0.012). T2DM in-patients with nephropathy had 3.83 times the odds of death [AOR = 3.83; 95% CI: (1.53–9.61)] compared to T2DM in-patients without nephropathy (p-value = 0.004). The full model, which included sociodemographic characteristics, family history, lifestyle variables and complications of T2DM, had the best prediction of T2DM mortality outcome (area under the ROC curve = 72.97%). The accuracies on the test and train datasets, respectively, were 90% and 90% for logistic regression, 100% and 100% for the decision tree classifier, 90% and 90% for the kNN classifier, 90% and 88% for SVM, and 88% and 90% for XGBoost.
Conclusion
This study found that all in-patients with sepsis died. Nephropathy was the only significant predictor of T2DM mortality identified. The decision tree classifier provided the best classifying potential.
Introduction
The rise in Non-Communicable Diseases (NCDs), notably Type 2 Diabetes Mellitus (T2DM), in developing nations coupled with the challenges faced by healthcare systems in managing this growing burden [1], highlights the urgent need for preventive public health measures [2]. This concerning trend of rising T2DM cases is especially evident among two specific demographic groups: older adults and obese young individuals [3,4,5]. The reasons behind this trend could be biological or closely linked to shifts in lifestyle and economic improvements [4, 6]. In Ghana, the reported prevalence of T2DM stands at 3.95% among individuals aged 50 years or older [4].
It is crucial to emphasize that T2DM, along with its associated complications such as cardiovascular diseases (CVDs) and chronic kidney disease (CKD), significantly contribute to mortality rates [7,8,9]. In fact, T2DM alone ranks ninth among the leading causes of death worldwide, resulting in over 1 million deaths annually [2]. This burden is mirrored in Africa, as exemplified by Nigeria, where mortality and case fatality rates of T2DM are reported at 30.2 per 100,000 population and 22.0%, respectively [10]. A study conducted in Ghana further underscores this trend, revealing that over a 31-year period (1983–2012), hospitalized case fatality rates due to diabetic conditions surged from 7.6 per 1000 deaths to 30 per 1000 deaths [11]. The authors of the study also noted an average of 18.5% of deaths occurring approximately every 28 days [11].
In the Ho Municipality, 511 new diabetes cases were reported in 2021 [12]. While efforts are made to improve the well-being of individuals living with T2DM in the Municipality, the research focus has primarily centered on addressing various associated risk factors, comorbidity, and enhancing their quality of life [13,14,15]. However, a noticeable gap in the existing body of knowledge pertains to the exploration of predictors of mortality among T2DM patients in the Volta Region. Evidence suggests that diabetes-related complications, such as stroke, nephropathy, neuropathy, and retinopathy, as well as comorbidities, lifestyle factors like smoking, and a family history of cardiovascular disease (CVD), are strongly associated with an increased risk of mortality [16,17,18,19]. Additionally, machine learning (ML) techniques have demonstrated superior accuracy in predicting T2DM outcomes compared to conventional predictive methods [20]. The current study thus sought to determine how T2DM in-patients’ family medical history, lifestyle and complications from T2DM predict mortality, and to evaluate the predictive potential of ML techniques such as Decision Tree (DT), k Nearest Neighbour (kNN), Support Vector Machine (SVM) and XGBoost.
Materials and methods
Study design
The study retrospectively analyzed the medical records of Type 2 Diabetes Mellitus (T2DM) patients treated at the Ho Teaching Hospital (HTH) between January 2017 and November 2022.
Study area
The study was carried out at HTH. The hospital is the main referral center in the Volta Region. HTH is the fifth public Teaching Hospital in Ghana and serves the needs of the region and beyond. It has seven directorates (Medical Affairs; Administration & Support Services; Nursing Administration; Human Resources; Research, Innovation, Planning, Monitoring and Evaluation; Finance; and Pharmacy) [21]. The hospital has a capacity of over 300 beds to cater for the health needs of patients [21]. The clinical departments in the facility include Internal Medicine, Surgical, Obstetrics & Gynecology, Child Health and Public Health [21]. The diabetic services for patients include consulting with the dietician and visiting the general clinic. Specific investigations such as FBG, BMI, lipid profile, urine glucose, kidney function test, and liver function test are carried out, in addition to checking for compliance with medication and monitoring comorbidities and complications needing specialist care.
Study population
The study population comprised all the accessible medical records of T2DM in-patients, both male and female, aged 18 years and older, who received healthcare services at HTH between January 2017 and November 2022.
Inclusion and exclusion criteria
Inclusion criteria
The electronic and manual medical records of in-patients aged 18 years and above who had complete sociodemographic characteristics data as well as lifestyle variables, complications of T2DM and mortality outcome within the stipulated period for the study (January 2017 to November 2022) were included in the study.
Exclusion criteria
Patients with Type 1 diabetes mellitus, gestational diabetes, and T2DM out-patients, as well as T2DM in-patients whose data on the lifestyle variables, complications of T2DM and mortality outcome could not be found in the medical records (electronic and manual), were excluded.
Sample size
The sample size for the study was determined by using the Cochran formula [22] for cross-sectional studies: \(n=\frac{Z_{\alpha/2}^{2}\,p\left(1-p\right)}{e^{2}}\),
Where:
n is the estimated sample size,
Zα/2 is the reliability coefficient (1.96 at the 95% confidence level),
p is the national mortality (3.39%), and
e is the margin of error allowable for this study (5%).
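By substituting the figures into the formula, \(n=\frac{(1.96)^{2}\times 0.0339\times (1-0.0339)}{(0.05)^{2}}\approx 50.3\), which rounds up to a minimum sample size of 51.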
A complete enumeration of the study population was employed: all accessible medical records (electronic and manual) of T2DM in-patients aged 18 years and above who accessed health care at the HTH from January 2017 to November 2022 were included, resulting in a total of 328 records (241 electronic and 87 manual) used for the study.
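The sample-size calculation can be checked numerically. The following is a minimal sketch, assuming the standard single-proportion form of Cochran's formula and the inputs stated above (p = 3.39%, e = 5%, Z = 1.96); the function name `cochran_sample_size` is ours and not part of any library.

```python
import math


def cochran_sample_size(p: float, e: float, z: float = 1.96) -> float:
    """Cochran's formula for a single proportion: n = z^2 * p * (1 - p) / e^2."""
    return (z ** 2) * p * (1 - p) / (e ** 2)


# Inputs quoted in the text: national mortality 3.39%, margin of error 5%.
n_raw = cochran_sample_size(p=0.0339, e=0.05)
n_min = math.ceil(n_raw)  # round up to a whole number of records
print(f"n = {n_raw:.2f}, rounded up to {n_min}")
```

Under these assumptions the minimum is far below the 328 records obtained by complete enumeration.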
Study variables
The study’s dependent variable was the mortality outcome (categorized as alive or dead) among patients with T2DM. Independent variables included sociodemographic characteristics such as sex, age, marital status, place of residence, occupation, and educational level, alongside family medical history variables like a family history of diabetes, cardiovascular disease (CVD), or asthma. Lifestyle factors, including smoking and alcohol intake, comorbidities such as skin infections, sickle cell disease, urinary tract infections (UTIs), and pneumonia were also considered, as well as complications associated with T2DM, including cardiovascular disease (CVD), nephropathy, neuropathy, foot ulcers, and diabetic ketoacidosis (DKA).
Data retrieval and management
Data were retrieved from the electronic and manual patient folders using Microsoft (MS) Excel version 2016. A data extraction sheet was used to capture data on sociodemographic characteristics (age, sex, marital status, educational level, occupation and location), lifestyle variables (smoking and alcohol intake), family history of diabetes, cardiovascular disease (CVD), or asthma, and diabetic complications (CVD, nephropathy, neuropathy, foot ulcers, and DKA). The collated data were coded and cleaned in MS Excel and password protected. Mortality outcome was coded 1 and 0 (1 = Dead and 0 = Alive).
Data analysis
Data extracted were entered into MS Excel version 2016 and analyzed using Stata version 16.0 and the Python 3.6.1 programming language. Quantitative variables were presented as Mean ± SD. Frequencies and percentages were used to summarize categorical variables. To assess the associations of categorical independent variables with the outcome, the Chi-square test was used. Univariable logistic regression was also used to obtain the crude strength of association between mortality and the independent variables. Multiple logistic regression was used to obtain the adjusted odds ratios. A p-value below 0.05 was considered statistically significant. The predictive models were evaluated using the area under the ROC curve. Multicollinearity was tested using the generalized variance inflation factor for logistic regression models. The best-performing regression model was evaluated using the scikit-learn ML module in Python. The dataset was divided into test (70%) and train (30%). Four classifiers (decision tree, k nearest neighbour (kNN), Support Vector Machine (SVM) and eXtreme Gradient Boosting (XGBoost)) were used as machine learners for the classification. To improve the interpretability of our ML model, we utilized SHapley Additive exPlanations (SHAP), a recommended method for enhancing transparency in ML techniques [47].
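As a rough illustration of the partitioning step, the sketch below splits record indices into two parts while preserving the proportion of deaths in each. Stratification, the random seed, the 30% fraction and the helper name `stratified_split` are assumptions made for this example; the text does not state how the split was configured, and the labels are reconstructed only from the reported totals (37 deaths among 328 records).

```python
import random
from collections import defaultdict


def stratified_split(labels, holdout_fraction, seed=0):
    """Return (kept_idx, holdout_idx), drawing holdout_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    kept, holdout = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_hold = round(len(idxs) * holdout_fraction)
        holdout.extend(idxs[:n_hold])
        kept.extend(idxs[n_hold:])
    return sorted(kept), sorted(holdout)


# 1 = dead, 0 = alive; counts follow the reported 11.28% mortality (37 of 328).
labels = [1] * 37 + [0] * 291
part_a, part_b = stratified_split(labels, holdout_fraction=0.30, seed=42)
deaths_in_b = sum(labels[i] for i in part_b)
print(len(part_a), len(part_b), deaths_in_b)
```

Keeping the class proportions equal in both parts matters when the outcome is rare, since an unstratified split could leave very few deaths in one partition.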
Machine learning techniques
Machine learning, a subset of artificial intelligence, focuses on developing computer systems that can identify patterns in training data to execute classification and prediction tasks on new, unseen data [23]. This field integrates tools from statistics, data mining, and optimization to generate predictive models.
In this study, four ML techniques were employed:
1. Decision Tree (DT): It is a type of supervised ML algorithm that uses a tree-like structure to model decisions based on input features. The tree is constructed by iteratively selecting thresholds or splitting criteria for the input features that best separate the data into distinct classes or values. Each internal node represents a decision point based on a specific feature threshold, while the branches correspond to the possible outcomes of that decision. The terminal nodes, or leaves, indicate the predicted target class or value [24].
2. k nearest neighbour (kNN): It is a supervised, non-parametric algorithm based on the “things that look alike” idea and is used for classification and regression tasks [25]. It predicts outcomes based on the k nearest data points in the feature space, measured using distance metrics such as Euclidean, Manhattan, or Minkowski distance. For classification, it assigns the most common class among the neighbors, while for regression, it averages their values [25].
3. Support Vector Machine (SVM): It is a non-parametric ML algorithm that uses linear and non-linear functions to map input feature vectors into a higher-dimensional feature space. By doing so, SVM identifies an optimal hyperplane or decision boundary that separates classes or predicts values with maximum margin, ensuring robust generalization to new data [25].
4. eXtreme Gradient Boosting (XGBoost): It is an open-source ML algorithm that uses gradient boosted decision trees for supervised learning tasks. It was developed by Tianqi Chen at the University of Washington [26].
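To make the kNN description concrete, the following is a minimal pure-Python sketch of classification by majority vote among the k nearest points. It is not the scikit-learn implementation used in the study; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def knn_predict(train_X, train_y, query, k=3, p=2):
    """Classify `query` by majority vote among its k nearest training points."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: minkowski(pair[0], query, p),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up binary features (not study data).
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1)]
train_y = [0, 0, 0, 1, 1]
pred_a = knn_predict(train_X, train_y, query=(1, 1), k=3)
pred_b = knn_predict(train_X, train_y, query=(0, 0), k=3)
print(pred_a, pred_b)
```

The choice of k and of the distance exponent p are the main tuning decisions; the values used in the study are not reported in this section.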
Ethical consideration
This study’s protocol obtained ethical approval from the Research and Ethics Committee of the Ho Teaching Hospital, identified by Protocol ID No: HTH-REC [20] FC_2022. Additionally, permission was granted by the facility’s records department prior to commencing data collection.
Results
The study recorded a mortality of 11.28% among in-patients diagnosed with T2DM at the Ho Teaching Hospital from 2017 to 2022 (Fig. 1).
Table 1 shows the sociodemographic characteristics of T2DM patients seeking health care at the Ho Teaching Hospital. The study observed a female preponderance of 183 (55.79%), and the average age of participants was 58.61 ± 14.64 years. More than half were married, 221 (67.38%), located in urban settlements, 187 (57.01%), and formally employed, 174 (53.05%). One hundred and thirty-two of the T2DM patients, representing 40.24% (95% CI: 35.05–45.67%), had primary education, while a minority, 45 (13.72%), did not have any form of formal education. The patients also had a family history of diabetes, 54 (16.48%); cardiovascular diseases, 67 (20.43%); asthma, 5 (1.52%); both cardiovascular diseases and diabetes, 34 (10.37%); and cardiovascular diseases, diabetes and asthma, 1 (0.35%). The lifestyle characteristics of the study subjects also showed that 65 (19.82%) were current alcohol users and 20 (6.1%) were current smokers.
Table 2 shows the chi-square test of association and binary logistic regression of sociodemographic factors and family history of disease with mortality outcome. None of the sociodemographic factors and family history of disease were significantly associated with mortality.
The chi-square test of association and univariate binary logistic regression of T2DM complications and comorbidities with mortality outcome showed that nephropathy, as a complication, and sepsis were significantly associated with diabetic mortality. A higher proportion of patients with nephropathy died (25%) than of those without nephropathy (9.15%) (p-value = 0.002). The crude odds ratio shows that those with nephropathy had about three times the odds of mortality compared to those without [cOR = 3.31, 95% CI: (1.50–7.31); p-value = 0.003]. The only two patients who presented with sepsis in the study died (p-value = 0.012) [Table 3].
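The crude odds ratio and chi-square p-value for nephropathy can be approximately reproduced from a 2×2 table. The sketch below is an illustration under stated assumptions: the cell counts are not quoted from Table 3 but reconstructed from the reported percentages (44 patients with nephropathy, of whom 25% died, giving 11 deaths; 284 without, of whom 9.15% died, giving 26 deaths), and the interval uses the Woolf log-based method, which may differ slightly from the Stata output.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a, b = exposed dead / alive; c, d = unexposed dead / alive.
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for 1 df."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p


# Reconstructed counts (assumption): nephropathy 11 dead / 33 alive,
# no nephropathy 26 dead / 258 alive.
crude_or, ci_low, ci_high = odds_ratio_ci(11, 33, 26, 258)
chi2_stat, p_value = chi2_2x2(11, 33, 26, 258)
print(f"cOR = {crude_or:.2f} ({ci_low:.2f}-{ci_high:.2f}), p = {p_value:.3f}")
```

With these reconstructed counts the sketch agrees with the reported cOR of 3.31 (1.50–7.31) and the chi-square p-value of 0.002 to the precision shown.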
After adjusting for sociodemographic and all other factors, the only statistically significant factor contributing to mortality among the T2DM in-patients seeking health care at the Ho Teaching Hospital was nephropathy. T2DM in-patients with nephropathy had 3.83-fold odds of death [95% CI: (1.53–9.61)] compared to T2DM in-patients without nephropathy. This was statistically significant at a p-value of 0.004 (Table 4).
As seen in Fig. 2, the area under the ROC curve was highest for Model 3 (ROC = 72.97%) among the three models (Model 1 = 67.03% and Model 2 = 67.85%), making it the preferred model and indicating a good predictive ability of the fitted model to predict mortality among T2DM patients.
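For readers unfamiliar with the statistic, the area under the ROC curve equals the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison; the scores are invented for illustration and are not the study's predictions.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = dead) and predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.80, 0.35, 0.20, 0.60, 0.30, 0.20, 0.10, 0.05]
auc = roc_auc(y_true, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the 67% to 73% values reported for the three models.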
Examination of the GVIF values for the preferred model (model 3) showed that all the GVIF values were far less than the cut-off value of 10 and the mean GVIF was 1.23 which is less than 6, indicating no multicollinearity in the model (Table 5).
The performance of logistic regression, decision tree, kNN, SVM and XGBoost was evaluated using the best-performing predictive model. The accuracies on the test and train datasets, respectively, were 90% and 90% for logistic regression, 100% and 100% for the decision tree classifier, 90% and 90% for the kNN classifier, 90% and 88% for SVM, and 88% and 90% for XGBoost. Additionally, the precision, recall and F1 score for the decision tree were almost 1, making the decision tree the best classifier (Table 6).
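The metrics named above can all be derived from the four cells of a confusion matrix. The sketch below is illustrative only: both sets of predictions are hypothetical, and only the class totals (37 deaths among 328 records) come from the study. It shows how accuracy, precision, recall and F1 can diverge when the outcome is uncommon.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary outcome."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1] * 37 + [0] * 291  # 1 = dead, 0 = alive

# Hypothetical classifier A: predicts "alive" for every record.
all_alive = classification_metrics(y_true, [0] * 328)

# Hypothetical classifier B: 20 deaths caught, 17 missed, 10 false alarms.
y_pred_b = [1] * 20 + [0] * 17 + [1] * 10 + [0] * 281
mixed = classification_metrics(y_true, y_pred_b)

print(all_alive)
print(mixed)
```

Because accuracy is dominated by the majority class, precision, recall and F1 for the mortality class carry most of the information when classifiers are compared.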
Fig. 3 reveals that nephropathy is the most influential feature in predicting mortality, with the presence of kidney disease strongly increasing the predicted odds of mortality.
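The idea behind the Shapley values that SHAP reports can be sketched with a tiny example: each feature's attribution is its average marginal contribution to the prediction over all orderings in which features are switched from a baseline to their observed values. The code below computes exact values by enumeration and is not the SHAP library, which uses faster estimators; the toy risk score, its weights and the feature names are entirely hypothetical.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all
    feature orderings; features not yet added stay at their baseline value."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]
            new = model(current)
            phi[i] += new - prev
            prev = new
    return [v / len(orderings) for v in phi]


def toy_model(features):
    """Made-up risk score with one interaction term (not the fitted model)."""
    nephropathy, sepsis, smoker = features
    return (0.10 + 0.30 * nephropathy + 0.50 * sepsis
            + 0.05 * smoker + 0.10 * nephropathy * smoker)


x = [1, 0, 1]          # hypothetical patient: nephropathy, no sepsis, smoker
baseline = [0, 0, 0]   # reference patient with none of the features
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the patient and for the baseline, which is the property that makes such plots readable as a decomposition of a single prediction.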
Discussion
The study aimed to identify predictors of mortality and assess the performance of four ML techniques in predicting mortality among T2DM in-patients in a tertiary hospital in Ghana. The findings revealed an overall T2DM mortality of 11.28% (37 deaths). Notably, all patients diagnosed with sepsis died, and nephropathy emerged as an independent predictor of mortality within this population. The decision tree classifier provided the best prediction potential relative to k nearest neighbor, support vector machine and eXtreme Gradient Boosting.
The proportion of mortality observed among T2DM patients in this study, while concerning, may reflect a complex interplay of factors within the healthcare system and patient behaviors. On one hand, it could indicate a relatively effective healthcare delivery system in managing T2DM patients, particularly considering their advanced age and the inherently higher mortality risk associated with the condition [11, 27]. Conversely, it might suggest systemic issues such as delayed diagnosis and management, potentially stemming from suboptimal health-seeking behaviors influenced by various factors including demographic characteristics, economic constraints, limited healthcare access, and sociocultural norms [28,29,30]. In relation to our study findings, previous research has highlighted significant variations in mortality rates among patients with T2DM across different countries. For example, Denmark reported a notably higher T2DM mortality percentage of 68% [31] whereas the United Kingdom (UK) reported a substantially lower percentage of 1.93% [32]. This variation in the study outcomes may be due to differences in geographical location, healthcare system, study design or sample size. While our study was a cross-sectional study using secondary data with a sample size of 328; the studies in Denmark [31] and UK [32] were cohort studies that employed sample sizes of 283 and 44,230 T2DM records respectively.
Consistent with accumulated evidence linking diabetic complications and comorbidities with mortality [33,34,35], this study found nephropathy and sepsis to be significantly associated with T2DM mortality. All T2DM patients who presented with sepsis during the study period died. A retrospective cohort study by Hsieh et al. [36] also highlighted the influence of sepsis on mortality among T2DM patients. A plausible explanation for this observation in our study could be that individuals with T2DM face an increased risk of succumbing to infectious diseases [37]. This heightened susceptibility is linked to weakened immunity resulting from prolonged poor glycemic control [38]. Consequently, when sepsis occurs in T2DM patients, it can lead to a fatal outcome, as was seen in a study done in Lithuania [39]. The authors found the highest cause-specific standardized mortality ratio (SMR) to be for infection-related causes (SMR = 1.44), particularly septicemia (SMR = 1.78) [39].
Another key finding predicting mortality in this study is nephropathy. Nephropathy was reported among 44 (13.41%) of the T2DM patients. Twenty-five percent of the T2DM patients with nephropathy died, compared to 9.15% of those without nephropathy. After adjusting for all factors that could potentially confound this association, T2DM patients with nephropathy had 3.83-fold odds of death [95% CI: (1.53–9.61)] compared to T2DM patients without nephropathy. This finding is similar to that of González-Pérez et al. [35], who reported that every year, one out of every 20 T2DM patients with Diabetic Kidney Disease (DKD) died. Afkarian et al. [40] also found an absolute mortality risk difference with the reference group of 23.4% after adjusting for demographics among diabetics with kidney disease. Considering the adverse mortality outcome associated with nephropathy, screening for kidney disease among T2DM patients would be helpful in the management of T2DM.
Recognizing the variability in mortality risk among T2DM patients, future research may focus on personalized medicine approaches that tailor treatment strategies to individual patient characteristics, including age, sex, family history, lifestyle, presence of complications, and socioeconomic status. Also, T2DM patients with nephropathy and sepsis would benefit from health education and support to empower them to actively participate in their care and manage their condition.
A review of existing literature identified sociodemographic characteristics, family history, lifestyle variables and complications of T2DM as independent predictors of mortality among T2DM patients [16,17,18,19]. Three models were evaluated: Model 1 used the sociodemographic and family history variables, Model 2 used the sociodemographic, family history and lifestyle variables, and Model 3 used all variables. The findings of this study demonstrate that the area under the ROC curve was highest for Model 3 (ROC = 72.97%) among the three models (Model 1 = 67.03% and Model 2 = 67.85%), making it the preferred model and indicating a good predictive ability of the fitted model to predict mortality among T2DM patients. This implies that having a holistic medical history of T2DM patients would help in making better predictions of their health outcomes, especially mortality. This finding is consistent with the study by Lee et al. [41], in which a multiparametric model that combined different variables of T2DM patients predicted all-cause mortality more accurately.
The current study presents an application of ML techniques for predicting mortality among T2DM patients. This approach aligns with recent trends in healthcare analytics, where ML models have demonstrated superior predictive accuracy compared to traditional statistical methods [41]. This trend is evident across various healthcare domains, including coronary artery disease risk prediction using XGBoost [42], T2DM prediction employing SVM in the studies by Yue et al. [43] and Georga et al. [44], and mortality prediction in T2DM and hypertensive patients using ML techniques [45, 46]. Among the ML algorithms tested in our study, the Decision Tree classifier demonstrated the highest predictive potential, outperforming k-Nearest Neighbors (kNN), SVM, logistic regression, and XGBoost for this specific population. To enhance the interpretability of our ML model, we employed SHapley Additive exPlanations (SHAP), an approach recommended for improving transparency in ML techniques [47]. The SHAP analysis consistently identified nephropathy as a significant predictor of increased odds of mortality among T2DM patients. These findings underscore the potential of ML techniques in healthcare predictive analytics, offering improved accuracy and interpretability. By leveraging such advanced analytical tools, healthcare providers can develop more targeted interventions and personalized care strategies, ultimately improving patient outcomes and resource allocation in T2DM management.
While this study provides valuable insights, it is essential to acknowledge its limitations. Due to the retrospective design of the study, certain variables, including dietary habits and exercise patterns, which could have yielded valuable lifestyle insights, were not available. Additionally, it is essential to recognize that retrospective studies have limitations in establishing causality, primarily because they cannot accurately determine the temporal sequence of events. Consequently, care should be taken in generalizing this finding to different time periods, particularly if there have been shifts in exposures, outcomes, or other pertinent factors over time. Furthermore, it would have been helpful to include data from patients about their current medication for managing other comorbidities present. Finally, the current study was based on a single-site analysis; thus, findings may not be applicable to the whole country.
Conclusion
This study found nephropathy to be the significant predictor of T2DM in-patient mortality. Also, all patients who had sepsis during the period under study died. For better prediction of mortality outcome, a holistic assessment of sociodemographic characteristics, family history, lifestyle variables and complications of T2DM is required. The decision tree classifier provided the best classifying potential. Medics and researchers could use this predictive model in the long term to improve the overall mortality outcome of T2DM patients.
Data availability
The data used to support the findings of this study are available upon request to the corresponding author.
Abbreviations
- ADA: American Diabetes Association
- AI: Artificial Intelligence
- CDC: Centers for Disease Control and Prevention
- CKD: Chronic Kidney Disease
- CVD: Cardiovascular Disease
- DR: Diabetic Retinopathy
- DT: Decision Tree
- IDF: International Diabetes Federation
- QPSO: Quantum Particle Swarm Optimization
- RF: Random Forest
- T2DM: Type 2 Diabetes Mellitus
- WHO: World Health Organization
- SVM: Support Vector Machine
References
Kabir A, Karim MN, Islam RM, Romero L, Billah B. Health system readiness for non-communicable diseases at the primary care level: a systematic review. BMJ Open. 2022;12(2):e060387.
Khan MAB, Hashim MJ, King JK, Govender RD, Mustafa H, Al Kaabi J. Epidemiology of type 2 diabetes - global burden of Disease and Forecasted trends. J Epidemiol Glob Health. 2020;10(1):107–11.
Ayah R, Joshi MD, Wanjiru R, Njau EK, Otieno CF, Njeru EK, et al. A population-based survey of prevalence of diabetes and correlates in an urban slum community in Nairobi, Kenya. BMC Public Health. 2013;13(1):371.
Gatimu SM, Milimo BW, Sebastian MS. Prevalence and determinants of diabetes among older adults in Ghana. BMC Public Health. 2016;16(1):1174.
Al Amiri E, Abdullatif M, Abdulle A, Al Bitar N, Afandi EZ, Parish M, et al. The prevalence, risk factors, and screening measure for prediabetes and diabetes among Emirati overweight/obese children and adolescents. BMC Public Health. 2015;15:1298.
de-Graft Aikins A, Addo J, Ofei F, Bosu W, Agyemang C. Ghana’s burden of chronic non-communicable diseases: future directions in research, practice and policy. Ghana Med J. 2012;46(2 Suppl):1–3.
McEwen LN, Karter AJ, Waitzfelder BE, Crosson JC, Marrero DG, Mangione CM, et al. Predictors of mortality over 8 years in type 2 diabetic patients: translating Research Into Action for Diabetes (TRIAD). Diabetes Care. 2012;35(6):1301–9.
Mayyas FA, Ibrahim KS. Predictors of mortality among patients with type 2 diabetes in Jordan. BMC Endocr Disorders. 2021;21(1):200.
Zhou JJ, Koska J, Bahn G, Reaven P. Glycaemic variation is a predictor of all-cause mortality in the veteran affairs Diabetes Trial. Diab Vasc Dis Res. 2019;16(2):178–85.
Adeloye D, Ige JO, Aderemi AV, Adeleye N, Amoo EO, Auta A, et al. Estimating the prevalence, hospitalisation and mortality from type 2 diabetes mellitus in Nigeria: a systematic review and meta-analysis. BMJ Open. 2017;7(5):e015424.
Sarfo-Kantanka O, Sarfo FS, Oparebea Ansah E, Eghan B, Ayisi-Boateng NK, Acheamfour-Akowuah E. Secular trends in admissions and Mortality Rates from Diabetes Mellitus in the Central Belt of Ghana: a 31-Year review. PLoS ONE. 2016;11(11):e0165905.
HMHD. Report on T2DM cases 2021. 2021.
Osei-Yeboah J, Owiredu W, Norgbe G, Obirikorang C, Lokpo S, Ashigbi E, et al. Physical activity Pattern and its association with glycaemic and blood pressure control among people living with diabetes (PLWD) in the Ho Municipality, Ghana. Ethiop J Health Sci. 2019;29(1):819–30.
Osei-Yeboah J, Lokpo SY, Owiredu WK, Johnson BB, Orish VN, Botchway F et al. Medication adherence and its association with Glycaemic control, blood pressure control, glycosuria and proteinuria among people living with diabetes (PLWD) in the ho municipality, Ghana. The Open Public Health Journal. 2018;11(1).
Lokpo SY, Owiredu WK, Agordoh P, Agboli E, Amoo LNA, Noagbe M, et al. Cardio-Metabolic Risk Profile of a Diabetic Population in the Ho Municipality. Asian J Res Rep Endocrinol. 2018;1(1):10–20.
Cusick M, Meleth AD, Agrón E, Fisher MR, Reed GF, Knatterud GL, et al. Associations of Mortality and Diabetes complications in patients with type 1 and type 2 diabetes: Early Treatment Diabetic Retinopathy Study report 27. Diabetes Care. 2005;28(3):617–25.
Huang D, He D, Gong L, Wang W, Yang L, Zhang Z, et al. Clinical characteristics and risk factors associated with mortality in patients with severe community-acquired pneumonia and type 2 diabetes mellitus. Crit Care. 2021;25(1):419.
Laurberg T, Witte DR, Gudbjörnsdottir S, Eliasson B, Bjerg L. Diabetes-related risk factors and survival among individuals with type 2 diabetes and breast, lung, colorectal, or prostate cancer. Sci Rep. 2024;14(1):10956.
Katsiki N, Banach M, Mikhailidis DP. Is type 2 diabetes mellitus a coronary heart disease equivalent or not? Do not just enjoy the debate and forget the patient! Arch Med Sci. 2019;15(6):1357–64.
Riihimaa P. Impact of machine learning and feature selection on type 2 diabetes risk prediction. J Med Artif Intell. 2020;3.
Ho Teaching Hospital. Ho Teaching Hospital [Internet]. 2022. Available from: https://www.hth.gov.gh/
Cochran WG. Sampling techniques. 3rd ed. New York: Wiley; 1977.
Zhang A, Lipton ZC, Li M, Smola AJ. Dive into deep learning. Cambridge University Press; 2023.
Maniruzzaman M, Kumar N, Menhazul Abedin M, Shaykhul Islam M, Suri HS, El-Baz AS, et al. Comparative approaches for classification of diabetes mellitus data: machine learning paradigm. Comput Methods Programs Biomed. 2017;152:23–34.
Muhammad LJ, Algehyne EA, Usman SS. Predictive supervised machine learning models for diabetes mellitus. SN Comput Sci. 2020;1(5):240.
Chen T, He T, Benesty M, Khotilovich V. Package ‘xgboost’. R version. 2019;90(1–66):40.
Rao Kondapally Seshasai S, Kaptoge S, Thompson A, Di Angelantonio E, Gao P, Sarwar N, et al. Diabetes mellitus, fasting glucose, and risk of cause-specific death. N Engl J Med. 2011;364(9):829–41.
Asante V, Gariba BB, Appiah-Brempong E, Sarpong HL. Factors influencing the health seeking behaviour of persons who have diabetes in the Kumasi Metropolis. Ghana J Sci. 2015;64:25.
Amponsah Kodom M. Health-seeking behavior of Diabetic and Hypertensive patients in Rural communities of Ghana. AJHES. 2022;1(2):1–16.
Korsah KA, Mensah GP, Achempim-Ansong G. The influence of Social meanings on Treatment seeking behaviours of patients with type 2 diabetes Mellitus: a qualitative Enquiry in a Ghanaian Hospital. J Med. 2022;3(6):1052.
Reinhard H, Lajer M, Gall MA, Tarnow L, Parving HH, Rasmussen LM, et al. Osteoprotegerin and mortality in type 2 diabetic patients. Diabetes Care. 2010;33(12):2561–6.
Mulnier HE, Seaman HE, Raleigh VS, Soedamah-Muthu SS, Colhoun HM, Lawrenson RA. Mortality in people with type 2 diabetes in the UK. Diabet Med. 2006;23(5):516–21.
Ang YG, Heng BH, Saxena N, Liew STA, Chong PN. Annual all-cause mortality rate for patients with diabetic kidney disease in Singapore. J Clin Transl Endocrinol. 2016;4:1–6.
Yoo H, Choo E, Lee S. Study of hospitalization and mortality in Korean diabetic patients using the diabetes complications severity index. BMC Endocr Disord. 2020;20(1):122.
González-Pérez A, Saez M, Vizcaya D, Lind M, Garcia Rodriguez L. Incidence and risk factors for mortality and end-stage renal disease in people with type 2 diabetes and diabetic kidney disease: a population-based cohort study in the UK. BMJ Open Diabetes Res Care. 2021;9(1).
Hsieh MS, Hu SY, How CK, Seak CJ, Hsieh VCR, Lin JW, et al. Hospital outcomes and cumulative burden from complications in type 2 diabetic sepsis patients: a cohort study using administrative and hospital-based databases. Ther Adv Endocrinol Metab. 2019;10:2042018819875406.
Magliano DJ, Harding JL, Cohen K, Huxley RR, Davis WA, Shaw JE. Excess risk of dying from infectious causes in those with type 1 and type 2 diabetes. Diabetes Care. 2015;38(7):1274–80.
Kvistholm Jensen A, Nielsen EM, Björkman JT, Jensen T, Müller L, Persson S, et al. Whole-genome sequencing used to investigate a nationwide outbreak of listeriosis caused by ready-to-eat delicatessen meat, Denmark, 2014. Clin Infect Dis. 2016;63(1):64–70.
Linkeviciute-Ulinskiene D, Kaceniene A, Dulskas A, Patasius A, Zabuliene L, Smailyte G. Increased mortality risk in people with type 2 diabetes mellitus in Lithuania. Int J Environ Res Public Health. 2020;17:18.
Afkarian M, Sachs MC, Kestenbaum B, Hirsch IB, Tuttle KR, Himmelfarb J, et al. Kidney disease and increased mortality risk in type 2 diabetes. J Am Soc Nephrol. 2013;24(2):302–8.
Lee S, Zhou J, Leung KSK, Wu WKK, Wong WT, Liu T, et al. Development of a predictive risk model for all-cause mortality in patients with diabetes in Hong Kong. BMJ Open Diabetes Res Care. 2021;9(1).
Huang AA, Huang SY. Use of machine learning to identify risk factors for coronary artery disease. PLoS ONE. 2023;18(4):e0284103.
Yue C, Xin L, Kewen X, Chang S. An intelligent diagnosis to type 2 diabetes based on QPSO algorithm and WLS-SVM. 2008 International Symposium on Intelligent Information Technology Application Workshops. 2008;117–21.
Georga EI, Protopappas VC, Ardigo D, Marina M, Zavaroni I, Polyzos D, et al. Multivariate prediction of subcutaneous glucose concentration in type 1 diabetes patients based on support vector regression. IEEE J Biomed Health Inf. 2013;17(1):71–81.
Chiu SYH, Chen YI, Lu JR, Ng SC, Chen CH. Developing a prediction model for 7-Year and 10-Year all-cause mortality risk in type 2 diabetes using a hospital-based prospective cohort study. J Clin Med. 2021;10(20).
Barsasella D, Gupta S, Malwade S, Aminin, Susanti Y, Tirmadi B, et al. Predicting length of stay and mortality among hospitalized patients with type 2 diabetes mellitus and hypertension. Int J Med Inf. 2021;154:104569.
Huang AA, Huang SY. Increasing transparency in machine learning through bootstrap simulation and shapely additive explanations. PLoS ONE. 2023;18(2):e0281922.
Acknowledgements
The authors wish to acknowledge the staff of Ho Teaching Hospital, especially Mr. Clement Dason and Mr. Benjamin Amedume for their efforts and assistance during data retrieval.
Funding
No funding was received for this study.
Author information
Contributions
Under the supervision of SADO, GEK conceptualized and developed the study and gathered and analyzed the data. GEK wrote the first draft of the manuscript with assistance from SLY. SLY and SADO provided high-level review of the manuscript. All authors approved the final paper.
Ethics declarations
Ethics approval and consent to participate
The study protocol received ethical approval from the Research and Ethics Committee of the Ho Teaching Hospital (Protocol ID No: HTH-REC [20] FC_2022). The Institutional Research and Ethics Committee at Ho Teaching Hospital waived the need for informed consent because the study was retrospective. To ensure confidentiality, all gathered data were coded and securely stored in a separate room before computer entry. Identifiers such as names and unique card numbers were omitted from the data collection process. Patient information taken from cards was treated with the utmost confidentiality. Permission was also granted by the facility’s records department before data collection began. All research procedures were performed in accordance with relevant guidelines and regulations.
Consent for publication
Not Applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kpene, G.E., Lokpo, S.Y. & Darfour-Oduro, S.A. Predictive models and determinants of mortality among T2DM patients in a tertiary hospital in Ghana, how do machine learning techniques perform? BMC Endocr Disord 25, 9 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12902-025-01831-5
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12902-025-01831-5