Development and validation of a nomogram for screening patients with type 2 diabetic ketoacidosis

Abstract

Objective and background

The early detection of diabetic ketoacidosis (DKA) in patients with type 2 diabetes (T2D) plays a crucial role in enhancing outcomes. We developed a nomogram prediction model for screening DKA in T2D patients. At the same time, the input variables were adjusted to reduce misdiagnosis.

Methods

We obtained data on T2D patients from the MIMIC-IV v0.4 and MIMIC-III v1.4 databases. A nomogram model was developed using the training data set, internally validated, subjected to sensitivity analysis, and then externally validated with data from T2D patients at Aviation General Hospital.

Results

Based on the established model, we analyzed 1885 T2D patients, of whom 614 had DKA. We identified risk factors for DKA based on literature reports and multivariate analysis, and selected age, glucose, chloride, calcium, and urea nitrogen as predictors in our model. The logistic regression model demonstrated an area under the curve (AUC) of 0.86 (95% CI: 0.85–0.90). To validate the model, we collected data from 91 T2D patients, including 15 with DKA, at our hospital. The external validation of the model yielded an AUC of 0.68 (95% CI: 0.67–0.70). The calibration plot confirmed that our model was adequate for predicting patients with DKA. The decision-curve analysis revealed that our model offered net benefits for clinical use.

Conclusions

Our model offers a convenient and accurate tool for predicting whether DKA is present. Excluding input variables that may hinder patient compliance increases the practical value of our model.

Introduction

Diabetes is a metabolic disorder influenced by both environmental and genetic factors and related to insulin insensitivity and deficiency, as well as impaired biological function. With the global population growing and aging, the prevalence of diabetes in adults increased nearly fourfold from 1980 to 2014 [1]. Diabetic ketoacidosis (DKA) is a life-threatening hyperglycemic emergency in diabetes. Although DKA is predominantly associated with type 1 diabetes, it also occurs, at a lower rate, in patients with type 2 diabetes (T2D) [2]. As the population of T2D patients increases, the absolute number of patients with DKA also increases. Thus, early detection of DKA in T2D patients is crucial for improving patient prognosis and reducing medical costs.

Although the guidelines for adult ketoacidosis make early diagnosis possible, diagnosis and treatment are still delayed in some patients. Several factors may be responsible. First, infection is a common cause of DKA in T2D [3], and patients often visit non-endocrine departments because of the infection. Ketone body and blood gas tests may be overlooked there, leading to a delay in diagnosis and treatment [4]. Second, sodium-glucose co-transporter-2 inhibitors can lead to DKA with normal blood glucose, which makes the diagnosis more difficult [5]. When physicians do not consider DKA, ketone body and blood gas tests are omitted, leading to misdiagnosis. Third, a proportion of DKA patients are newly diagnosed with diabetes, and doctors may perform only routine examinations at the time of initial diagnosis, which also makes DKA harder to identify [6]. Fourth, untimely or inaccurate ketone body and blood gas testing is a further reason for the delayed diagnosis of DKA [7, 8].

It is of significant clinical importance to utilize effective predictive models for early diagnosis of DKA and prevent misdiagnosis attributed to the aforementioned factors. Currently, no studies have reported a predictive model specifically designed for screening DKA in T2D patients. In our study, we have developed novel nomograms for identifying DKA in individuals with T2D. Additionally, considering the potential causes of delayed diagnosis and treatment mentioned above, we have refined the model's clinical applicability by excluding indicators related to blood and urine ketone bodies as well as blood gas analysis from the input variables, focusing solely on commonly employed outpatient examinations during model establishment. This approach has yielded satisfactory outcomes.

Methods

Data source

The data for this study were collected from the MIMIC-III and MIMIC-IV databases of the Medical Information Mart for Intensive Care. The former is an integrated, de-identified, comprehensive clinical dataset of all patients who were admitted to the ICU of Beth Israel Deaconess Medical Center in Boston, MA, from June 1, 2001, to October 31, 2012, including 53,423 distinct hospital admissions for adult patients (aged > 16 years). The latter is a publicly available real-world clinical database maintained by Beth Israel Deaconess Medical Center from 2008 to 2019, including over 200,000 admissions to the emergency department and over 60,000 ICU stays. In addition, 91 hospitalized patients and outpatients with T2D at Aviation General Hospital from August 2015 to August 2021 were included. This retrospective study was conducted in compliance with the Declaration of Helsinki and approved by the Ethics Committee of Aviation General Hospital. Because the data were de-identified, informed consent was not required. The study followed the recommendations of the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement [9]. Su Bo, the author, was granted access to the database to collect data for research purposes (certification number: 10221423).

Study population

Our study included T2D patients with and without DKA who met the inclusion criteria: (i) the diagnosis of T2D met the diagnostic criteria set by the American Diabetes Association in 2014 [10, 11]; (ii) the diagnosis of DKA in adults met the diagnostic criteria set by the Joint British Diabetes Society for Inpatient Care [12]; and (iii) only data from the first admission were analyzed for patients who were hospitalized multiple times. Because missing data are common in the MIMIC-IV and MIMIC-III databases, we first removed the covariables with the most missing values. We further excluded non-first-visit patients, which resulted in the deletion of 5718 patients in the MIMIC database and 12 in our database. We then removed cases in which more than 20% of the observations were missing, which led to the removal of 9804 patients in the MIMIC database and 39 patients in our database (Fig. 1). As this was a hypothesis-generating epidemiological study, we included all eligible patients in the database without estimating the sample size, to maximize the statistical power of the predictive model. We used multiple imputation for the remaining missing data. The outcome indicator of the model was the diagnosis of ketoacidosis.
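The 20% missingness rule described above can be sketched as follows. This is a minimal illustration only: the field names, the helper functions, and the three patient records are made up for the example and are not taken from the study data or the authors' code.

```python
from typing import Optional

# A record maps a variable name to its value, or None when it is missing.
Record = dict[str, Optional[float]]

# Hypothetical field list, named after the predictors reported in the paper.
FIELDS = ["age", "glucose", "chloride", "calcium", "urea_nitrogen"]


def missing_fraction(record: Record, fields: list[str]) -> float:
    """Share of the listed fields that are missing in one record."""
    missing = sum(1 for name in fields if record.get(name) is None)
    return missing / len(fields)


def drop_sparse_records(
    records: list[Record], fields: list[str], max_missing: float = 0.20
) -> list[Record]:
    """Keep records whose missing share does not exceed max_missing."""
    return [r for r in records if missing_fraction(r, fields) <= max_missing]


# Made-up records: complete, one value missing (20%), three missing (60%).
patients: list[Record] = [
    {"age": 57, "glucose": 257, "chloride": 101, "calcium": 8.3, "urea_nitrogen": 22},
    {"age": 66, "glucose": None, "chloride": 103, "calcium": 8.6, "urea_nitrogen": 18},
    {"age": 71, "glucose": None, "chloride": None, "calcium": 8.9, "urea_nitrogen": None},
]

kept = drop_sparse_records(patients, FIELDS)
print(len(kept))  # the third record exceeds the 20% limit and is dropped
```

Records that pass this filter can still contain gaps; in the study those were handled by multiple imputation, which is not shown here.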

Fig. 1

Flow chart

Data collection

The raw data were retrieved using Navicat Premium (version 12.0.28) and pgAdmin PostgreSQL tools (version 1.22.1), with the keywords "type 2 diabetes mellitus," "ketoacidosis," and "diabetes, type 2 on discharge." Subsequently, the data forms were merged using Stata software, and demographic and laboratory data such as ethnicity, age, admission type, gender, serum creatinine, glucose, sodium, potassium, chloride, calcium, and urea nitrogen were extracted. To enhance the clinical application significance of the model, all laboratory test results were included (refer to Tables 1 and 2).

Table 1 Clinical and demographic characteristics of the non-ketoacidosis and ketoacidosis groups (MIMIC database)
Table 2 Clinical and demographic characteristics of the non-ketoacidosis and ketoacidosis groups (Aviation General Hospital)

Statistical analysis

We compared the clinical characteristics of the DKA and non-DKA groups using appropriate statistical tests: Student's t-test or the rank-sum test for continuous variables and Fisher's exact test for categorical variables. Input variables were then filtered in two steps. First, independent variables with p < 0.05 were screened by multivariable logistic regression, and maximum likelihood estimation was used for extrapolating new data. Second, we reviewed the literature on the screened variables and selected clinically meaningful ones as input variables. A parsimonious approach was used in constructing the original nomogram model for predicting the incidence of DKA in T2D patients using the training set, with the aim of including as few variables as possible for clinical practice [13]. Variables that are tested less often in the outpatient clinic or may decrease patient compliance, such as blood and urine ketone bodies, pH, and bicarbonate, were excluded. We evaluated the nomogram model's performance by calculating the area under the receiver operating characteristic (ROC) curve and validated the model using external data from our hospital. Additionally, we applied decision-curve analysis to examine the net benefits of the model.

Statistical analyses were conducted using R software (version 3.6.1, R Foundation for Statistical Computing, Vienna, Austria). R packages used: tidyverse, survival, rms, nomogramFormula, DynNom. To handle missing data, we used multiple imputation. A two-sided p-value of < 0.05 was set as the threshold for statistical significance.

Results

Participants

Tables 1 and 2 describe the differences in characteristics between the DKA and non-DKA groups. Values are given as non-DKA vs. DKA. Patients in the DKA group were younger at admission [66.70 (58–77) vs. 57.28 (47–66), p < 0.001]. Glucose, potassium, and serum creatinine levels were lower in the non-DKA group [134 (107–180) vs. 257 (166–370), p < 0.001; 4.1 (3.8–4.5) vs. 4.2 (3.8–4.8), p < 0.001; and 1 (0.8–1.7) vs. 1.1 (0.8–1.8), p = 0.037, respectively]. By contrast, the calcium level was higher in the non-DKA group [8.6 (8.2–9.1) vs. 8.3 (7.49–8.96), p < 0.001].

Logistic regression variable screening and nomogram development

The results of the logistic regression variable screening for DKA of T2D patients are presented in Table 3, which lists the risk factors related to morbidity. All statistically significant variables were found to be related to the incidence of DKA. Finally, we selected the age at admission in years, glucose, chloride, calcium, and urea nitrogen as the input variables to develop the nomogram model.
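For orientation, a logistic regression on the five selected inputs has the general form below. The coefficient symbols are placeholders introduced here for illustration; the fitted values are not reproduced in this text.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p}
  = \beta_0 + \beta_1\,\mathrm{age} + \beta_2\,\mathrm{glucose}
  + \beta_3\,\mathrm{chloride} + \beta_4\,\mathrm{calcium}
  + \beta_5\,\mathrm{urea\ nitrogen},
\qquad
p = \frac{1}{1 + e^{-\operatorname{logit}(p)}}
```

Here p is the predicted probability that DKA is present.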

Table 3 Multivariable logistic regression analysis (MIMIC database)

Performance of the nomogram model

Multivariate analysis was conducted to identify the variables that could predict DKA in T2D patients, resulting in the selection of five variables: age, glucose, chloride, calcium, and urea nitrogen. These variables were used to create an intuitive nomogram model (Fig. 2). The model's discrimination accuracy was evaluated using a C-index based on the area under the curve (AUC) of the ROC. The C-index threshold was set above 0.7, and the model reached 0.86 (95% CI: 0.836–0.885), indicating a high level of accuracy. An external validation cohort from our medical center was used to assess the model's feasibility in other populations, which resulted in an AUC of 0.69 (95% CI: 0.68–0.736) (Fig. 3). Furthermore, two calibration curves for the diagnosis of DKA in T2D patients were drawn to evaluate the model's utility (Fig. 4); they indicated favorable agreement in both the training cohort and the external cohort.
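For a binary outcome, the C-index equals the AUC: the probability that a randomly chosen DKA patient receives a higher predicted score than a randomly chosen non-DKA patient, with ties counted as one half. A minimal sketch of that calculation follows; the scores and labels are invented for illustration and are not study data.

```python
def auc(scores: list[float], labels: list[int]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    concordant = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                concordant += 1.0
            elif pos == neg:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Invented predicted probabilities; label 1 = DKA, label 0 = no DKA.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]

example_auc = auc(example_scores, example_labels)
print(round(example_auc, 3))  # 8 of the 9 pairs are ranked correctly
```

The pairwise loop is quadratic in the sample size, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.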

Fig. 2

Nomogram for predicting patients with type 2 diabetic ketoacidosis. When using it, draw a vertical line from each variable upward to the points and then record the corresponding points (“age = 50” = 20 points). The point of each variable was then summed up to obtain a total score that corresponds to a predicted probability at the bottom of the nomogram
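The point-summing procedure in the caption is a graphical way of evaluating a logistic regression. The sketch below shows one common way such a point scale can be built, assuming the predictor with the largest effect across its axis spans 0 to 100 points. The intercept, coefficients, and axis ranges are hypothetical values chosen for illustration; they are not the fitted model, so the points will not match Fig. 2 (for example, the caption's "age = 50" = 20 points).

```python
import math

# HYPOTHETICAL intercept, coefficients, and axis ranges (illustration only).
INTERCEPT = -1.0
COEFS = {"age": -0.03, "glucose": 0.008, "chloride": -0.05,
         "calcium": -0.4, "urea_nitrogen": 0.01}
RANGES = {"age": (20, 90), "glucose": (50, 600), "chloride": (80, 120),
          "calcium": (6.0, 11.0), "urea_nitrogen": (5, 100)}


def contribution_floor(name: str) -> float:
    """Smallest contribution of a variable to the linear predictor on its axis."""
    low, high = RANGES[name]
    return min(COEFS[name] * low, COEFS[name] * high)


# Linear-predictor span of the most influential variable; it defines 100 points.
SCALE = max(abs(COEFS[n]) * (RANGES[n][1] - RANGES[n][0]) for n in COEFS)


def points(name: str, value: float) -> float:
    """Points for one variable: 0 at its lowest-risk end of the axis."""
    return 100.0 * (COEFS[name] * value - contribution_floor(name)) / SCALE


def probability_from_points(total_points: float) -> float:
    """Map total points back to the linear predictor, then to a probability."""
    floor_sum = sum(contribution_floor(n) for n in COEFS)
    linear_predictor = INTERCEPT + floor_sum + total_points * SCALE / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def probability_direct(patient: dict[str, float]) -> float:
    """The same probability computed directly from the logistic model."""
    linear_predictor = INTERCEPT + sum(COEFS[n] * patient[n] for n in COEFS)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Made-up patient values.
patient = {"age": 50, "glucose": 300, "chloride": 100,
           "calcium": 8.5, "urea_nitrogen": 25}
total = sum(points(n, patient[n]) for n in COEFS)
print(round(total, 1), round(probability_from_points(total), 3))
```

Summing points and reading the probability axis gives the same answer as evaluating the logistic model directly, which is why a nomogram can be used at the bedside without software.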

Fig. 3

ROC curves drawn in the internal validation (A1) and the external validation (A2). Red point: the optimal cut-off point of the ROC curve

Fig. 4

Calibration curves drawn in the internal validation (A1) and the external validation (A2)

The nomogram model’s clinical use

The nomogram model’s clinical benefits were assessed using the decision curve analysis. In both external and internal validation sets, interventions with probability thresholds between 0.2 and 0.4 could result in better prognoses based on the model (Fig. 5).
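Decision curve analysis plots net benefit against the probability threshold. Following the standard definition (as in the guide cited as [17]), net benefit at threshold pt is TP/n − FP/n × pt/(1 − pt), and it is compared with the "treat all" and "treat none" strategies. The sketch below uses invented predictions and outcomes, not study data.

```python
def net_benefit(probs: list[float], labels: list[int], threshold: float) -> float:
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(labels)
    true_pos = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive
    return true_pos / n - false_pos / n * weight


def net_benefit_treat_all(labels: list[int], threshold: float) -> float:
    """Net benefit of testing or treating everyone regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Invented predicted probabilities and outcomes (1 = DKA).
probs = [0.9, 0.7, 0.4, 0.3, 0.25, 0.1, 0.05, 0.6]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

for pt in (0.2, 0.3, 0.4):
    model_nb = net_benefit(probs, labels, pt)
    all_nb = net_benefit_treat_all(labels, pt)
    # "Treat none" always has a net benefit of 0.
    print(pt, round(model_nb, 4), round(all_nb, 4))
```

A model is useful at a given threshold when its net benefit exceeds both alternatives, which is how the threshold ranges in a decision curve are read.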

Fig. 5

(A2) Decision curve analysis (DCA) of the nomogram to predict the probability of diabetic ketoacidosis in the training cohort. (A1) DCA of the nomogram to predict the probability of diabetic ketoacidosis in the validation cohort

Discussion

Diabetic ketoacidosis (DKA) is an acute complication observed in patients with type 2 diabetes (T2D) that can lead to severe electrolyte disturbances, dehydration, and multiple organ failure, posing a life-threatening risk. Early detection plays a crucial role in improving prognosis and reducing medical expenses. The current diagnostic criteria for DKA in adults recommended by the American Diabetes Association are glucose > 250 mg/dL, arterial or venous pH < 7.3, bicarbonate < 10 mmol/L, urine or serum ketones positive, β-hydroxybutyrate > 3.0 mmol/L, anion gap > 10, and mental status alert [14]. If all the data covered by the diagnostic criteria were available, diagnosing DKA in clinical practice would be straightforward, and clinicians could diagnose it accurately at an early stage even without a predictive model. In our data, for example, a prediction model built on the variables in the diagnostic criteria achieved AUCs of 98% and 96% in internal and external validation, respectively.

However, despite the validity of the diagnostic criteria, the diagnosis and treatment of DKA patients are still delayed in clinical practice. This can be attributed to several factors. First, even in developed countries with a high prevalence of diabetes, urine or serum ketones, bicarbonate levels, and pH values are not examined thoroughly. A study conducted in Britain revealed that only 36% and 34% of individuals underwent urine ketone and blood ketone examinations, respectively [7]. Second, the results of pH and bicarbonate tests are not always reliable [8]. Third, infection and newly diagnosed diabetes have been reported as predisposing factors for DKA [3, 15]. Patients with infections and newly diagnosed diabetes often visit non-endocrine departments such as surgery and infectious diseases, and these departments may omit specialized tests such as blood and urine ketone bodies, pH, and bicarbonate [4], leading to delayed diagnosis and treatment. Fourth, reduced patient compliance undoubtedly also contributes to delayed diagnosis and treatment of DKA. Low socioeconomic status and unemployment are associated with lower outpatient attendance [16], so we need programs that minimize outpatient examinations. Fifth, many factors influence the assessment of mental status, and physicians' judgments of it vary widely.

In this study, we thoroughly considered the factors contributing to delayed diagnosis and treatment of DKA. When selecting input variables, we excluded those that may be overlooked by outpatient physicians, may reduce patient compliance, or may yield unstable results. These excluded variables encompassed urine or serum ketones, bicarbonate levels, pH values, β-hydroxybutyrate levels, anion gap measurements, mental status alertness assessments, islet function evaluations, and glycosylated hemoglobin tests. Instead, we opted for input variables that have previously been associated with DKA onset in the literature and that were statistically significant in our multivariable logistic analysis. The selected factors were age, glucose, chloride, calcium, and urea nitrogen levels. Although there was a decrease in predictive power, the internal validation yielded an AUC of 86%, while the external validation yielded 68%. By adjusting the probability threshold, this model can serve as a screening tool in outpatient settings.

Establishing a clinical prediction model requires consideration of patients' clinical benefits. In our study, a decision curve was drawn. The curve showed a threshold range of 0.2–0.8 in the internal validation and 0.1–0.4 in the external validation. Because the decision curve only provides a recommended range for the probability threshold, we have to choose the threshold based on our model's specific application value [17]. Our prediction model is a preliminary screening model, so its priority is to avoid missed diagnoses as far as possible. Although a low threshold will increase misdiagnosis, the model can still provide good clinical value as a screening tool. At the same time, adjusting the probability threshold can increase the model's generalization ability [18]. These measures compensate for the unsatisfactory AUC values in the external validation of our prediction model.
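The trade-off described above, fewer missed diagnoses at the cost of more false positives, can be made concrete by computing sensitivity and specificity at two thresholds. The predictions and outcomes below are invented for illustration and are not study data.

```python
def sensitivity_specificity(
    probs: list[float], labels: list[int], threshold: float
) -> tuple[float, float]:
    """Sensitivity and specificity when risk >= threshold counts as a positive screen."""
    true_pos = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    false_neg = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    true_neg = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 0)
    false_pos = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Invented predicted probabilities and outcomes (1 = DKA).
screen_probs = [0.9, 0.7, 0.4, 0.3, 0.25, 0.1, 0.05, 0.6]
screen_labels = [1, 1, 1, 0, 0, 0, 0, 0]

strict = sensitivity_specificity(screen_probs, screen_labels, 0.5)
lenient = sensitivity_specificity(screen_probs, screen_labels, 0.2)
print(strict)   # one DKA case is missed at the higher threshold
print(lenient)  # no DKA case is missed, but more non-DKA patients are flagged
```

In this toy example, lowering the threshold from 0.5 to 0.2 removes the missed case while flagging two additional non-DKA patients, which is the behavior a screening model is tuned for.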

Previously, Xie W and Shi J employed machine learning models and nomograms to predict mortality and length of stay for patients in intensive care units [19, 20]. Although the populations and conclusions examined in those studies differ from those in this study, the research methodologies are similar, which supports the applicability of the approach adopted here. Jiang Y and colleagues used nomograms to identify patients at high risk of diabetic ketoacidosis among those newly diagnosed with type 2 diabetes [21]. Although commendable results were obtained, that study is limited by its single-center design, small sample size, and absence of external validation. Qi M et al. employed logistic regression analysis to predict and promptly identify patients with diabetic ketoacidosis [22]. However, the lack of external validation, limited sample size, and poor model performance in that study hinder its potential for widespread adoption and application. In contrast, our study addresses these limitations by drawing on the strengths and weaknesses of previous research, expanding the sample size, and conducting external validation. Consequently, our findings hold greater clinical significance.

Our study found that T2D patients in the non-DKA group were older than those in the DKA group, both in the MIMIC database and in our hospital database. The onset of DKA in atypical T2D has been reported at younger ages [23]. However, that study had a small sample size and did not analyze the reasons. Following up the patients in our hospital database would help to clarify the reasons.

Although our prediction model has clinical significance, its performance in external validation was poor. This may reflect differences between the population in which the model was developed and the external validation population. In addition, the validation cohort (n = 91) is small compared with the training cohort (n = 1885); with such a small validation sample, the AUC estimate may not be accurate. Increasing the sample size would be meaningful in improving the nomogram model's performance.

Conclusion

The newly developed nomogram model serves as a convenient and accurate tool for predicting the presence or absence of DKA, showcasing its practical significance in clinical settings. By excluding inspection items that may compromise patient compliance or tests easily overlooked in outpatient clinics, the nomogram model's applicability is further enhanced.

Availability of data and materials

The data supporting Table 1 are publicly available as part of the MIMIC-III and MIMIC-IV databases. The data supporting Table 2 are not publicly available in order to protect patient privacy.

Abbreviations

DKA:

Diabetic ketoacidosis

MIMIC-III:

Medical Information Mart for Intensive Care

References

  1. Li Y, Teng D, Shi X, et al. Prevalence of diabetes recorded in mainland China using 2018 diagnostic criteria from the American diabetes association: national cross sectional study. BMJ. 2020;369:m997.

  2. Li Q, Lv L, Chen Y, Zhou Y. Early prediction models for prognosis of diabetic ketoacidosis in the emergency department: a protocol for systematic review and meta-analysis. Medicine (Baltimore). 2021;100(21):e26113.

  3. Xu Y, Bai J, Wang G, et al. Clinical profile of diabetic ketoacidosis in tertiary hospitals in China: a multicentre, clinic-based study. Diabet Med. 2016;33(2):261–8.

  4. Liu CC, Chen KR, Chen HF, et al. Association of doctor specialty with diabetic patient risk of hospitalization due to diabetic ketoacidosis: a national population-based study in Taiwan. J Eval Clin Pract. 2011;17(1):150–5.

  5. Bonora BM, Avogaro A, Fadini GP. Sodium-glucose co-transporter-2 inhibitors and diabetic ketoacidosis: an updated review of the literature. Diabetes Obes Metab. 2018;20(1):25–33.

  6. Lapolla A, Amaro F, Bruttomesso D, et al. Diabetic ketoacidosis: a consensus statement of the Italian Association of Medical Diabetologists (AMD), Italian Society of Diabetology (SID), Italian Society of Endocrinology and Pediatric Diabetology (SIEDP). Nutr Metab Cardiovasc Dis. 2020;30(10):1633–44.

  7. Auchterlonie A, Okosieme OE. Preventing diabetic ketoacidosis: do patients adhere to sick-day rules. Clin Med (Lond). 2013;13(1):120.

  8. Duan M, Wang W, Zhao H, et al. National surveys on internal quality control for blood gas analysis and related electrolytes in clinical laboratories of China. Clin Chem Lab Med. 2018;56(11):1886–96.

  9. Moons KG, Altman DG, Reitsma JB, et al. Transparent reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1-73.

  10. Diagnosis and classification of diabetes mellitus. Diabetes Care. 2014;37(Suppl 1):S81-90.

  11. ElSayed NA, Aleppo G, Aroda VR, et al. 2. Classification and diagnosis of diabetes: standards of care in diabetes-2023. Diabetes Care. 2023;46(Suppl 1):S19–40.

  12. Dhatariya KK. The management of diabetic ketoacidosis in adults-An updated guideline from the Joint British Diabetes Society for Inpatient Care. Diabet Med. 2022;39(6):e14788.

  13. Ren Y, Zhang L, Xu F, et al. Risk factor analysis and nomogram for predicting in-hospital mortality in ICU patients with sepsis and lung infection. BMC Pulm Med. 2022;22(1):17.

  14. Dhatariya KK, Glaser NS, Codner E, Umpierrez GE. Diabetic ketoacidosis. Nat Rev Dis Primers. 2020;6(1):40.

  15. Nunes R, Mota C, Lins P, et al. Incidence, characteristics and long-term outcomes of patients with diabetic ketoacidosis: a prospective prognosis cohort study in an emergency department. Sao Paulo Med J. 2021;139(1):10–7.

  16. Brewster S, Bartholomew J, Holt R, Price H. Non-attendance at diabetes outpatient appointments: a systematic review. Diabet Med. 2020;37(9):1427–42.

  17. Vickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res. 2019;3:18.

  18. Su B. Using metabolic and biochemical indicators to predict diabetic retinopathy by back-propagation artificial neural network. Diabetes Metab Syndr Obes. 2021;14:4031–41.

  19. Xie W, Li Y, Meng X, Zhao M. Machine learning prediction models and nomogram to predict the risk of in-hospital death for severe DKA: A clinical study based on MIMIC-IV, eICU databases, and a college hospital ICU. Int J Med Inform. 2023;174:105049.

  20. Shi J, Chen F, Zheng K, et al. Clinical nomogram prediction model to assess the risk of prolonged ICU length of stay in patients with diabetic ketoacidosis: a retrospective analysis based on the MIMIC-IV database. BMC Anesthesiol. 2024;24(1):86.

  21. Jiang Y, Zhu J, Lai X. Development and validation of a risk prediction model for ketosis-prone type 2 diabetes mellitus among patients newly diagnosed with type 2 diabetes mellitus in China. Diabetes Metab Syndr Obes. 2023;16:2491–502.

  22. Qi M, Shao X, Li D, et al. Establishment and validation of a clinical model for predicting diabetic ketosis in patients with type 2 diabetes mellitus. Front Endocrinol (Lausanne). 2022;13:967929.

  23. Tan H, Zhou Y, Yu Y. Characteristics of diabetic ketoacidosis in Chinese adults and adolescents – a teaching hospital-based analysis. Diabetes Res Clin Pract. 2012;97(2):306–12.

Acknowledgements

The authors thank Li Hui for translating the paper into English.

Funding

None.

Author information

Contributions

Bo Su completed the statistical analysis and article writing. Hui Li completed all the data collection. Gui Zhong Li guided the writing of the paper and its creative ideas.

Corresponding author

Correspondence to Gui Zhong Li.

Ethics declarations

Ethics approval and consent to participate

The need for written informed consent was waived by the Aviation General Hospital ethics committee due to the retrospective nature of the study. Because all patients’ identities are anonymous, signing the informed consent is not required. This retrospective study was executed in compliance with the Declaration of Helsinki and approved by the Ethics Committee of Aviation General Hospital.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Li, H., Su, B. & Li, G.Z. Development and validation of a nomogram for screening patients with type 2 diabetic ketoacidosis. BMC Endocr Disord 24, 148 (2024). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12902-024-01677-3

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12902-024-01677-3
