A Novel Hyperparameter Search Approach for Accuracy and Simplicity in Disease Prediction Risk Scoring

Abstract

Disease risk scores are vital for patient stratification and resource allocation. Existing scoring approaches often produce scores on unwieldy scales. To overcome this, we introduce a novel hyperparameter search technique that simplifies the scores while maintaining high accuracy.

Materials and Methods: The risk scores, generated by our proposed technique in conjunction with different predictor discretization methods, were evaluated in two case studies: predicting diabetic retinopathy (DR) and predicting hip fracture readmission (HFR). Our cohorts comprised 97,876 diabetic patients, including 3,749 with DR, and 18,065 hip fracture patients, of whom 2,055 were readmitted within 30 days.

Results: Our scores achieve accuracy not significantly different from that of existing approaches, with an AUC of up to 0.814 for DR prediction (varying with the predictor discretization method used) and up to 0.638 for HFR prediction. Regarding scale, our scores range from 0 to a maximum of 42 for DR and from 0 to a maximum of 10 for HFR, whereas the risk scores produced by other methods often range into the hundreds or thousands.

Discussion: The DR and HFR case studies demonstrate that our approach can potentially serve as a general framework for developing simpler yet accurate risk scores for disease prediction. Furthermore, our new DR risk score system can be a competitive alternative to the state-of-the-art DR risk score, while our HFR case study presents the first risk score for this condition.

Conclusion: Our novel hyperparameter search approach yields simple and accurate risk scores, fostering ease of use for healthcare professionals in risk stratification.
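
To make the two quantities in the abstract concrete (a small integer score scale and the AUC), here is a minimal Python sketch. It is not the paper's hyperparameter search; it only illustrates how a point-based score over discretized predictors might be computed and evaluated. The predictor names, cut points, and point values are hypothetical and chosen purely for illustration.

```python
from bisect import bisect_right

# Hypothetical point table (NOT taken from the paper): for each predictor,
# a list of cut points and the integer points assigned to each resulting bin.
POINTS = {
    "age": ([50, 65], [0, 2, 4]),        # bins: <50, 50-64, >=65
    "hba1c": ([7.0, 9.0], [0, 3, 6]),    # bins: <7.0, 7.0-8.9, >=9.0
}

def risk_score(patient):
    """Sum the integer points of the bin each predictor value falls into."""
    total = 0
    for name, (cuts, pts) in POINTS.items():
        total += pts[bisect_right(cuts, patient[name])]
    return total

def auc(scores, labels):
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# The largest attainable score defines the top of the score scale.
MAX_SCORE = sum(max(pts) for _, pts in POINTS.values())
```

With this toy table the scale runs from 0 to `MAX_SCORE`, so a clinician can add up a handful of small integers by hand, which is the kind of simplicity the abstract is concerned with.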

Publication
Journal of the American Medical Informatics Association, 31(8):1763–1773
Yajun Lu
Assistant Professor of Analytics & Operations Management

My research interests are broadly in Business Analytics, Healthcare Analytics and Operations, Graph-based Data Mining, Social Media Analytics, Network Optimization, and Artificial Intelligence.