Background: In the digital era, mHealth has emerged as an important venue for health care, and computational methods such as machine learning have proven to be powerful tools for detecting or predicting various medical conditions, often with improved accuracy over conventional statistical or expert-based systems. Symptoms are often indicators of abnormal changes in body functioning due to illness or side effects of medical treatment. A real-time symptom report is a report of the symptoms that patients are experiencing at the time of reporting. Machine learning that integrates real-time, patient-centered symptom reports with real-time clinical analytics to develop real-time precision prediction may improve early detection of lymphedema and long-term clinical decision support for breast cancer survivors, who face a lifelong risk of lymphedema. Lymphedema, which is associated with more than 20 distressing symptoms, is one of the most distressing and dreaded late adverse effects of breast cancer treatment. There is currently no cure for lymphedema, but early detection can help patients receive timely intervention to manage it effectively. Because lymphedema can occur immediately after cancer surgery or as late as 20 years after surgery, real-time detection using machine learning is paramount for timely detection that can reduce the risk of progression to chronic or severe stages. This study appraised the accuracy, sensitivity, and specificity of machine learning algorithms in detecting lymphedema status based on real-time symptom reports.
Methods: A web-based study was conducted to collect patients' real-time reports of symptoms using an mHealth system. Data on demographic and clinical information, lymphedema status, and symptom features were collected. A total of 355 patients from 45 states in the US completed the study. Statistical and machine learning procedures were performed for data analysis. The performance of five well-known machine learning classification algorithms was compared: C4.5 decision tree, C5.0 decision tree, gradient boosting model (GBM), artificial neural network (ANN), and support vector machine (SVM). Each classification algorithm has certain user-definable hyperparameters. Five-fold cross-validation was used to tune these hyperparameters, selecting the values that led to the highest average cross-validation accuracy.
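The hyperparameter selection step described in the Methods can be illustrated with a minimal sketch. It is not the study's code: the k-nearest-neighbor classifier is a simple stand-in for the five algorithms that were actually compared, the two-feature data are synthetic, and the names `make_folds`, `cv_accuracy`, and `select_hyperparameter` are hypothetical. The sketch only shows the general pattern of scoring each candidate hyperparameter value by its average accuracy over five held-out folds and keeping the best one.

```python
import random
from collections import Counter

def knn_predict(train_x, train_y, x, n_neighbors):
    """Stand-in classifier (not from the study): majority vote of nearest neighbors."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(tx, x)), ty)
        for tx, ty in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]

def make_folds(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(xs, ys, n_neighbors, k=5, seed=0):
    """Average held-out accuracy over k folds for one hyperparameter value."""
    scores = []
    for held_out in make_folds(len(xs), k, seed):
        held = set(held_out)
        train_x = [xs[i] for i in range(len(xs)) if i not in held]
        train_y = [ys[i] for i in range(len(xs)) if i not in held]
        correct = sum(
            knn_predict(train_x, train_y, xs[i], n_neighbors) == ys[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)

def select_hyperparameter(xs, ys, candidates, k=5):
    """Return the candidate with the highest average cross-validation accuracy."""
    scored = {c: cv_accuracy(xs, ys, c, k) for c in candidates}
    best = max(scored, key=scored.get)
    return best, scored

# Synthetic two-feature data for illustration only (not patient data).
rng = random.Random(42)
xs = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(30)]
xs += [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(30)]
ys = [0] * 30 + [1] * 30  # 0 = no lymphedema, 1 = lymphedema (toy labels)

best, scored = select_hyperparameter(xs, ys, candidates=[1, 3, 5])
print("average CV accuracy per candidate:", scored)
print("selected n_neighbors:", best)
```

The same loop applies to any classifier with tunable settings: only `knn_predict` and the candidate list would change.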
Results: Comparing different algorithms using machine learning procedures is feasible. The ANN achieved the best performance for detecting lymphedema, with an accuracy of 93.75%, sensitivity of 95.65%, and specificity of 91.03%.
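For readers less familiar with the three reported metrics, the following minimal sketch shows how accuracy, sensitivity, and specificity are conventionally computed from true and predicted labels. The function name `detection_metrics` and the ten toy labels are hypothetical and chosen only for illustration; they do not reproduce the study's data or results.

```python
def detection_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly ruled out
    }

# Toy labels for illustration only: 4 positive cases, 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

In this toy example one positive case is missed and one negative case is flagged, so sensitivity is 3/4, specificity is 5/6, and accuracy is 8/10.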
Conclusions: A well-trained ANN classifier using real-time symptom reports can provide highly accurate detection of lymphedema. Such detection accuracy is significantly higher than that achievable by commonly used clinical methods such as bio-impedance analysis. A well-trained classification algorithm that detects lymphedema based on symptom features is a highly promising tool that may improve lymphedema outcomes.