Two Staged Prediction of Gastric Cancer Patient’s Survival Via Machine Learning Techniques presented at CNSA 2020

by Peng Liu, Chen Bo Yu, Liuwen Li, and Shumin Fei

Summary: Cancer is one of the most common causes of death worldwide, and gastric cancer has the highest incidence in Asia. Predicting gastric cancer patients’ survivability can inform patient care decisions and help doctors prescribe personalized medicine. Classification techniques have been widely used to predict the survivability of cancer patients. However, very little attention has been paid to patients who are not expected to survive. In this research, we treat survival prediction as a two-stage problem. The first stage predicts the patient’s five-year survivability. If the predicted outcome is death, the second stage predicts the patient’s remaining lifespan. We propose a custom ensemble method that integrates multiple machine learning algorithms. It exhibits a significant predictive improvement in both stages of prediction compared with state-of-the-art machine learning techniques. The base machine learning techniques include Decision Trees, Random Forest, AdaBoost, Gradient Boosting Machine (GBM), Artificial Neural Network (ANN), and the popular GBM framework LightGBM. The model is comprehensively evaluated on open-source cancer data provided by the Surveillance, Epidemiology, and End Results (SEER) Program. The classification stage is evaluated in terms of accuracy, area under the curve (AUC), F-score, precision, recall, and training and prediction time; the regression stage is evaluated in terms of Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²).
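
The two-stage routing described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not say how the custom ensemble combines its base learners, so majority voting (stage one) and simple averaging (stage two) are assumed here as stand-ins. The feature layout (age, tumor size), the toy rule-based models, and all function names are hypothetical and carry no clinical meaning. The sketch also includes the three regression metrics named in the summary.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Optional, Sequence

Features = Sequence[float]  # hypothetical layout: (age_years, tumor_size_mm)


@dataclass
class TwoStagePrediction:
    survives_five_years: bool
    remaining_months: Optional[float]  # set only when stage one predicts death


def majority_vote(classifiers: Sequence[Callable[[Features], bool]]):
    """Assumed stage-one ensemble: strict majority of base classifiers."""
    def vote(x: Features) -> bool:
        votes = [clf(x) for clf in classifiers]
        return 2 * sum(votes) > len(votes)
    return vote


def mean_ensemble(regressors: Sequence[Callable[[Features], float]]):
    """Assumed stage-two ensemble: plain average of base regressors."""
    def average(x: Features) -> float:
        return sum(reg(x) for reg in regressors) / len(regressors)
    return average


def predict_two_stage(x, classify, regress) -> TwoStagePrediction:
    # Stage one: five-year survivability (True means predicted to survive).
    if classify(x):
        return TwoStagePrediction(True, None)
    # Stage two runs only for patients predicted not to survive.
    return TwoStagePrediction(False, regress(x))


def rmse(y_true, y_pred) -> float:
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred) -> float:
    # Undefined (division by zero) when all y_true values are identical.
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy stand-ins for trained base models (illustrative rules only).
classify = majority_vote([
    lambda x: x[0] < 65,
    lambda x: x[1] < 30,
    lambda x: x[0] + x[1] < 100,
])
regress = mean_ensemble([
    lambda x: 60 - 0.5 * x[1],
    lambda x: 48 - 0.25 * x[0],
])

patient_a = predict_two_stage((50, 20), classify, regress)
patient_b = predict_two_stage((70, 40), classify, regress)
print(patient_a)
print(patient_b)

# Regression metrics on made-up remaining-lifespan values (months).
y_true, y_pred = [10, 20, 30], [12, 18, 33]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

In practice the lambdas would be replaced by fitted models such as the Decision Tree, Random Forest, or LightGBM learners listed in the summary, and the stage-two regressor would be trained only on patients who did not survive five years.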