COVID-19 survival analysis of a dataset and prediction using Python (sklearn, pandas, numpy, matplotlib, lifelines, mlxtend, joblib)
The COVID-19 pandemic has hit society hard over the last two years.
We have simulated a dataset of patients with COVID-19 who were admitted to hospitals during that period.
The dataset contains records of patients diagnosed with COVID-19 and admitted to different hospitals: age, sex, days in hospital, days in ICU, exitus (death), destination after admission to the ER, and several medical parameters collected when they were first admitted to the ER (temperature, heart rate, blood glucose, O2 saturation, systolic blood pressure, and diastolic blood pressure).
The assignment consists of two parts, which must be clearly differentiated in the report that you have to prepare:
- Project plan.
- Technical report, which must focus on the analysis of the data. You must include univariate and bivariate analysis, survival curves (e.g., Kaplan–Meier), and any other analysis that helps to understand patient survival. You must also train and test different models to predict survival; the most important part of the technical work, however, is the preceding analysis.
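
As a hedged illustration of the survival-curve step, the sketch below computes the Kaplan–Meier estimate in plain Python so that the arithmetic is visible. It assumes that days in hospital is used as the duration and exitus as the event indicator, and that patients who leave without exitus are treated as censored; this is a simplifying assumption, not something the dataset description prescribes. The function name `kaplan_meier`, the variable names, and the six sample values are hypothetical and made up for illustration. In practice, the `KaplanMeierFitter` class from lifelines would normally be used instead.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] for each distinct time t with at least one event.

    durations: time under observation for each patient (e.g. days in hospital).
    events:    1 if the event (exitus) was observed, 0 if the patient is censored.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # every patient leaves the risk set at their own time

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Kaplan-Meier product-limit step: S(t) = S(t-) * (1 - d_t / n_t)
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # remove both deaths and censored patients
    return curve


# Made-up illustrative values, NOT taken from the assignment dataset.
days_in_hospital = [3, 5, 5, 8, 10, 12]
exitus = [1, 0, 1, 1, 0, 0]

curve = kaplan_meier(days_in_hospital, exitus)
for t, s in curve:
    print(f"day {t}: estimated survival {s:.3f}")
```

The estimate only drops at days where an exitus is observed; censored patients lower the number at risk without lowering the curve. Comparing such curves across groups (for example by sex or age band) is one way to connect this step with the bivariate analysis.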