Data-Science-Project-on-Obesity-Dataset

This is a Data Science project for my Data Science for Software Engineers (Course Code 544) class, where I achieved a grade of A. I used machine learning models to predict a patient's level of obesity, then searched for the models and parameters that gave the highest prediction accuracy. The Obesity Risk Dataset contains key attributes of individuals, such as age, gender, diet, and level of activity, along with each individual's level of obesity. In total there are 16 features, excluding the ID number, and the target vector is the individual's level of obesity. The goal of this project is to use these attributes to estimate an individual's risk of obesity.
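As a rough sketch of how a table like this might be split into a feature matrix and a target vector, the snippet below builds a tiny made-up DataFrame. The column names (`id`, `Age`, `Gender`, `Activity`, `ObesityLevel`) and all values are hypothetical placeholders, not the actual Kaggle columns, and the scaling and one-hot encoding choices are assumptions rather than a record of what the notebook does.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical miniature stand-in for the dataset; the real columns differ.
df = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "Age": [21, 35, 48, 29],
    "Gender": ["Female", "Male", "Male", "Female"],
    "Activity": [2.0, 0.5, 1.0, 3.0],
    "ObesityLevel": ["Normal", "Obese", "Overweight", "Normal"],
})

# Drop the ID number (it is excluded from the features) and separate the target.
X = df.drop(columns=["id", "ObesityLevel"])
y = df["ObesityLevel"]

# Scale numeric columns and one-hot encode categorical ones (assumed choices).
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["Age", "Activity"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Gender"]),
])
X_encoded = preprocess.fit_transform(X)
print(X_encoded.shape)
```

On the real data the same pattern would apply to all 16 features, with the numeric and categorical column lists adjusted to match the actual schema.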

The three models compared were LogisticRegression, KNeighborsClassifier, and SVC. The best-performing model for this dataset was SVC, with a training score of 0.87 and a validation score of 0.88. I attribute SVC's advantage to the dimensionality and feature count of the data. The optimal parameters for SVC were svc_c = 100 and svc_gamma = 0.01 (the regularization parameter C and the kernel coefficient gamma).
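A comparison of this kind can be sketched with scikit-learn's `GridSearchCV`. The example below is a minimal illustration, not the notebook's code: it uses synthetic data from `make_classification` in place of the Kaggle dataset, and the parameter grids, the 3-fold cross-validation, and the `StandardScaler` step are all assumptions. Parameter names follow scikit-learn's `step__parameter` convention for pipelines.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data with 16 features (NOT the real obesity dataset).
X, y = make_classification(n_samples=400, n_features=16, n_informative=8,
                           n_classes=4, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Each candidate maps to (estimator, parameter grid); grids are illustrative.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"logreg__C": [0.1, 1, 10]}),
    "knn": (KNeighborsClassifier(), {"knn__n_neighbors": [3, 5, 11]}),
    "svc": (SVC(), {"svc__C": [1, 10, 100], "svc__gamma": [0.01, 0.1]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scale features first so distance- and kernel-based models behave sensibly.
    pipe = Pipeline([("scale", StandardScaler()), (name, estimator)])
    search = GridSearchCV(pipe, grid, cv=3)
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "train_score": search.score(X_train, y_train),
        "val_score": search.score(X_val, y_val),
    }

for name, r in results.items():
    print(name, r["best_params"],
          round(r["train_score"], 2), round(r["val_score"], 2))
```

The scores printed here describe the synthetic data only and say nothing about the results reported above; the point is the structure of tuning each model on the training split and comparing on a held-out validation split.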

The dataset is available on Kaggle as the Obesity Risk Dataset.

Repository files: ENSF544_Project.ipynb, README.md

Alexeygrekov/Data-Science-Project-on-Obesity-Dataset
