This project is a machine learning-based classifier designed to analyze and categorize cosmic data. It leverages the CatBoost classifier along with various preprocessing techniques to ensure robust and accurate classification.
- Ayush Saksena (cogni2047191)
- Prince Raj (cogni2047190)
- Tanishka Nibariya (cogni2047075)
- Ratan Jyoti Jaiswal (cogni2047342)
- Data preprocessing using pandas, numpy, sklearn
- Robust scaling and imputation techniques
- CatBoost for efficient and high-performance classification
- Performance evaluation using accuracy metrics
All necessary libraries are already listed in the Python notebook, so there is no need to install additional dependencies.
- Open Google Colab and upload the Python notebook.
  Go to File -> Upload notebook -> Upload.
- Upload the train and test datasets from the GitHub repository to the session storage manually.
  Go to Files on the sidebar -> Upload -> select the train and test dataset files from the GitHub repository.
- Run through the cells to train and evaluate the model.
The dataset files must be uploaded manually each session via the file upload dialog.
Ensure the files are named exactly as in the GitHub repository before loading them into pandas.
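
A quick check along these lines can catch a missing or misnamed upload before the loading cells run. It is only a sketch: `train.csv` and `test.csv` are hypothetical placeholders, so substitute the exact file names used in the repository.

```python
from pathlib import Path

# Hypothetical names; replace with the exact file names from the repository.
EXPECTED_FILES = ["train.csv", "test.csv"]


def missing_files(directory=".", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


missing = missing_files()
if missing:
    print("Upload these files before running the loading cells:", missing)
else:
    print("All dataset files found.")
```

The comparison is on exact names, so a browser-renamed upload such as `train (1).csv` would be reported as missing.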
The model's performance is evaluated based on accuracy and other metrics, with visualizations powered by matplotlib and seaborn.
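
As one possible illustration of this kind of evaluation, the sketch below computes accuracy and draws a confusion matrix heatmap. The labels are made-up toy values, and the choice of a confusion matrix is an assumption; the notebook may use different metrics and plots.

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy labels for illustration only (not results from the actual model).
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

accuracy = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted
print(f"Accuracy: {accuracy:.2f}")

# Annotated heatmap of the confusion matrix.
fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", ax=ax)
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
ax.set_title("Confusion matrix")
fig.tight_layout()
plt.show()
```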
The Google Colab notebook will prompt you to download the Submissions.csv file towards the end of the execution. The same CSV file is also provided in the repository.
Feel free to fork the repository and submit pull requests to improve the classifier.