A machine learning system for air quality index (AQI) prediction using Apache Spark MLlib with data visualization for model performance comparison.
This project uses Spark MLlib to build predictive models for AQI forecasting. The system processes historical AQI data together with relevant environmental factors to predict future air quality levels.
- Data preprocessing pipeline for AQI datasets
- Machine learning models implemented with Spark MLlib
- Visualization tools for model comparison and performance analysis
- Prediction evaluation metrics
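
The evaluation metrics are not named above. As a minimal sketch, assuming the models are regressors scored with RMSE, MAE, and R² (metrics that Spark MLlib's `RegressionEvaluator` exposes as `rmse`, `mae`, and `r2`), the plain-Python functions below show what each number measures. The function names and sample values are hypothetical, and a real run would compute these on a Spark DataFrame rather than on Python lists.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more than small ones."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: average miss, in AQI units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: share of variance explained by the model."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical AQI values, for illustration only.
actual_aqi = [50.0, 80.0, 120.0, 150.0]
predicted_aqi = [55.0, 75.0, 110.0, 160.0]

print(f"RMSE={rmse(actual_aqi, predicted_aqi):.3f}")
print(f"MAE={mae(actual_aqi, predicted_aqi):.3f}")
print(f"R2={r2(actual_aqi, predicted_aqi):.3f}")
```

RMSE and MAE are both in AQI units, so they can be read directly against the index scale, while R² is unitless and is better suited to comparing models on the same dataset.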
- Apache Spark
- Spark MLlib
- Data visualization libraries
- Python
The system includes visualization components to compare different prediction models, allowing for:
- Performance comparison across multiple algorithms
- Error analysis and distribution visualization
- Feature importance evaluation
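
The comparison and error-distribution views can be sketched as the data that would sit behind such plots. This is an illustration under assumptions, not the project's implementation: the model names `model_a` and `model_b`, the helper names, the sample values, and the 5-point bin width are all hypothetical, and the plotting call itself is left out because the visualization library is not specified.

```python
import math
from collections import Counter


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rank_models(actual, predictions_by_model):
    """Return (model_name, rmse) pairs sorted from lowest to highest RMSE."""
    scores = {name: rmse(actual, preds) for name, preds in predictions_by_model.items()}
    return sorted(scores.items(), key=lambda item: item[1])


def residual_histogram(actual, predicted, bin_width=5.0):
    """Count residuals (actual - predicted) per bin, keyed by the bin's lower edge."""
    counts = Counter(
        math.floor((a - p) / bin_width) * bin_width for a, p in zip(actual, predicted)
    )
    return dict(sorted(counts.items()))


# Hypothetical observations and predictions, for illustration only.
actual_aqi = [50.0, 80.0, 120.0, 150.0]
predictions_by_model = {
    "model_a": [60.0, 70.0, 135.0, 140.0],
    "model_b": [55.0, 75.0, 110.0, 160.0],
}

ranking = rank_models(actual_aqi, predictions_by_model)
histogram = residual_histogram(actual_aqi, predictions_by_model["model_b"])

for name, score in ranking:
    print(f"{name}: RMSE={score:.3f}")
print(histogram)
```

The ranking maps naturally onto a bar chart with one bar per algorithm, and the residual counts onto a histogram per model. A histogram skewed to one side would indicate that the model systematically over- or under-predicts.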
- Apache Spark
- Python 3.9+