This project implements a machine learning pipeline to predict forest fire occurrence in two Algerian regions using meteorological and Fire Weather Index (FWI) data. It demonstrates the full data science workflow—from exploration and feature engineering to model training, evaluation, and deployment as a web application.
Forest fires are a major threat to ecosystems and human livelihoods. This project applies supervised machine learning, specifically Ridge Regression, to meteorological and environmental factors to forecast the likelihood of fire events. The end product is a web application in which users enter weather data and receive a prediction.
The dataset comprises 244 daily observations (June–September 2012) from the Bejaia and Sidi Bel-Abbes regions of Algeria.
- Source: UCI Algerian Forest Fires Dataset
- Instances: 244 (122 per region)
- Target Classes: Fire, Not Fire
- Attributes: 11 features + 1 target (see below)
Feature | Description |
---|---|
Temperature | Maximum daily temperature (°C) |
RH | Relative Humidity (%) |
Ws | Wind Speed (km/h) |
Rain | Rainfall (mm) |
FFMC | Fine Fuel Moisture Code |
DMC | Duff Moisture Code |
DC | Drought Code |
ISI | Initial Spread Index |
BUI | Buildup Index |
FWI | Fire Weather Index |
Region | 0: Bejaia, 1: Sidi Bel-Abbes |
Classes | 1: Fire, 0: Not Fire (project encoding) |
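The project encoding in the table can be reproduced with a small pandas helper. This is a minimal sketch, assuming the raw `Classes` column holds the strings "fire" and "not fire", possibly with stray whitespace or mixed case; the `encode_labels` helper and the sample rows are hypothetical and not taken from the project code or the dataset.

```python
import pandas as pd


def encode_labels(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with Classes mapped to 1 (fire) / 0 (not fire)."""
    out = df.copy()
    # Normalize the raw labels first: strip padding and lowercase them.
    cleaned = out["Classes"].astype(str).str.strip().str.lower()
    # Any label outside the mapping becomes NaN, which makes bad rows easy to spot.
    out["Classes"] = cleaned.map({"fire": 1, "not fire": 0})
    return out


# Illustrative rows only (assumed values, not real observations).
raw = pd.DataFrame(
    {
        "Temperature": [29, 35, 31],
        "Region": [0, 0, 1],  # 0: Bejaia, 1: Sidi Bel-Abbes
        "Classes": ["not fire   ", "fire", "Fire "],
    }
)
encoded = encode_labels(raw)
```

Checking `encoded["Classes"].isna().sum()` after encoding is a quick way to confirm that every label was recognized.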
- Exploratory Data Analysis: Outlier removal, feature relationships, class distribution.
- Feature Engineering: Correlation analysis revealed strong predictors (e.g., FFMC, Temperature, RH).
- Preprocessing: Standard scaling applied to numeric inputs.
- Model Training: Ridge regression chosen for its regularization strengths.
- Deployment: Saved model and scaler used in a Flask application for real-time user predictions.
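The preprocessing, training, and deployment steps above can be sketched with scikit-learn. This is an illustration under stated assumptions, not the project's actual code: the data is synthetic, the three columns only stand in for inputs such as Temperature, RH, and FFMC, the 0.5 decision threshold on the Ridge output is an assumption, and `predict_fire` is a hypothetical helper of the kind a Flask view might call.

```python
import pickle

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): three numeric inputs and a 0/1 label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - X[:, 1] + X[:, 2] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the scaler on the training split only, then train Ridge on scaled inputs.
scaler = StandardScaler().fit(X_train)
model = Ridge(alpha=1.0).fit(scaler.transform(X_train), y_train)

# Serialize both objects. A deployment would write these bytes to files;
# they are kept in memory here so the sketch has no side effects.
scaler_blob = pickle.dumps(scaler)
model_blob = pickle.dumps(model)


def predict_fire(features, threshold=0.5):
    """Load the saved scaler and model, then score a single observation."""
    loaded_scaler = pickle.loads(scaler_blob)
    loaded_model = pickle.loads(model_blob)
    row = np.asarray(features, dtype=float).reshape(1, -1)
    score = float(loaded_model.predict(loaded_scaler.transform(row))[0])
    # Ridge returns a continuous score; thresholding it is an assumption here.
    return int(score >= threshold), score


# Held-out accuracy of the thresholded predictions on the synthetic data.
test_pred = (model.predict(scaler.transform(X_test)) >= 0.5).astype(int)
accuracy = float((test_pred == y_test).mean())
```

Saving the scaler alongside the model matters because user input must be transformed with the same mean and variance that were learned during training.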
- Accuracy: ~86–90% for most classification models; Ridge Regression chosen for balance of performance and simplicity.
- Key Predictors:
  - FFMC and Temperature (positive correlation with fires)
  - Relative Humidity and Rain (negative correlation with fires)
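Correlation signs like those listed above can be read from a pandas correlation matrix. The sketch below uses synthetic data generated so that the label rises with temperature and falls with humidity; the numbers are assumptions for illustration and say nothing about the real dataset's correlation values.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption): risk increases with temperature, decreases with RH.
rng = np.random.default_rng(42)
n = 500
temperature = rng.normal(32, 4, n)
rh = rng.normal(60, 15, n)
latent = 0.3 * (temperature - 32) - 0.1 * (rh - 60) + rng.normal(0, 1, n)

df = pd.DataFrame(
    {
        "Temperature": temperature,
        "RH": rh,
        "Classes": (latent > 0).astype(int),
    }
)

# Pearson correlation of each feature with the encoded target, strongest first.
corr_with_target = (
    df.corr()["Classes"].drop("Classes").sort_values(ascending=False)
)
print(corr_with_target)
```

On the real data, the same `df.corr()["Classes"]` call would rank all eleven features against the encoded target.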
- Temperature and the FFMC index are the most indicative of fire risk.
- Rainfall and Relative Humidity serve as protective factors.
- Machine learning can enable rapid, data-driven risk assessment for early warning.
- Python: Core language
- pandas, numpy: Data handling
- scikit-learn: Machine Learning
- Flask: Web framework
- HTML/CSS: Frontend templating
- Wildfire Early Warning: For government and forestry services
- Resource Management: Firefighting resource allocation
- Climate Risk Evaluation: Policy and prevention planning