Skip to content

GboyeStack-Robotics-ML-Engineer/SPAM-DETECTION

Repository files navigation

Spam Detection System

Overview

The Spam Detection System is a machine learning-based application designed to identify and filter spam messages. Leveraging PyTorch and PyTorch Lightning, the system provides an efficient pipeline for training, evaluating, and deploying spam detection models. The project includes a user-friendly Streamlit interface for seamless interaction, making it accessible for developers and non-technical users alike.


Features

  • Custom Data Handling: A tailored PyTorch data generator and DataModule for preprocessing datasets and ensuring compatibility with machine learning pipelines.
  • Distributed Training: Optimized for scalable training and inference using PyTorch Lightning.
  • Interactive Interface: A Streamlit-based app for easy data uploads, training configuration, and real-time visualizations of model performance.
  • High Accuracy: Hyperparameter optimization and robust evaluation metrics ensure reliable spam detection.
  • Modular Design: The project structure supports customization and future enhancements.

Installation

  1. Clone this repository:
    git clone https://github.com/YourUsername/spam-detection-system.git
    cd spam-detection-system
  2. Create a virtual environment and activate it:
    python -m venv venv  
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install the required dependencies:
    pip install -r requirements.txt

Usage

  1. Prepare Your Data

    • Upload a labeled dataset in CSV format. Ensure it contains message texts and corresponding labels.
  2. Run the Streamlit App

    • Launch the app for training and inference:
      streamlit run Main.py
    • Use the app to configure hyperparameters, monitor training progress, and evaluate results.
  3. Train the Model

    • Use the app to train the model with your dataset, fine-tune hyperparameters, and observe performance metrics.
  4. Model Inference

    • Test the trained model using sample inputs to predict spam or legitimate messages.

Project Structure

  • DataGenerator.py: Custom data loader to tokenize messages and generate input features for the model.
  • prepare_data_module.py: PyTorch Lightning DataModule for handling training, validation, and test datasets.
  • Home.py: Streamlit app homepage with project description and features.

Technologies Used

  • Programming Languages: Python
  • Frameworks: PyTorch, PyTorch Lightning, Streamlit
  • Visualization Tools: TensorBoard, Streamlit

Project Demo

Screenshot (121) Link : https://mini-spam-detector-analyser.streamlit.app

Future Enhancements

  • Integrate more advanced NLP models such as transformers for enhanced performance.
  • Extend the app for multi-language spam detection.
  • Deploy the application as a web service for broader accessibility.

Contributing

Contributions are welcome! If you'd like to contribute, please fork the repository, create a feature branch, and submit a pull request.


License

This project is licensed under the MIT License. See the LICENSE file for details.


Acknowledgments

Special thanks to all open-source contributors whose libraries and frameworks made this project possible.


About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages