maxikuehn/PMDS_Codons

Synonymous Codon Prediction

Python PyTorch NumPy Pandas Matplotlib Jupyter Notebook Gitea

Project Medical Data Science - SS 2024

Insa Belter, Maximilian Kühn, Felix Mucha, Nils Rekus, Floris Wittner

Installation

Requirements

  • Python 3.8 or higher (Python 3.12 is recommended). Installation instructions are available on the official Python website.
  • pip Python package installer (usually included in Python installations)

Setup

  1. Clone the repository
  2. Create a virtual environment
    python -m venv .venv
  3. Activate the virtual environment (Linux/macOS; on Windows, run .venv\Scripts\activate instead)
    source .venv/bin/activate
  4. Windows only: to use CUDA for NVIDIA GPU acceleration, install PyTorch first with the following command (for more information, see the PyTorch Installation Guide)
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
  5. Install all other requirements
    pip install -r requirements.txt
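
Once the requirements are installed, a quick check can confirm that PyTorch is importable and whether a CUDA device is visible. The snippet below is a minimal sketch and not part of the repository; the function name `describe_torch_setup` is hypothetical.

```python
import importlib.util


def describe_torch_setup() -> str:
    """Return a one-line summary of the local PyTorch installation."""
    # find_spec returns None when the package is not installed
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch

    # is_available() is False for CPU-only builds or when no usable GPU is found
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"torch {torch.__version__} using {device}"


if __name__ == "__main__":
    print(describe_torch_setup())
```

If the summary reports `cpu` on a Windows machine with an NVIDIA GPU, the CUDA wheel from step 4 was probably not installed before the remaining requirements.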

Project Content

Folder Structure

.
├── data
│   └── organism
│       └── cleanedData.pkl
├── ml_models
│   └── organism
│       ├── best_rnn_model.pt
│       ├── best_tcn_model.pt
│       └── best_encoder_model.pt
├── notebooks
│   ├── archive
│   └── *.ipynb
├── scripts
│   ├── archive
│   └── *.py
├── unit_tests
├── README.md
└── requirements.txt
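
The layout above implies one data file and up to three model checkpoints per organism. The following sketch shows how those paths could be built in one place. The helper names `data_path` and `model_path` are hypothetical, and the organism folder name used in the example is an assumption, since the tree only shows the placeholder `organism`.

```python
from pathlib import Path

# Architecture tags taken from the checkpoint file names in the tree above
ARCHITECTURES = ("rnn", "tcn", "encoder")


def data_path(organism: str, root: Path = Path(".")) -> Path:
    """Path to the cleaned data pickle of one organism."""
    return Path(root) / "data" / organism / "cleanedData.pkl"


def model_path(organism: str, architecture: str, root: Path = Path(".")) -> Path:
    """Path to the best checkpoint of one architecture for one organism."""
    if architecture not in ARCHITECTURES:
        raise ValueError(f"unknown architecture: {architecture!r}")
    return Path(root) / "ml_models" / organism / f"best_{architecture}_model.pt"


if __name__ == "__main__":
    # "E.Coli" is an assumed folder name, used only for illustration
    print(data_path("E.Coli"))
    print(model_path("E.Coli", "tcn"))
```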

Notebooks

Scripts

Data

Contains the cleaned sequence data for three organisms (E. coli, D. melanogaster, H. sapiens). The data is split into training, testing, and validation sets and then saved as a pickle file. Each split is additionally saved in a shuffled version.

The data folder also contains various files that track the training progress and performance of the models.
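
Reading one of the pickle files could look like the sketch below. The internal structure of the stored object is not documented here, so the function simply returns whatever was pickled. The name `load_cleaned_data` and the default `root` are assumptions for illustration.

```python
import pickle
from pathlib import Path
from typing import Any


def load_cleaned_data(organism: str, root: Path = Path("data")) -> Any:
    """Unpickle <root>/<organism>/cleanedData.pkl and return the stored object."""
    path = Path(root) / organism / "cleanedData.pkl"
    if not path.is_file():
        raise FileNotFoundError(f"No cleaned data found at {path}")
    # Only unpickle files from a trusted source: pickle can run code on load
    with path.open("rb") as handle:
        return pickle.load(handle)
```

If the pickle holds pandas objects, pandas must be installed in the active environment for loading to succeed; `pandas.read_pickle` is an alternative in that case.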
