The Large Language Model (LLM) part of Talk2PowerSystem (Talk2PowerSystem_LLM) is a core component of the Talk2PowerSystem project. It provides the code and scripts needed to integrate and operate an LLM, covering data preprocessing, model training, inference, and integration with the other parts of the Talk2PowerSystem ecosystem. The main components are:
- Data Preprocessing: Scripts to clean, normalize, and format data for LLM training (a minimal sketch follows this list).
- Model Training: Pipelines and utilities for fine-tuning and training LLM models.
- Inference Engine: Code for running real-time queries and generating model predictions.
- System Integration: Tools and interfaces to connect the LLM with other components of the Talk2PowerSystem project.
- Testing and Evaluation: Automated tests and performance evaluation scripts to ensure model reliability and accuracy.
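
The actual preprocessing scripts live under `data/` and `src/`. Purely as an illustration of the kind of cleaning and normalization such a step typically performs, a minimal self-contained sketch is shown below; the function name and the specific rules are assumptions for illustration, not the project's actual code:

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Illustrative cleaning/normalization step (not the project's actual code):
    normalize unicode, drop non-printable characters, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw)  # unify unicode forms (e.g. NBSP -> space)
    text = "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")  # keep printables plus newline/tab
    text = re.sub(r"[ \t]+", " ", text)  # collapse runs of spaces and tabs
    return text.strip()


if __name__ == "__main__":
    sample = "  The   transformer\u00a0bay  is \tde-energized.\n"
    print(clean_text(sample))  # -> "The transformer bay is de-energized."
```

Unicode normalization is applied first so that visually identical characters (for example, non-breaking spaces) collapse to a single canonical form before whitespace is normalized.
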
The repository is organized as follows:
- `src/` - Main source code, including the training, inference, and integration scripts.
- `data/` - Data sets and preprocessing scripts.
- `docs/` - Documentation, guides, and technical notes.
- `tests/` - Unit and integration tests for the various modules.
- `config/` - Configuration files for model parameters and environment settings.
- `evaluation_results/` - Evaluation results of the system.
You should install [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html); miniconda will suffice.
To set up the project locally, follow these steps:
- Clone the repository:

  ```bash
  git clone https://github.com/statnett/Talk2PowerSystem_LLM.git
  ```

- Create a conda environment, activate it, and install the dependencies:

  ```bash
  conda create --name Talk2PowerSystemLLM --file conda-linux-64.lock
  conda activate Talk2PowerSystemLLM
  poetry install
  ```

- To run the unit tests, install the test dependencies and invoke pytest with coverage:

  ```bash
  poetry install --with test
  poetry run pytest --cov=talk2powersystemllm --cov-report=term-missing tests/unit_tests/
  ```