# Intract Code API

[MIT License](https://opensource.org/licenses/MIT) · [Python 3](https://www.python.org/downloads/)

An API designed for code completion and fine-tuning of open-source large language models on internal codebases and documents.
## ✨ **Key Features**

- 🚀 **Code Completion API**: Seamlessly integrate advanced code suggestions into your development workflow.
- ⚙️ **Custom Fine-tuning**: Adapt models to your company's codebase and internal knowledge, including documents and PDFs.
- 📈 **Fine-tuning Techniques**: Supports standard, LoRA, and QLoRA fine-tuning.
- 👥 **Multi-user Support**: Serve multiple users with different models on a shared server.
- 🧠 **Retrieval-Augmented Generation (RAG)**: Experimental support for context-aware generation.
---

## 🚀 **Quick Start**

Get started with just a few commands:

1. **Build the Docker image:**
   ```bash
   docker build -t docker_agent .
   ```
2. **Start the Docker container:**
   ```bash
   docker run -p 8000:8000 -it --rm --name docker_agent docker_agent
   ```

   - Binds port `8000` on the host to port `8000` in the container.
   - Removes the container when it stops (`--rm`).
   - Uses the `deepseek-ai/deepseek-coder-1.3b-base` model by default.

3. **Access the API:**
   Once the container is running, open `http://localhost:8000/docs` in a browser. This loads the Swagger UI, where you can explore and interact with the available API endpoints.
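As a quick smoke test, a request can also be sketched from Python. The endpoint path (`/complete`) and payload fields below are illustrative assumptions, not the documented API; check the Swagger UI for the actual routes and schemas:

```python
import json
import urllib.request

# Hypothetical completion endpoint -- the real path and payload schema
# are listed in the Swagger UI at http://localhost:8000/docs.
url = "http://localhost:8000/complete"
payload = {"prompt": "def fibonacci(n):", "max_tokens": 128}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the container running, send it with:
#   body = urllib.request.urlopen(req).read()
print(req.full_url, req.get_method())
```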
---

## 🔥 **Enable GPU Acceleration**

Unlock GPU acceleration by adding the `--gpus all` flag:

```bash
docker run -p 8000:8000 --gpus all -it --rm --name docker_agent docker_agent
```
---

## 🔧 **Configuration and Parameter Customization**

The model's behavior and training parameters can be customized in `src/conf/config.yaml`. Key options include:

### Model Configuration
- `model_name`: Model to use (default: `deepseek-ai/deepseek-coder-1.3b-base`)
- `context_length`: Context length for the model (default: 512)
- `device`: Device to run the model on (default: `cpu`)
- `use_flash_attention`: Enable or disable flash attention (default: `False`)
### Fine-tuning Method Selection
Switch between fine-tuning methods with the following parameters:

#### Standard Fine-tuning
Set `model_type: standard` in the configuration.

#### LoRA (Low-Rank Adaptation)
Set `model_type: lora` and adjust these parameters:
- `lora_r`: Rank of the LoRA update matrices (default: 64)
- `lora_alpha`: LoRA scaling factor (default: 16)
- `lora_dropout`: Dropout probability for LoRA layers (default: 0.01)
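To make these three knobs concrete, here is a minimal pure-Python sketch of the update they control: the frozen weight `W` is augmented by `(lora_alpha / lora_r) * B @ A`, where only the small matrices `A` and `B` are trained. Toy sizes are used for readability; no ML framework is involved:

```python
# LoRA update rule: W_eff = W + (lora_alpha / lora_r) * (B @ A),
# with A (r x d_in) and B (d_out x r) as the trainable low-rank pair.
lora_r, lora_alpha = 1, 2
scaling = lora_alpha / lora_r  # the lora_alpha scaling factor

def matmul(X, Y):
    """Naive matrix multiply for small lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2 x 2)
A = [[0.1, 0.2]]              # trainable, r x d_in  (1 x 2)
B = [[1.0], [0.5]]            # trainable, d_out x r (2 x 1)

delta = matmul(B, A)          # rank-lora_r update (2 x 2)
W_eff = [[w + scaling * d for w, d in zip(w_row, d_row)]
         for w_row, d_row in zip(W, delta)]
print(W_eff)                  # W plus the scaled low-rank update
```

Because `A` and `B` together hold far fewer parameters than `W`, training them is much cheaper than standard fine-tuning while leaving the base weights untouched.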
#### QLoRA (Quantized LoRA)
Set `model_type: qlora` and adjust these parameters:
- `bits`: Quantization bits (default: 4)
- `double_quant`: Enable double quantization (default: `True`)
- `quant_type`: Quantization data type (default: `nf4`)
- `optim`: Optimizer for QLoRA (default: `paged_adamw_32bit`)
- `gradient_checkpointing`: Enable gradient checkpointing (default: `True`)
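For intuition about the `bits` option, the toy sketch below quantizes a weight vector to 16 integer levels and back. Real QLoRA uses the block-wise NF4 data type, not this uniform absmax scheme; this only illustrates the round-trip error that low-bit storage introduces:

```python
# Toy uniform absmax quantization to 4 signed bits (levels -7..7).
# Illustrative only -- NF4 in actual QLoRA is non-uniform and block-wise.
def quantize_4bit(ws):
    absmax = max(abs(w) for w in ws) or 1.0
    scale = absmax / 7                 # map the largest weight to +/-7
    q = [round(w / scale) for w in ws]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.42, -1.4, 0.07, 0.9]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
print(q)         # small integers, storable in 4 bits each
print(restored)  # approximate originals; error is at most scale / 2
```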
### Training Configuration
- `max_gen_length`: Maximum length of generated code (default: 128)
- `max_revision_steps`: Maximum number of code revision steps (default: 2)
- `use_ntp` and `use_fim`: Enable or disable specific training techniques (next-token prediction and fill-in-the-middle)
- `train_on_code`, `train_on_docs`, etc.: Configure which data sources to train on

For a complete list of configurable parameters, see `src/conf/config.yaml` in the project repository.
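Putting the options above together, a LoRA run on GPU might be configured like this. This is a sketch only: the key names follow the lists above, but the actual layout of `src/conf/config.yaml` may differ:

```yaml
# Hypothetical excerpt of src/conf/config.yaml
model_name: deepseek-ai/deepseek-coder-1.3b-base
context_length: 512
device: cuda            # assumes a CUDA-capable GPU; default is cpu
model_type: lora        # standard | lora | qlora
lora_r: 64
lora_alpha: 16
lora_dropout: 0.01
```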
---

## 📄 **Documentation**

Explore the full API documentation at `http://localhost:8000/docs` after starting the server.

---

## 🤝 **Contributing**

We welcome contributions! Please check out our [Contributing Guide](CONTRIBUTING.md) for details.

---

## 📝 **License**

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more information.