Commit afe9ca2

Merge remote-tracking branch 'origin/main'

2 parents: dbe5e19 + 871f7b3

File tree: 5 files changed, +1616 −15 lines


README.md

Lines changed: 36 additions & 15 deletions

````diff
@@ -1,52 +1,73 @@
 # gptify
 
-`gptify` is a command-line tool that transforms a Git repository into a single text file suitable for use with Large Language Models (LLMs) like ChatGPT. It preserves the file structure and content, enabling LLMs to understand and process the codebase for tasks such as code review, documentation generation, and answering questions about the code. This project is a fork of [gptrepo](https://github.com/zackess/gptrepo) with added features specifically designed for the [miniogre devtool](https://github.com/ogre-run/miniogre).
+`gptify` is a command-line tool that transforms a Git repository into a single text file or multiple text chunks suitable for use with Large Language Models (LLMs) like ChatGPT. It preserves the file structure and content, enabling LLMs to understand and process the codebase for tasks such as code review, documentation generation, and answering questions about the code. This project is a fork of [gptrepo](https://github.com/zackess/gptrepo) with added features.
 
 ## Relevance
 
-This tool addresses the challenge of effectively using LLMs with codebases. By converting a repository into a digestible format, `gptify` allows developers to leverage the power of LLMs for various development tasks. Within the miniogre project, it plays a crucial role in facilitating AI-driven code understanding and interaction.
+This tool addresses the challenge of effectively using LLMs with codebases. By converting a repository into a digestible format, `gptify` allows developers to leverage the power of LLMs for various development tasks. It simplifies the process of feeding code context into LLMs, avoiding size limitations and formatting issues.
 
 ## Installation
 
-The easiest way
-`pip install gptify`.
+The easiest way to install `gptify` is using `pip`:
 
-`gptify` can also be installed using `pipx`:
+```bash
+pip install gptify
+```
+
+Alternatively, you can install it using `pipx`:
 
 ```bash
 poetry build && pipx install dist/*.whl
 ```
-You can also uninstall older versions using the provided install script: `./install.sh`.
+
+You can also uninstall older versions using the provided install script:
+
+```bash
+./install.sh
+```
 
 ## Usage
 
-After installation, navigate to the root directory of your Git repository and run:
+1. **Navigate to the root directory** of your Git repository.
+2. **Run the `gptify` command**:
 
 ```bash
 gptify
 ```
 
-This command will generate a file named `gptify_output.txt` in the current directory containing the formatted repository content. You can then copy and paste the contents of this file into a ChatGPT session to interact with your codebase.
+This will generate a file named `gptify_output.txt` in the current directory containing the formatted repository content. You can then copy and paste the contents of this file into a ChatGPT session.
+
 
 ### Options
 
 * `--output <filename>`: Specifies the name of the output file (default: `gptify_output.txt`).
-* `--clipboard`: Copies the output directly to the clipboard, omitting the output file creation.
+* `--clipboard`: Copies the output directly to the clipboard instead of creating an output file.
 * `--openfile`: Opens the output file after creation using the default system application.
-* `--preamble <filepath>`: Prepends a custom preamble to the output file.
+* `--preamble <filepath>`: Prepends a custom preamble to the output file. This is useful for providing instructions or context to the LLM.
+* `--chunk`: Enables chunking of the output into smaller files, useful for handling large repositories that exceed LLM context limits. Used with `--max_tokens` and `--overlap`.
+* `--max_tokens`: Sets the maximum number of tokens per chunk when using the `--chunk` option (default: 900000). Requires the `tiktoken` library.
+* `--overlap`: Sets the number of overlapping tokens between chunks when using the `--chunk` option (default: 400). Helps maintain context across chunks. Requires the `tiktoken` library.
+* `--output_dir`: Specifies the output directory for chunks when using `--chunk` (default: `gptify_output_chunks`).
+
 
-## Example with custom output file:
+## Example with custom output file and preamble:
 
 ```bash
-gptify --output my_repo.txt
+gptify --output my_repo.txt --preamble instructions.txt
 ```
 
-This will generate `my_repo.txt` with the processed repository data.
+This command will generate `my_repo.txt` with the processed repository data, prepended with the content of `instructions.txt`.
 
-## Contributing
+## Example with chunking:
 
-While contributions are welcome, the focus of this fork is on specific features for miniogre, and responses to pull requests might be delayed.
+```bash
+gptify --chunk --max_tokens 4000 --overlap 200
+```
+This will create multiple files in the `gptify_output_chunks` directory, each containing a chunk of the repository data, with a maximum of 4000 tokens and an overlap of 200 tokens.
+
+## Contributing
 
+Contributions are welcome.
 
 ## License
 
````
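The single-file output the README describes (file structure plus contents, flattened into one text blob) can be sketched as a walk over the repository tree. This is a minimal illustration, not gptify's actual implementation; the `----` separator and the skip rules are assumptions for the sketch:

```python
from pathlib import Path

def flatten_repo(root: str) -> str:
    """Concatenate every readable text file under `root` into one string,
    prefixing each file's content with its path relative to the repo root."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or ".git" in path.parts:
            continue  # skip directories and Git metadata
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary files
        parts.append(f"----\n{path.relative_to(root)}\n{text}\n")
    return "".join(parts)
```

The result is a single string that can be written to `gptify_output.txt` or handed to a chunker when it exceeds the model's context window.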

ogre_dir/Dockerfile

Lines changed: 9 additions & 0 deletions

```dockerfile
FROM ogrerun/base:ubuntu22.04-x86_64
ENV TZ=Etc/UTC
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
WORKDIR /opt/gptify
COPY . .
RUN cp ./ogre_dir/bashrc /etc/bash.bashrc
RUN chmod a+rwx /etc/bash.bashrc
RUN pip install uv pip-licenses cyclonedx-bom
RUN cat ./ogre_dir/requirements.txt | xargs -L 1 uv pip install --system; exit 0
```
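The final `RUN` line installs requirements one line at a time: `xargs -L 1` issues a separate `uv pip install` per requirement, so a single unresolvable pin does not abort the remaining installs, and the trailing `; exit 0` forces a zero exit status so the image build continues regardless. A safe stand-in sketch (the demo file contents are illustrative, and `echo` replaces `uv pip install --system`):

```shell
# Write a demo requirements file (contents are illustrative).
printf 'requests==2.32.3\ntiktoken==0.8.0\n' > /tmp/demo_requirements.txt

# One command invocation per line, mirroring the Dockerfile's
# `cat ... | xargs -L 1 uv pip install --system; exit 0` pattern.
# `|| true` plays the role of `; exit 0` without terminating this shell.
cat /tmp/demo_requirements.txt | xargs -L 1 echo INSTALL || true
```

The trade-off of `; exit 0` is that genuine installation failures are silently swallowed; the resulting image may be missing packages without the build ever reporting an error.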

ogre_dir/README.md

Lines changed: 74 additions & 0 deletions

````markdown
# gptify

`gptify` is a command-line tool that transforms a Git repository into a single text file or multiple text chunks suitable for use with Large Language Models (LLMs) like ChatGPT. It preserves the file structure and content, enabling LLMs to understand and process the codebase for tasks such as code review, documentation generation, and answering questions about the code. This project is a fork of [gptrepo](https://github.com/zackess/gptrepo) with added features.

## Relevance

This tool addresses the challenge of effectively using LLMs with codebases. By converting a repository into a digestible format, `gptify` allows developers to leverage the power of LLMs for various development tasks. It simplifies the process of feeding code context into LLMs, avoiding size limitations and formatting issues.

## Installation

The easiest way to install `gptify` is using `pip`:

```bash
pip install gptify
```

Alternatively, you can install it using `pipx`:

```bash
poetry build && pipx install dist/*.whl
```

You can also uninstall older versions using the provided install script:

```bash
./install.sh
```

## Usage

1. **Navigate to the root directory** of your Git repository.
2. **Run the `gptify` command**:

```bash
gptify
```

This will generate a file named `gptify_output.txt` in the current directory containing the formatted repository content. You can then copy and paste the contents of this file into a ChatGPT session.

### Options

* `--output <filename>`: Specifies the name of the output file (default: `gptify_output.txt`).
* `--clipboard`: Copies the output directly to the clipboard instead of creating an output file.
* `--openfile`: Opens the output file after creation using the default system application.
* `--preamble <filepath>`: Prepends a custom preamble to the output file. This is useful for providing instructions or context to the LLM.
* `--chunk`: Enables chunking of the output into smaller files, useful for handling large repositories that exceed LLM context limits. Used with `--max_tokens` and `--overlap`.
* `--max_tokens`: Sets the maximum number of tokens per chunk when using the `--chunk` option (default: 900000). Requires the `tiktoken` library.
* `--overlap`: Sets the number of overlapping tokens between chunks when using the `--chunk` option (default: 400). Helps maintain context across chunks. Requires the `tiktoken` library.
* `--output_dir`: Specifies the output directory for chunks when using `--chunk` (default: `gptify_output_chunks`).

## Example with custom output file and preamble:

```bash
gptify --output my_repo.txt --preamble instructions.txt
```

This command will generate `my_repo.txt` with the processed repository data, prepended with the content of `instructions.txt`.

## Example with chunking:

```bash
gptify --chunk --max_tokens 4000 --overlap 200
```
This will create multiple files in the `gptify_output_chunks` directory, each containing a chunk of the repository data, with a maximum of 4000 tokens and an overlap of 200 tokens.

## Contributing

Contributions are welcome.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
````
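The `--chunk`/`--max_tokens`/`--overlap` behavior described in this README amounts to a sliding window over a token sequence. The sketch below is an illustration, not gptify's implementation: it operates on an already-tokenized list so it runs without dependencies, whereas gptify itself relies on `tiktoken` for tokenization, per the options above:

```python
def chunk_tokens(tokens: list, max_tokens: int = 900000, overlap: int = 400) -> list:
    """Split `tokens` into windows of at most `max_tokens`, where consecutive
    windows share `overlap` tokens so context carries across boundaries."""
    if max_tokens <= overlap:
        raise ValueError("max_tokens must exceed overlap")
    if len(tokens) <= max_tokens:
        return [tokens]  # fits in one chunk, no splitting needed
    step = max_tokens - overlap  # each window starts `step` tokens after the last
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # last window already reaches the end of the sequence
    return chunks
```

With `max_tokens=4000, overlap=200` as in the chunking example, each chunk repeats the final 200 tokens of its predecessor, which is what lets an LLM keep continuity when chunks are fed to it one at a time.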

ogre_dir/requirements.txt

Lines changed: 10 additions & 0 deletions

```
pyperclip==1.9.0
certifi==2024.8.30
charset-normalizer==3.4.0
idna==3.10
regex==2024.11.6
requests==2.32.3
tiktoken==0.8.0
urllib3==2.2.3
gptify==0.3.6
pyperclip==1.9.0
```

ogre_dir/sbom.json

Lines changed: 1487 additions & 0 deletions
Large diffs are not rendered by default.
