Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
adc642b
chasing down lints and mypy things
yaleman Nov 28, 2023
bee83d8
updating run docker script to allow you to specify an image ID
yaleman Nov 28, 2023
aca7dba
updating script
yaleman Nov 28, 2023
dc79a94
Fixing show_settings
yaleman Nov 28, 2023
13d868a
keeping nltk cache dir
yaleman Nov 28, 2023
84917d2
fixing run script
yaleman Nov 28, 2023
3a39733
fixing run script
yaleman Nov 28, 2023
170993a
fixing run script again
yaleman Nov 28, 2023
039d53f
one day I will learn to docker
yaleman Nov 28, 2023
e0a7ef2
Update server.conf
davisshannon Nov 29, 2023
7c3ea8b
keep on keeping on
yaleman Nov 29, 2023
a55905e
I think this works now
yaleman Nov 29, 2023
928df76
making mypy happier
yaleman Nov 29, 2023
24b27f0
updating docker script
yaleman Nov 29, 2023
8eacd53
more logs more tests more everything, less broken
yaleman Nov 29, 2023
65016ec
renaming placeholders
yaleman Nov 29, 2023
d46507d
docs updates, more config handling
yaleman Nov 30, 2023
b19de67
more errors more handling
yaleman Nov 30, 2023
9f6b260
exposing the tail of the openai key, adding some tests
yaleman Nov 30, 2023
7a29061
docs tweaks
yaleman Dec 1, 2023
69c92db
docs tweaks
yaleman Dec 1, 2023
49ca770
docs tweaks
yaleman Dec 1, 2023
79336b1
logs cleanup, input handling with pydantic
yaleman Dec 1, 2023
975351a
canary testing livetest script
yaleman Dec 1, 2023
4bf77f6
init embedded first because it's faster to fail
yaleman Dec 1, 2023
b1ca508
... all the changes
yaleman Dec 4, 2023
33fbc7b
yak shaving
yaleman Dec 4, 2023
2bb67c7
more yak shaving and testing
yaleman Dec 5, 2023
3ca8ed7
renaming vigil{-,_}server.py
yaleman Dec 5, 2023
e5936c8
missed a lint
yaleman Dec 5, 2023
e9b9df2
mypy shaving
yaleman Dec 5, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
.github
.git
.venv
35 changes: 35 additions & 0 deletions .github/workflows/build_container.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
---
name: 'Build container'
"on":
pull_request:
push:
branches:
- main
permissions:
packages: write
contents: read
jobs:
docker:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push
id: docker_build
uses: docker/build-push-action@v5
with:
push: ${{ github.ref == 'refs/heads/main' }}
platforms: linux/amd64,linux/arm64
tags: ghcr.io/${{ github.repository }}:latest
- name: Image digest
run: echo ${{ steps.docker_build.outputs.digest }}
27 changes: 27 additions & 0 deletions .github/workflows/pylint.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
---
name: Python linting

"on":
push:
branches:
- main # Set a branch to deploy
pull_request:

jobs:
mypy:
runs-on: ubuntu-latest
steps:
- uses: actions/[email protected]
with:
fetch-depth: 0 # Fetch all history for .GitInfo and .Lastmod
- name: Set up Python 3.10
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Running ruff
run: |
pip install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt -r requirements-dev.txt
pip install .
ruff tests vigil *.py
mypy tests vigil *.py
15 changes: 15 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ dist/
downloads/
eggs/
.eggs/
.ruff_cache/
lib/
lib64/
parts/
Expand Down Expand Up @@ -158,3 +159,17 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# nltk models
data/nltk/*
data/torch-cache/*
data/huggingface/*
data/vdb/*

#config files
.dockerenv
conf/*.conf

# macOS
.DS_Store
.envrc
16 changes: 11 additions & 5 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
FROM python:3.10-slim
FROM python:3.10-slim as builder
# this is broken up into two stages because when you're rebuilding it you don't want to have to rebuild the whole thing

# Set the working directory in the container
WORKDIR /app
Expand Down Expand Up @@ -33,17 +34,22 @@ RUN echo "Installing YARA from source ..." \
&& make install \
&& make check

RUN echo "Installing pytorch deps" && \
pip install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

FROM builder as vigil
# Copy vigil into the container
COPY . .

# Install Python dependencies including PyTorch CPU
RUN echo "Installing Python dependencies ... " \
&& pip install --no-cache-dir -r requirements.txt \
&& pip install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

&& pip install --no-cache-dir -r requirements.txt -r requirements-dev.txt \
&& pip install -e .
# Expose port 5000 for the API server
EXPOSE 5000

ENV VIGIL_CONFIG=/app/conf/docker.conf

COPY scripts/entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh", "python", "vigil-server.py", "-c", "conf/server.conf"]
ENTRYPOINT ["/entrypoint.sh", "python", "vigil_server.py"]
76 changes: 48 additions & 28 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
![logo](docs/assets/logo.png)
# ![logo](docs/assets/logo.png)

## Overview 🏕️

⚡ Security scanner for LLM prompts ⚡

`Vigil` is a Python library and REST API for assessing Large Language Model prompts and responses against a set of scanners to detect prompt injections, jailbreaks, and other potential threats. This repository also provides the detection signatures and datasets needed to get started with self-hosting.

This application is currently in an **alpha** state and should be considered experimental / for research purposes.
This application is currently in an **alpha** state and should be considered experimental / for research purposes.

* **[Full documentation](https://vigil.deadbits.ai)**
* **[Release Blog](https://vigil.deadbits.ai/overview/background)**
Expand All @@ -17,15 +18,15 @@ This application is currently in an **alpha** state and should be considered exp
* Scanners are modular and easily extensible
* Evaluate detections and pipelines with **Vigil-Eval** (coming soon)
* Available scan modules
* [x] Vector database / text similarity
* [Auto-updating on detected prompts](https://vigil.deadbits.ai/overview/use-vigil/auto-updating-vector-database)
* [x] Heuristics via [YARA](https://virustotal.github.io/yara)
* [x] Transformer model
* [x] Prompt-response similarity
* [x] Canary Tokens
* [x] Sentiment analysis
* [ ] Relevance (via [LiteLLM](https://docs.litellm.ai/docs/))
* [ ] Paraphrasing
* [x] Vector database / text similarity
* [Auto-updating on detected prompts](https://vigil.deadbits.ai/overview/use-vigil/auto-updating-vector-database)
* [x] Heuristics via [YARA](https://virustotal.github.io/yara)
* [x] Transformer model
* [x] Prompt-response similarity
* [x] Canary Tokens
* [x] Sentiment analysis
* [ ] Relevance (via [LiteLLM](https://docs.litellm.ai/docs/))
* [ ] Paraphrasing
* Supports [local embeddings](https://www.sbert.net/) and/or [OpenAI](https://platform.openai.com/)
* Signatures and embeddings for common attacks
* Custom detections via YARA signatures
Expand All @@ -34,16 +35,17 @@ This application is currently in an **alpha** state and should be considered exp
## Background 🏗️

> Prompt Injection Vulnerability occurs when an attacker manipulates a large language model (LLM) through crafted inputs, causing the LLM to unknowingly execute the attacker's intentions. This can be done directly by "jailbreaking" the system prompt or indirectly through manipulated external inputs, potentially leading to data exfiltration, social engineering, and other issues.
- [LLM01 - OWASP Top 10 for LLM Applications v1.0.1 | OWASP.org](https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-2023-v1_0_1.pdf)

These issues are caused by the nature of LLMs themselves, which do not currently separate instructions and data. Although prompt injection attacks are currently unsolvable and there is no defense that will work 100% of the time, by using a layered approach of detecting known techniques you can at least defend against the more common / documented attacks.
[LLM01 - OWASP Top 10 for LLM Applications v1.0.1 | OWASP.org](https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-2023-v1_0_1.pdf)

These issues are caused by the nature of LLMs themselves, which do not currently separate instructions and data. Although prompt injection attacks are currently unsolvable and there is no defense that will work 100% of the time, by using a layered approach of detecting known techniques you can at least defend against the more common / documented attacks.

`Vigil`, or a system like it, should not be your only defense - always implement proper security controls and mitigations.

> [!NOTE]
> Keep in mind, LLMs are not yet widely adopted and integrated with other applications, therefore threat actors have less motivation to find new or novel attack vectors. Stay informed on current attacks and adjust your defenses accordingly!

**Additional Resources**
### Additional Resources

For more information on prompt injection, I recommend the following resources and following the research being performed by people like [Kai Greshake](https://kai-greshake.de/), [Simon Willison](https://simonwillison.net/search/?q=prompt+injection&tag=promptinjection), and others.

Expand All @@ -58,31 +60,38 @@ Follow the steps below to install Vigil
A [Docker container](docs/docker.md) is also available, but this is not currently recommended.

### Clone Repository

Clone the repository or [grab the latest release](https://github.com/deadbits/vigil-llm/releases)
```

```shell
git clone https://github.com/deadbits/vigil-llm.git
cd vigil-llm
```

### Install YARA

Follow the instructions on the [YARA Getting Started documentation](https://yara.readthedocs.io/en/stable/gettingstarted.html) to download and install [YARA v4.3.2](https://github.com/VirusTotal/yara/releases).

### Setup Virtual Environment
```

```shell
python3 -m venv venv
source venv/bin/activate
```

### Install Vigil library

Inside your virutal environment, install the application:
```

```shell
pip install -e .
```

### Configure Vigil

Open the `conf/server.conf` file in your favorite text editor:

```bash
```shell
vim conf/server.conf
```

Expand All @@ -92,11 +101,12 @@ For more information on modifying the `server.conf` file, please review the [Con
> Your VectorDB scanner embedding model setting must match the model used to generate the embeddings loaded into the database, or similarity search will not work.

### Load Datasets

Load the appropriate [datasets](https://vigil.deadbits.ai/overview/use-vigil/load-datasets) for your embedding model with the `loader.py` utility. If you don't intend on using the vector db scanner, you can skip this step.

```bash
python loader.py --conf conf/server.conf --dataset deadbits/vigil-instruction-bypass-ada-002
python loader.py --conf conf/server.conf --dataset deadbits/vigil-jailbreak-ada-002
python loader.py --config conf/server.conf --dataset deadbits/vigil-instruction-bypass-ada-002
python loader.py --config conf/server.conf --dataset deadbits/vigil-jailbreak-ada-002
```

You can load your own datasets as long as you use the columns:
Expand All @@ -116,7 +126,7 @@ Vigil can run as a REST API server or be imported directly into your Python appl
To start the Vigil API server, run the following command:

```bash
python vigil-server.py --conf conf/server.conf
python vigil_server.py --conf conf/server.conf
```

* [API Documentation](https://github.com/deadbits/vigil-llm#api-endpoints-)
Expand Down Expand Up @@ -155,9 +165,11 @@ result = app.canary_tokens.check(prompt=llm_response)
```

## Detection Methods 🔍

Submitted prompts are analyzed by the configured `scanners`; each of which can contribute to the final detection.

**Available scanners:**
### Available scanners

* Vector database
* YARA / heuristics
* Transformer model
Expand All @@ -167,33 +179,37 @@ Submitted prompts are analyzed by the configured `scanners`; each of which can c
For more information on how each works, refer to the [detections documentation](docs/detections.md).

### Canary Tokens

Canary tokens are available through a dedicated class / API.

You can use these in two different detection workflows:

* Prompt leakage
* Goal hijacking

Refer to the [docs/canarytokens.md](docs/canarytokens.md) file for more information.

## API Endpoints 🌐

**POST /analyze/prompt**
### POST /analyze/prompt

Post text data to this endpoint for analysis.

**arguments:**

* **prompt**: str: text prompt to analyze

```bash
curl -X POST -H "Content-Type: application/json" \
-d '{"prompt":"Your prompt here"}' http://localhost:5000/analyze
```

**POST /analyze/response**
### POST /analyze/response

Post text data to this endpoint for analysis.

**arguments:**

* **prompt**: str: text prompt to analyze
* **response**: str: prompt response to analyze

Expand All @@ -202,11 +218,12 @@ curl -X POST -H "Content-Type: application/json" \
-d '{"prompt":"Your prompt here", "response": "foo"}' http://localhost:5000/analyze
```

**POST /canary/add**
### POST /canary/add

Add a canary token to a prompt

**arguments:**

* **prompt**: str: prompt to add canary to
* **always**: bool: add prefix to always include canary in LLM response (optional)
* **length**: str: canary token length (optional, default 16)
Expand All @@ -221,11 +238,12 @@ curl -X POST "http://127.0.0.1:5000/canary/add" \
}'
```

**POST /canary/check**
### POST /canary/check

Check if an output contains a canary token

**arguments:**

* **prompt**: str: prompt to check for canary

```bash
Expand All @@ -236,12 +254,13 @@ curl -X POST "http://127.0.0.1:5000/canary/check" \
}'
```

**POST /add/texts**
### POST /add/texts

Add new texts to the vector database and return doc IDs
Text will be embedded at index time.

**arguments:**

* **texts**: str: list of texts
* **metadatas**: str: list of metadatas

Expand All @@ -257,7 +276,7 @@ curl -X POST "http://127.0.0.1:5000/add/texts" \
}'
```

**GET /settings**
### GET /settings

View current application settings

Expand All @@ -268,6 +287,7 @@ curl http://localhost:5000/settings
## Sample scan output 📌

**Example scan output:**

```json
{
"status": "success",
Expand Down
3 changes: 2 additions & 1 deletion conf/docker.conf
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,13 @@ cache_max = 500

[embedding]
model = openai
openai_key = sk-5XXXXX
openai_key =

[vectordb]
collection = data-openai
db_dir = /app/data/vdb
n_results = 5
model = openai

[auto_update]
enabled = true
Expand Down
4 changes: 2 additions & 2 deletions conf/server.conf
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,11 @@ cache_max = 500

[embedding]
model = openai
openai_key = sk-XXXXX
openai_key =

[vectordb]
collection = data-openai
db_dir = /home/vigil/vigil-llm/data/vdb
db_dir = /tmp/vigil-llm/data/vdb
n_results = 5

[auto_update]
Expand Down
Empty file added data/nltk/.placeholder
Empty file.
Empty file added data/torch-cache/.placeholder
Empty file.
Loading