Efficient Analysis for Detecting Affected Range of Versions for Vulnerabilities
Early Warning System for Vulnerabilities
Verdiff is a framework for analyzing vulnerabilities across program versions. It generates vulnerability signatures from program execution traces (via dynamic analysis) and checks for their presence in different versions of a program’s source code.
The project is containerized with Docker but can also be built and run locally.
Run Verdiff in three simple steps using Docker:
```bash
docker pull sakibanwar/verdiff
docker run -it sakibanwar/verdiff
/root/claims/run.sh
```
This will automatically run the analysis for every CVE in the artifact.
```
.
├── Dockerfile
├── install.sh
├── artifact/
│   ├── CWEs/
│   │   ├── CWE-119/
│   │   │   └── CVE-2017-14261_bento4/
│   │   │       ├── info.json
│   │   │       ├── *poc*
│   │   │       ├── run.sh
│   │   │       ├── README.md
│   │   │       └── run_verdiff.sh
│   │   ├── CWE-125/
│   │   ├── CWE-190/
│   │   ├── CWE-476/
│   │   └── CWE-787/
│   ├── data/      # Source code for bento4, jasper, zziplib (various versions)
│   └── src/       # Verdiff source code
├── claims/
│   ├── run.sh     # Runs all CVEs in artifact
│   └── claims1/   # Expected results
```
You have two ways to set up Verdiff.

Option 1: pull the prebuilt image from Docker Hub.

```bash
docker pull sakibanwar/verdiff
```

Option 2: build the image locally.

```bash
# Build the Docker image
./install.sh

# Run container
docker run -it verdiff
```

`install.sh` installs the dependencies and builds the Docker image locally.
The `artifact/` directory contains vulnerability case studies organized by CWE and CVE. Each CVE directory contains the following (an example inspection follows the list):

- `info.json` – Metadata (program name, vulnerable version, source code location, etc.).
- `poc` – Proof-of-concept input that triggers the vulnerability.
- `run.sh` – Script to analyze the vulnerability's behavior (two modes: data flow and non-data-flow).
- `README.md` – Description of the vulnerability, affected versions, and references.
- `run_verdiff.sh` – Script to process the data flow log, generate a vulnerability signature, and match it against all versions listed under the source code location in `info.json`.
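For example, the contents of a single case study can be inspected before anything is run. The path below follows the directory tree above (inside the Docker container the artifact lives under `/root/`):

```bash
# Inspect one case study (CVE-2017-14261 in bento4) before running it
cd /root/artifact/CWEs/CWE-119/CVE-2017-14261_bento4

ls              # info.json, the PoC input, run.sh, README.md, run_verdiff.sh
cat info.json   # program name, vulnerable version, source code location
cat README.md   # vulnerability description, affected versions, references
```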
Verdiff is a general tool; a subset of its dataset is included here for the quick start. The CVEs span three projects and five different CWEs.
CWE | Meaning | CVEs in Repository |
---|---|---|
CWE-119 | Improper Restriction of Operations within the Bounds of a Memory Buffer | CVE-2017-14261 (bento4) |
CWE-125 | Out-of-bounds Read | CVE-2017-5978 (zziplib) |
CWE-190 | Integer Overflow or Wraparound | CVE-2016-10251 (jasper) |
CWE-476 | NULL Pointer Dereference | CVE-2017-14640 (bento4) |
CWE-787 | Out-of-bounds Write | CVE-2017-14644 (bento4) |
- Data Flow Mode (`run.sh`)
  - Compiles the vulnerable version (taken from `info.json` if a version is not specified).
  - Executes the PoC under Valgrind + Taintgrind.
  - Captures execution logs (the data flow of the vulnerability).
  - MemCheck output is stored in `program_mem.log` and Taintgrind output in `program.log` under the CVE directory.

- Non Data Flow Mode (`run.sh`)
  - Compiles the program with sanitizers.
  - Runs the program with the PoC.
  - Stores the observed behavior in `result_version`.
  - Useful for establishing ground truth.

- Signature Generation and Matching (`run_verdiff.sh`)
  - Processes the recorded data flow logs.
  - Generates a signature of the vulnerability.
  - Analyzes the source code of all provided versions for the presence of the vulnerability.
  - Outputs a CSV under the CVE directory marking each version as vulnerable or non-vulnerable (see the example below).
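A minimal end-to-end pass over one CVE might look like the following sketch. The boolean mode flag for `run.sh` is an assumption inferred from the batch example later in this document (`false` selects non-data-flow mode there), so check the CVE's `README.md` for the authoritative invocation.

```bash
cd /root/artifact/CWEs/CWE-119/CVE-2017-14261_bento4

# Data flow mode (assumed flag): compile the vulnerable version, run the PoC
# under Valgrind + Taintgrind, and record the data-flow logs.
./run.sh true
ls program.log program_mem.log   # Taintgrind and MemCheck output

# Non data flow mode: build with sanitizers and record ground truth for a
# specific version ("version-xxx" is a placeholder).
./run.sh false version-xxx

# Generate the vulnerability signature and match it against every version
# listed in info.json; the verdicts are written to a CSV in this directory.
./run_verdiff.sh
ls *.csv
```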
The `artifact/data/` directory contains source code, in tar format, of:
- bento4
- jasper
- zziplib
The script claims/run.sh automatically extracts the tars into predefined folders for analysis.
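If you want to look at a specific release outside the automated flow, the archives can also be unpacked by hand. The tarball name and target directory below are placeholders:

```bash
ls /root/artifact/data/   # bento4, jasper, and zziplib source archives

# Hypothetical manual extraction of one archive
# (claims/run.sh normally extracts these into predefined folders).
mkdir -p /tmp/bento4-src
tar -xf /root/artifact/data/bento4-<version>.tar -C /tmp/bento4-src
```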
The core implementation of Verdiff is in `artifact/src/`. This is where the analysis logic and signature matching are implemented.
The `claims/` directory contains scripts for running all CVEs in `artifact/`:

- `claims/run.sh` – Runs every CVE experiment.
- `claims/claims1/` – Contains expected results for validation and comparison.
- Select a CVE case study from `artifact/CWEs/.../CVE-*`.
- Run `run.sh` in data flow mode to collect logs of the vulnerable execution.
- Run `run_verdiff.sh` to run Verdiff for that CVE.
- Verdiff scans the different program versions (`artifact/data/`) for the presence of the signature and outputs the final result as a CSV in the CVE directory.
- Match the result against the expected result under `claims/claims1/` (see the sketch after this list).
- Use `claims/run.sh` to automate the experiments across all CVEs and compare the results against `claims1/`.
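A straightforward way to check a single run against the shipped expectations is a plain diff. The file names below are placeholders; use the actual CSV produced in the CVE directory and its counterpart under `claims/claims1/`:

```bash
# Compare a freshly generated result with the expected one (names are placeholders).
diff /root/artifact/CWEs/CWE-119/CVE-2017-14261_bento4/<result>.csv \
     /root/claims/claims1/<expected_result>.csv
```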
```bash
# Step 1: Run the analysis for all CVEs, or pick one from the options the script offers
/root/claims/run.sh

# Step 2: Run non-data-flow mode to establish ground truth for version-xxx of the targeted CVE
/root/artifact/CWE-*/CVE-*/run.sh false version-xxx
```
- Valgrind + Taintgrind are required for data flow tracing (a quick check is sketched below).
- Results are logged in the CVE directory.
- Source code for the programs under test is in `artifact/data/`.
- Expected outputs for batch experiments are in `claims/claims1/`.
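If you are running outside the prebuilt container, you can quickly confirm that the tracing toolchain is in place; Taintgrind is a Valgrind plugin, so it should be selectable as a Valgrind tool:

```bash
valgrind --version                  # Valgrind itself
valgrind --tool=taintgrind --help   # errors out if the Taintgrind plugin is not installed
```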