Verdiff: A framework for detecting vulnerabilities across software versions by generating data-flow signatures from PoCs and matching them against source code.

mdsakibanwar/verdiff

🛡️ **VerDiff**

Efficient Analysis for Detecting Affected Range of Versions for Vulnerabilities
Early Warning System for Vulnerabilities


Verdiff is a framework for analyzing vulnerabilities across program versions. It generates vulnerability signatures from program execution traces (via dynamic analysis) and checks for their presence in different versions of a program’s source code.

The project is containerized with Docker but can also be built and run locally.


🚀 Quickstart

Run Verdiff in three simple steps using Docker:

docker pull sakibanwar/verdiff
docker run -it sakibanwar/verdiff
/root/claims/run.sh

This automatically runs the experiments for all included CVEs.


📂 Repository Structure

.
├── Dockerfile
├── install.sh
├── artifact/
│   ├── CWEs/
│   │   ├── CWE-119/
│   │   │   └── CVE-2017-14261_bento4/
│   │   │       ├── info.json
│   │   │       ├── *poc*
│   │   │       ├── run.sh
│   │   │       ├── README.md
│   │   │       └── run_verdiff.sh
│   │   ├── CWE-125/
│   │   ├── CWE-190/
│   │   ├── CWE-476/
│   │   └── CWE-787/
│   ├── data/        # Source code for bento4, jasper, zziplib (various versions)
│   └── src/         # Verdiff source code
└── claims/
    ├── run.sh       # Runs all CVEs in artifact
    └── claims1/     # Expected results

⚙️ Installation

You have two ways to set up Verdiff:

1. Using Docker (recommended)

docker pull sakibanwar/verdiff

2. Local Docker Image Build

# Build the Docker image
./install.sh

# Run container
docker run -it verdiff

The install.sh script installs dependencies and builds the Docker image locally.


🗂️ Artifacts

The artifact/ directory contains vulnerability case studies organized by CWE and CVE.

Each CVE directory contains:

  • info.json – Metadata (program name, vulnerable version, source code location, etc.)
  • poc – Proof-of-concept input that triggers the vulnerability.
  • run.sh – Script to analyze vulnerability behavior (two modes: data flow and non-data flow).
  • README.md – Description of the vulnerability, affected versions, and references.
  • run_verdiff.sh – Script that processes the data flow log, generates a vulnerability signature, and matches it against every version found in the source code location given in info.json.

🐞 CVEs

VerDiff is a general-purpose tool; a subset of the dataset is included here as a quick start. The included CVEs span 3 projects and 5 CWEs.

| CWE | Meaning | CVEs in Repository |
|---|---|---|
| CWE-119 | Improper Restriction of Operations within the Bounds of a Memory Buffer | CVE-2017-14261 (bento4) |
| CWE-125 | Out-of-bounds Read | CVE-2017-5978 (zziplib) |
| CWE-190 | Integer Overflow or Wraparound | CVE-2016-10251 (jasper) |
| CWE-476 | NULL Pointer Dereference | CVE-2017-14640 (bento4) |
| CWE-787 | Out-of-bounds Write | CVE-2017-14644 (bento4) |

🔍 Modes of run.sh

  • Data Flow Mode

    • Compiles the vulnerable version (taken from info.json if no version is specified).
    • Executes the PoC under Valgrind + Taintgrind.
    • Captures execution logs (the data flow of the vulnerability).
    • Memcheck output is stored in program_mem.log and Taintgrind output in program.log, both in the CVE directory.
  • Non Data Flow Mode

    • Compiles the program with sanitizers.
    • Runs program with PoC.
    • Stores observed behavior in result_version.
    • Useful for establishing ground truth.
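
As a rough illustration of the data flow mode, the tracing step could look like the sketch below. This is not the project's run.sh: the function name `trace_poc` is hypothetical, passing the PoC as a file argument is an assumption, and the taint-source options that Taintgrind needs are deliberately left out. Only the log file names come from the description above.

```shell
#!/bin/sh
# Sketch only, not the project's run.sh.
# Assumptions: the target takes the PoC as a file argument, and Taintgrind's
# trace goes through Valgrind's regular log output. Taint-source options are
# omitted on purpose.
# Set RUN=echo to print the commands instead of executing them.
trace_poc() {
    prog=$1
    poc=$2
    # Memcheck report goes to program_mem.log.
    ${RUN:-} valgrind --tool=memcheck --log-file=program_mem.log "$prog" "$poc"
    # Taintgrind trace goes to program.log.
    ${RUN:-} valgrind --tool=taintgrind --log-file=program.log "$prog" "$poc"
}
```

Calling `RUN=echo trace_poc ./target poc.bin` (both arguments are placeholders) prints the two Valgrind command lines without running them, which is a cheap way to check the invocation before tracing a real build.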

run_verdiff.sh

  • Processes recorded data flow logs.
  • Generates a signature of the vulnerability.
  • Analyzes the source code of every provided version for the presence of the vulnerability.
  • Writes a CSV file to the CVE directory, marking each version as vulnerable or not vulnerable.
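
To pull the affected versions out of the resulting CSV, a small filter is usually enough. The sketch below assumes a hypothetical layout (a header line followed by `version,status` rows, with the status spelled `vulnerable`); the real column names and values may differ, so adjust the `awk` condition to the file Verdiff actually writes.

```shell
#!/bin/sh
# Assumed (hypothetical) CSV layout: one header line, then rows of the form
#   <version>,<status>
# where <status> is exactly "vulnerable" for affected versions.
list_vulnerable() {
    # Skip the header (NR > 1) and print the first column of matching rows.
    awk -F',' 'NR > 1 && $2 == "vulnerable" { print $1 }' "$1"
}
```

For example, `list_vulnerable result.csv` (file name is a placeholder) prints one affected version per line.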

🧩 Data Directory

artifact/data/ contains tar archives of the source code for:

  • bento4
  • jasper
  • zziplib

The script claims/run.sh automatically extracts the tars into predefined folders for analysis.


🛠️ Verdiff Source Code

The core implementation of Verdiff is in:

artifact/src/

This is where the analysis logic and signature matching are implemented.


✅ Claims

The claims/ directory contains scripts for running all CVEs in artifact/:

  • claims/run.sh – Runs every CVE experiment.
  • claims/claims1/ – Contains expected results for validation and comparison.
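
Validation against the expected results amounts to comparing two CSV files. A minimal sketch is shown below; `compare_results` is a hypothetical helper, not part of the repository, and it makes no assumption about how files under claims/claims1/ are named, so both paths are passed explicitly.

```shell
#!/bin/sh
# Compare a produced result CSV with its expected counterpart.
# Rows are sorted first so that row order does not cause a false mismatch.
compare_results() {
    actual=$1
    expected=$2
    sorted_actual=$(mktemp) || return 2
    sorted_expected=$(mktemp) || return 2
    sort "$actual" > "$sorted_actual"
    sort "$expected" > "$sorted_expected"
    # diff prints the differing rows and exits non-zero on any difference.
    if diff "$sorted_expected" "$sorted_actual"; then
        echo "MATCH"
        status=0
    else
        echo "MISMATCH"
        status=1
    fi
    rm -f "$sorted_actual" "$sorted_expected"
    return "$status"
}
```

The function returns 0 on a match and 1 on a mismatch, so it can be used directly in a loop over CVE directories or in an `if` statement.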

🚀 Workflow Summary

  1. Select a CVE case study from artifact/CWEs/.../CVE-*.
  2. Run run.sh in data flow mode to collect logs of the vulnerable execution.
  3. Run run_verdiff.sh to run Verdiff for that CVE.
  4. Verdiff scans the different program versions (artifact/data/) for the signature and writes the final result as a CSV file in the CVE directory.
  5. Compare the result with the expected result under claims/claims1/.
  6. Use claims/run.sh to automate experiments across all CVEs and compare results against claims1/.
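
Steps 2 to 4 for a single CVE can be chained in a small wrapper. The sketch below is illustrative only: `run_case` is a hypothetical name, running the scripts from inside the CVE directory is an assumption, and so is passing `true` to select data flow mode (it mirrors the `false` argument that selects non-data-flow mode).

```shell
#!/bin/sh
# Run the per-CVE workflow for one case study directory.
run_case() {
    cve_dir=$1
    (
        cd "$cve_dir" || exit 1
        # Step 2: data flow mode. Passing "true" is an assumption, mirroring
        # the "false" that selects non-data-flow mode.
        ./run.sh true || exit 1
        # Step 3: build the signature and match it against all versions.
        ./run_verdiff.sh || exit 1
        # Step 4: list whatever CSV files ended up in the CVE directory.
        ls ./*.csv
    )
}
```

The subshell keeps the `cd` local to the function, so the caller's working directory is unchanged, and any failing step stops the chain with a non-zero exit status.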

📌 Example Usage

# Step 1: Run the analysis, either for all CVEs or for a selected one
/root/claims/run.sh

# Step 2: Run non-data-flow mode to establish ground truth for version-xxx of the targeted CVE
/root/artifact/CWEs/CWE-*/CVE-*/run.sh false version-xxx

🧾 Notes

  • Valgrind + Taintgrind are required for data flow tracing.
  • Results are logged in the CVE directory.
  • Source code for programs under test is in artifact/data/.
  • Expected outputs for batch experiments are in claims/claims1/.
